Use Nightmare.js to Automate Headless Browsing

Updated by Linode Contributed by Nashruddin Amin


Nightmare.js is a high-level browser automation library, designed to automate browsing tasks for sites that don’t have APIs. The library is a wrapper around Electron, which Nightmare.js uses as a browser to interact with websites. This guide shows how to install Nightmare.js on Ubuntu 16.04 and run automation scripts without a graphical user interface.

Before You Begin

  1. Familiarize yourself with our Getting Started guide and complete the steps for setting your Linode’s hostname and timezone.

  2. This guide will use sudo wherever possible. Complete the sections of our Securing Your Server guide to create a standard user account, harden SSH access, and remove unnecessary network services.

  3. Update your system:

    sudo apt-get update && sudo apt-get upgrade
    

This guide is written for a non-root user. Commands that require elevated privileges are prefixed with sudo. If you’re not familiar with the sudo command, see the Users and Groups guide.

Install Node.js

The default Ubuntu 16.04 repositories lag behind recent Node.js releases. Install a more recent version through the NodeSource PPA (formerly Chris Lea’s Launchpad PPA).

  1. Install the NodeSource PPA:

    curl -sL https://deb.nodesource.com/setup_6.x | sudo -E bash -
    

    This command downloads and runs the NodeSource setup script, which adds the repository for the latest Node.js 6 release. To install a different version, replace the 6.x in this example.

  2. Install Node.js and NPM with the following command:

    sudo apt-get install -y nodejs
    
  3. Confirm that Node.js is successfully installed:

    node --version
    
  4. Check that the NPM command-line tool is successfully installed as well:

     npm --version
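
The same version check can also be made from inside Node.js, for example at the top of an automation script. The following is a minimal sketch and not part of the original steps: `parseMajorVersion` is a hypothetical helper name, and the minimum of 6 simply mirrors the `setup_6.x` script used above.

```javascript
// Hypothetical helper: extract the major version from a string such as "v6.11.2".
function parseMajorVersion(versionString) {
    const match = /^v?(\d+)\./.exec(versionString);
    if (!match) {
        throw new Error('Unrecognized version string: ' + versionString);
    }
    return parseInt(match[1], 10);
}

// process.version is provided by Node.js and looks like "v6.11.2".
const major = parseMajorVersion(process.version);
if (major < 6) {
    console.log('Node.js ' + process.version + ' is older than the 6.x release used in this guide.');
} else {
    console.log('Node.js ' + process.version + ' detected.');
}
```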
    

Install Nightmare.js

To avoid installing Node packages globally for the whole system, install Nightmare.js in a project directory. This example creates an automation directory within the current user’s home directory as the base of the project.

  1. Create and switch to the automation directory:

    mkdir ~/automation && cd ~/automation
    
  2. Initialize an NPM project. NPM prompts you to provide a name, repository, and other details for the project. Accept the default values or assign whatever names you want. To accept the defaults automatically, add the -f (force) flag to the command:

    npm init
    
  3. Install Nightmare.js:

    npm install --save nightmare
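
Because of the `--save` flag, NPM records Nightmare.js under `dependencies` in the project’s `package.json`. The sketch below is one way to confirm that from a script run inside the project directory. It is an illustration under stated assumptions: `hasDependency` is a hypothetical helper, and the script only reports what it finds in the current working directory.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: report whether a parsed package.json lists a dependency.
function hasDependency(pkg, name) {
    return Boolean(pkg.dependencies &&
        Object.prototype.hasOwnProperty.call(pkg.dependencies, name));
}

// Read package.json from the current working directory, if one exists.
const manifestPath = path.join(process.cwd(), 'package.json');
if (fs.existsSync(manifestPath)) {
    const pkg = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
    console.log('nightmare listed in dependencies: ' + hasDependency(pkg, 'nightmare'));
} else {
    console.log('No package.json found in ' + process.cwd());
}
```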
    

Create and Run the Automation Script

Nightmare.js is an NPM module, so it can be imported from within a Node.js script. Use these examples to write a simple script that will search Linode’s documentation for guides about Ubuntu.

  1. Nightmare.js uses the Electron browser and requires an X server. Install xvfb and its dependencies so that you can run graphical applications without display hardware:

    sudo apt-get install -y xvfb x11-xkb-utils xfonts-100dpi xfonts-75dpi xfonts-scalable xfonts-cyrillic x11-apps clang libdbus-1-dev libgtk2.0-dev libnotify-dev libgnome-keyring-dev libgconf2-dev libasound2-dev libcap-dev libcups2-dev libxtst-dev libxss1 libnss3-dev gcc-multilib g++-multilib
    
  2. Create linode.js inside the automation directory and add the following:

    ~/automation/linode.js
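    const Nightmare = require('nightmare');
    // show: true makes Nightmare display the browser window;
    // under xvfb-run it is drawn on the virtual display.
    const nightmare = Nightmare({show: true});

    nightmare
        .goto('https://www.linode.com/docs')
        // Type the search term into the search box and submit it.
        .insert('#gsc-i-id1', 'ubuntu')
        .click('input.gsc-search-button-v2')
        // Wait until the results container is present in the page.
        .wait('#search-results')
        // This function runs inside the page and collects the title and URL of each result.
        .evaluate(function() {
            const searchResults = [];

            const results = document.querySelectorAll('h6.library-search-result-title a');
            results.forEach(function(result) {
                searchResults.push({
                    title: result.innerText,
                    url: result.href
                });
            });
            return searchResults;
        })
        // Close the Electron process once the data has been collected.
        .end()
        .then(function(result) {
            result.forEach(function(r) {
                console.log('Title: ' + r.title);
                console.log('URL: ' + r.url);
            });
        })
        .catch(function(e) {
            console.log(e);
        });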
    
  3. Run the script:

    xvfb-run node linode.js
    

    The script visits the Linode docs page, enters ‘ubuntu’ into the search box, and clicks the submit button. It then waits for the results to load and prints the title and URL of each entry on the first page of results.

    The output will resemble the following:

    Title: How to Install a LAMP Stack on Ubuntu 16.04
    URL: https://www.linode.com/docs/web-servers/lamp/install-lamp-stack-on-ubuntu-16-04
    Title: Install and Configure MySQL Workbench on Ubuntu 16.04
    URL: https://www.linode.com/docs/databases/mysql/install-and-configure-mysql-workbench-on-ubuntu
    Title: Install MongoDB on Ubuntu 16.04 (Xenial)
    URL: https://www.linode.com/docs/databases/mongodb/install-mongodb-on-ubuntu-16-04
    ...
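
The printing logic in the final `.then()` callback can be pulled into a small standalone function, which makes it possible to check the output format without launching a browser. This is a minimal sketch, assuming the result array has the `title` and `url` fields built in `linode.js`. `formatResults` is a hypothetical name, and the sample entry is copied from the output above.

```javascript
// Hypothetical helper: convert the array returned by .evaluate() into output lines.
function formatResults(results) {
    const lines = [];
    results.forEach(function (r) {
        lines.push('Title: ' + r.title);
        lines.push('URL: ' + r.url);
    });
    return lines;
}

// Sample data shaped like one entry of the array that linode.js collects.
const sample = [{
    title: 'Install MongoDB on Ubuntu 16.04 (Xenial)',
    url: 'https://www.linode.com/docs/databases/mongodb/install-mongodb-on-ubuntu-16-04'
}];

console.log(formatResults(sample).join('\n'));
```

In `linode.js`, the callback could then be reduced to `.then(function(result) { console.log(formatResults(result).join('\n')); })`.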
    

Add a Cron Job to Run the Automation Script

This example schedules the script to run once every hour, on the hour. The cron job changes to the ~/automation/ directory, runs the scraping script, and saves the output to a file whose name includes the date and time of the run.

For more information about using Cron, see our Schedule Tasks with Cron guide.

  1. Open the crontab file:

    crontab -e
    
  2. Add the following line to the end of the file:

    crontab
    0 * * * * cd ~/automation && xvfb-run node linode.js >> data_$(date +\%Y_\%m_\%d_\%I_\%M_\%p).txt
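
To make the filename pattern concrete, the sketch below rebuilds the same name in JavaScript. It is an illustration only and the cron job does not need it. `dataFileName` is a hypothetical helper, and it assumes `%p` expands to `AM` or `PM`, as it does in the default C locale. The backslashes before each `%` in the crontab line are there because cron treats an unescaped `%` as a newline; they are not part of the `date` format.

```javascript
// Zero-pad a number to two digits (avoids padStart, which Node.js 6 lacks).
function pad(n) {
    return ('0' + n).slice(-2);
}

// Hypothetical helper: mirror the shell pattern data_$(date +%Y_%m_%d_%I_%M_%p).txt
function dataFileName(date) {
    const hour24 = date.getHours();
    // %I is the hour on a 12-hour clock (01 to 12).
    const hour12 = hour24 % 12 === 0 ? 12 : hour24 % 12;
    // %p is the AM/PM marker.
    const meridiem = hour24 < 12 ? 'AM' : 'PM';
    const parts = [
        date.getFullYear(),
        pad(date.getMonth() + 1),
        pad(date.getDate()),
        pad(hour12),
        pad(date.getMinutes()),
        meridiem
    ];
    return 'data_' + parts.join('_') + '.txt';
}

// Name that a run started at this moment would produce, in local time.
console.log(dataFileName(new Date()));
```

Because the pattern uses a 12-hour clock, the `AM` or `PM` suffix is what keeps the morning and afternoon files of the same day distinct.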
    

This guide is published under a CC BY-ND 4.0 license.