What is webdriver?
The 'webdriver' npm package is an HTTP client for interacting with WebDriver-compatible browsers for the purpose of performing automated web application testing. It allows you to control a browser programmatically and perform actions like navigating to URLs, interacting with web page elements, and executing JavaScript within the context of the browser.
What are webdriver's main functionalities?
Browser Navigation
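This code sample demonstrates how to navigate to a URL. Note that it uses the higher-level 'webdriverio' package, which is built on top of 'webdriver', rather than calling 'webdriver' directly. It initializes a new browser session, navigates to 'https://example.com', and then closes the session.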
const { remote } = require('webdriverio');
(async () => {
const browser = await remote({
capabilities: { browserName: 'chrome' }
});
await browser.url('https://example.com');
await browser.deleteSession();
})();
Element Interaction
This code sample shows how to interact with web page elements through the 'webdriverio' API. It finds an input field and a button by their selectors, sets a value in the input field, and clicks the button.
const { remote } = require('webdriverio');
(async () => {
const browser = await remote({
capabilities: { browserName: 'chrome' }
});
await browser.url('https://example.com');
const input = await browser.$('input#search');
await input.setValue('WebdriverIO');
const button = await browser.$('button#submit');
await button.click();
await browser.deleteSession();
})();
Executing JavaScript
This code sample illustrates executing custom JavaScript code in the context of the browser through the 'webdriverio' API. It retrieves and logs the title of the current web page.
const { remote } = require('webdriverio');
(async () => {
const browser = await remote({
capabilities: { browserName: 'chrome' }
});
await browser.url('https://example.com');
const result = await browser.execute(() => {
return document.title;
});
console.log('Document title is: ' + result);
await browser.deleteSession();
})();
Other packages similar to webdriver
selenium-webdriver
Selenium WebDriver is a well-known library for browser automation. It provides bindings for multiple programming languages and is the basis for many browser automation tools. Compared to 'webdriver', Selenium WebDriver is more established with a larger community but can be more complex to set up and use.
puppeteer
Puppeteer is a Node library developed by the Chrome DevTools team. It provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer is known for its ease of use and rich features for browser automation and scraping, especially with Chrome-based browsers. Unlike 'webdriver', which is designed to work with multiple browsers, Puppeteer is optimized for Chrome/Chromium.
playwright
Playwright is a Node library that provides a set of APIs to automate Chromium, Firefox, and WebKit browsers. It is similar to Puppeteer but with cross-browser support. Playwright offers features like auto-wait, network interception, and emulation capabilities. It is considered to be more modern and feature-rich compared to 'webdriver' and is developed by the same team that initially built Puppeteer.
nightwatch
Nightwatch.js is an end-to-end testing solution for web applications and websites, written in Node.js. It uses the W3C WebDriver API for browser automation. Nightwatch has a built-in test runner and assertions library, making it a more integrated solution compared to 'webdriver', which is primarily a client for interacting with the WebDriver API.
cypress
Cypress is a front-end testing tool built for the modern web. It is both a library for writing tests and a test runner that executes the tests in a browser. Cypress provides a unique interactive test runner that allows you to see commands as they execute while also viewing the application under test. It differs from 'webdriver' in that it runs in the same run-loop as the application, enabling more consistent results and easier debugging.
WebDriver
A lightweight, non-opinionated implementation of the WebDriver and WebDriver BiDi specifications, including mobile commands supported by Appium.
There are many Selenium and WebDriver binding implementations in the Node.js world. Each of them has an opinionated API and a recommended way of using it. This binding is the least opinionated you will find: it simply represents the WebDriver specification and does not add any extra or higher-level abstraction. It is lightweight and supports the WebDriver specification and Appium's Mobile JSONWire Protocol.
The package supports the following protocols:
Commands are added to the client's protocol based on assumptions derived from the provided capabilities. You can find more details about the commands in the @wdio/protocols package. All commands come with TypeScript support.
Install
To install this package from npm, run:
npm i webdriver
WebDriver Example
The following example demonstrates a simple Google Search scenario:
import WebDriver from 'webdriver';
// Start a new session against the WebDriver server (default host and port)
const client = await WebDriver.newSession({
path: '/',
capabilities: { browserName: 'firefox' }
})
await client.navigateTo('https://www.google.com/ncr')
// findElement returns an element reference object; the element ID is stored
// under the key defined by the W3C WebDriver specification
const searchInput = await client.findElement('css selector', '#lst-ib')
await client.elementSendKeys(searchInput['element-6066-11e4-a52e-4f735466cecf'], 'WebDriver')
const searchBtn = await client.findElement('css selector', 'input[value="Google Search"]')
await client.elementClick(searchBtn['element-6066-11e4-a52e-4f735466cecf'])
console.log(await client.getTitle())
// Always end the session so the browser is closed
await client.deleteSession()
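Because every element command takes the raw element ID, the long W3C key is repeated in the example above. A minimal sketch of one way to avoid that, assuming a small helper of your own (`getElementId` and `ELEMENT_KEY` are hypothetical names, not part of the webdriver package):

```javascript
// Key under which the W3C WebDriver specification stores an element ID.
const ELEMENT_KEY = 'element-6066-11e4-a52e-4f735466cecf';

// Hypothetical helper: extract the element ID from the object that
// findElement resolves to, and fail loudly if the shape is unexpected.
function getElementId(elementRef) {
  const id = elementRef && elementRef[ELEMENT_KEY];
  if (typeof id !== 'string') {
    throw new TypeError('Not a WebDriver element reference');
  }
  return id;
}

// Possible usage with the client from the example above:
//   await client.elementClick(getElementId(searchBtn))
```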
WebDriver Bidi Example
To connect to the WebDriver Bidi protocol, you have to send along a webSocketUrl flag to tell the browser driver to opt in to the protocol:
import WebDriver from 'webdriver'
const browser = await WebDriver.newSession({
capabilities: {
webSocketUrl: true,
browserName: 'firefox'
}
})
await browser.sessionSubscribe({ events: ['log.entryAdded'] })
browser.on('message', (data) => console.log('received %s', data))
await browser.executeScript('console.log("Hello Bidi")', [])
await browser.deleteSession()
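The 'message' handler above prints whatever arrives on the socket. As a rough sketch, the raw data could be narrowed down to console log entries like this. It assumes the data is a JSON string or Buffer and that events follow the WebDriver BiDi shape (`type: 'event'`, a `method`, and `params`); `parseLogEntry` is a hypothetical helper, not part of the package:

```javascript
// Hypothetical helper: return { level, text } for log.entryAdded events,
// or null for anything else (command responses, other events, non-JSON data).
function parseLogEntry(data) {
  let message;
  try {
    message = JSON.parse(data.toString());
  } catch {
    return null; // not JSON, ignore
  }
  if (!message || message.type !== 'event' || message.method !== 'log.entryAdded') {
    return null;
  }
  const { level, text } = message.params ?? {};
  return { level, text };
}

// Possible usage with the session from the example above:
//   browser.on('message', (data) => {
//     const entry = parseLogEntry(data)
//     if (entry) console.log('[%s] %s', entry.level, entry.text)
//   })
```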
Configuration
To create a WebDriver session, call the newSession method on the WebDriver class and pass in your configuration:
import WebDriver from 'webdriver'
const client = await WebDriver.newSession(options)
The following options are available:
capabilities
Defines the capabilities you want to run in your Selenium session.
Type: Object
Required: true
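For orientation, the W3C WebDriver specification wraps capabilities in an `alwaysMatch` / `firstMatch` structure in the body of the New Session request. The sketch below shows that wire format only; `buildNewSessionPayload` is a hypothetical function, and the exact payload this package sends may differ:

```javascript
// Hypothetical helper: wrap a flat capabilities object in the structure
// the W3C WebDriver specification defines for the New Session command.
function buildNewSessionPayload(capabilities) {
  return {
    capabilities: {
      alwaysMatch: { ...capabilities }, // must all be satisfied
      firstMatch: [{}]                  // no alternative capability sets
    }
  };
}

const payload = buildNewSessionPayload({ browserName: 'firefox' });
console.log(JSON.stringify(payload));
```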
logLevel
Level of logging verbosity.
Type: String
Default: info
Options: trace | debug | info | warn | error | silent
protocol
Protocol to use when communicating with the Selenium standalone server (or driver).
Type: String
Default: http
Options: http | https
hostname
Host of your WebDriver server.
Type: String
Default: localhost
port
Port your WebDriver server is on.
Type: Number
Default: 4444
path
Path to WebDriver endpoint or grid server.
Type: String
Default: /
queryParams
Query parameters that are propagated to the driver server.
Type: Object
Default: null
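To make the connection options concrete, here is a rough sketch of how protocol, hostname, port, path and queryParams could combine into the URL of the New Session endpoint. `buildSessionUrl` is a hypothetical function written for illustration; it uses the defaults documented above but is not the package's own implementation:

```javascript
// Hypothetical helper: combine the connection options into the URL that a
// New Session request (POST /session) would be sent to.
function buildSessionUrl({
  protocol = 'http',
  hostname = 'localhost',
  port = 4444,
  path = '/',
  queryParams = null
} = {}) {
  const base = `${protocol}://${hostname}:${port}`;
  // Drop a trailing slash from the path so "/" and "/wd/hub/" both join cleanly.
  const url = new URL(`${path.replace(/\/$/, '')}/session`, base);
  for (const [key, value] of Object.entries(queryParams ?? {})) {
    url.searchParams.set(key, String(value));
  }
  return url.href;
}

console.log(buildSessionUrl());
console.log(buildSessionUrl({ path: '/wd/hub', queryParams: { specialKey: 'd2ViZHJpdmVy' } }));
```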
connectionRetryTimeout
Timeout for any WebDriver request to a driver or grid.
Type: Number
Default: 120000
connectionRetryCount
Count of request retries to the Selenium server.
Type: Number
Default: 2
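The two retry options can be pictured with a small sketch. It assumes that connectionRetryCount means "retries after the first attempt" and that connectionRetryTimeout applies to each attempt separately; both are assumptions, and `requestWithRetry` is a hypothetical function rather than the package's internal logic:

```javascript
// Hypothetical helper: run sendRequest with a per-attempt timeout and retry
// on failure. Assumes one initial attempt plus up to connectionRetryCount retries.
async function requestWithRetry(
  sendRequest,
  { connectionRetryCount = 2, connectionRetryTimeout = 120000 } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= connectionRetryCount; attempt++) {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error('Request timed out')), connectionRetryTimeout);
    });
    try {
      // Whichever settles first wins: the request or the timeout.
      return await Promise.race([sendRequest(attempt), timeout]);
    } catch (err) {
      lastError = err; // remember the failure and try again
    } finally {
      clearTimeout(timer); // avoid leaving a pending timer behind
    }
  }
  throw lastError;
}
```

With the defaults, a request that fails twice and then succeeds would resolve on the third attempt; one that fails three times would reject with the last error.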
agent
Allows you to use a custom http/https/http2 agent to make requests.
Type: Object
Default:
{
http: new http.Agent({ keepAlive: true }),
https: new https.Agent({ keepAlive: true })
}
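As an illustration, a custom agent object in the same shape as the default could cap the number of concurrent sockets. The `maxSockets` value here is an arbitrary choice for the sketch, and `customAgent` is a hypothetical variable name:

```javascript
const http = require('node:http');
const https = require('node:https');

// Same shape as the documented default, with a limit on concurrent sockets.
const customAgent = {
  http: new http.Agent({ keepAlive: true, maxSockets: 10 }),
  https: new https.Agent({ keepAlive: true, maxSockets: 10 })
};

// Possible usage:
//   const client = await WebDriver.newSession({ capabilities, agent: customAgent })
```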
transformRequest
Function intercepting HTTP request options before a WebDriver request is made.
Type: (RequestOptions) => RequestOptions
Default: none
transformResponse
Function intercepting HTTP response objects after a WebDriver response has arrived.
Type: (Response, RequestOptions) => Response
Default: none
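A minimal sketch of the two hooks, following the signatures listed above. The exact shape of the request options and response objects depends on the package version, so the `headers`, `method` and `status` fields used here are assumptions, as is the `x-test-run` header name. `Headers` is the global available in Node.js 18 and later:

```javascript
// Hypothetical hook implementations to pass as options to newSession.
const hooks = {
  // Add a custom header to every outgoing request.
  transformRequest(requestOptions) {
    // new Headers() accepts a plain object, a Headers instance, or undefined.
    const headers = new Headers(requestOptions.headers);
    headers.set('x-test-run', 'nightly');
    return { ...requestOptions, headers };
  },
  // Log each response and pass it through unchanged.
  transformResponse(response, requestOptions) {
    console.log(`${requestOptions.method} -> ${response.status}`);
    return response;
  }
};

// Possible usage:
//   const client = await WebDriver.newSession({ capabilities, ...hooks })
```

Both hooks return the object they were given (or a copy of it), so the request pipeline continues to receive a value of the expected type.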