mwn
mwn is a modern MediaWiki bot framework for Node.js, originally adapted from mwbot.
Development status: Unstable. Versioning: while mwn is in version 0, changes may be made to the public interface with a change in the minor version number.
The documentation given below is incomplete. There are a number of additional classes such as bot.title, bot.wikitext, bot.page, etc. that provide useful functionality but aren't documented.
Amongst the major highlights are batchOperation and seriesBatchOperation, which allow you to run a large number of tasks with control over concurrency and sleep time between tasks. Failing actions can be automatically retried.
Setup
To install, run npm install mwn.
Or obtain the latest development copy:
git clone https://github.com/siddharthvp/mwn.git
cd mwn
npm install
mwn uses JSON with formatversion 2 by default; formatversion 2 is an improved JSON output format introduced in MediaWiki in 2015.
Node version
mwn is developed against Node.js 13. Everything may still work in older versions of Node, but consider upgrading to Node.js 13. If your bot is hosted on Toolforge, you can install the latest Node.js in your home directory using:
npm install npm@latest
npm install n
export N_PREFIX=~
./node_modules/n/bin/n latest
export PATH=~/bin:$PATH
Check that your .profile or .bashrc file includes the line PATH="$HOME/bin:$PATH", so that the path includes your home directory's bin folder every time you open the shell.
Set up a bot password
To be able to log in to the wiki, you have to set up a bot password using the wiki's Special:BotPasswords page.
If you're migrating from mwbot, note that:
- edit in mwbot is different from edit in mwn. You want to use save instead.
- If you were using the default formatversion=1 output format, set formatversion: 1 in the config options.
Documentation
Create a new bot instance:
const bot = new mwn();
Log in to the bot:
bot.login({
apiUrl: 'https://en.wikipedia.org/w/api.php',
username: 'YourBotUsername',
password: 'YourBotPassword'
});
Set default parameters to be included in every API request:
bot.setDefaultParams({
assert: 'bot',
maxlag: 4
});
Set bot options. The default value of each option is shown below:
bot.setOptions({
silent: false,
maxlagPause: 5000,
maxlagMaxRetries: 3,
apiUrl: null
});
Maxlag: The default maxlag parameter used by mwn is 5 seconds. Requests failing due to maxlag are automatically retried after pausing for the duration specified by maxlagPause (default 5000 milliseconds, i.e. 5 seconds). At most maxlagMaxRetries retries take place (default 3).
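The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not mwn's actual implementation: the helper name withMaxlagRetry is hypothetical, and it assumes the failing request rejects with an error object whose code property is 'maxlag' (the error code the MediaWiki API uses for this condition).

```javascript
// Hypothetical helper, not part of mwn.
// Assumption: sendRequest() rejects with an error whose `code` is 'maxlag'
// when the server reports replication lag.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function withMaxlagRetry(sendRequest, { maxlagPause = 5000, maxlagMaxRetries = 3 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await sendRequest();
    } catch (err) {
      // Rethrow non-maxlag errors, or give up once the retries are used up
      if (err.code !== 'maxlag' || attempt >= maxlagMaxRetries) {
        throw err;
      }
      // Wait before sending the same request again
      await sleep(maxlagPause);
    }
  }
}
```

With the defaults, a request is attempted once and then retried up to 3 more times, with a 5 second pause before each retry.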
Fetch a CSRF token, which is required for most write operations:
bot.getCsrfToken();
The token, once obtained, is stored in the bot state so that it can be reused any number of times.
If an action fails due to an expired or missing token, the action will be automatically retried after fetching a new token.
For convenience, you can log in and get the edit token together as:
bot.loginGetToken();
If your bot doesn't need to log in, you can simply set the API url using:
bot.setApiUrl('https://en.wikipedia.org/w/api.php');
Set your user agent (required for WMF wikis):
bot.setUserAgent('myCoolToolName v1.0 ([[w:en:User:Example]])/mwn');
Edit a page. Edit conflicts are raised as errors.
bot.edit('Page title', rev => {
var text = rev.content.replace(/foo/g, 'bar');
return {
text: text,
summary: 'replacing foo with bar',
minor: true
};
});
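Because the callback passed to edit is an ordinary function from a revision object to edit parameters, it can be written as a named function and checked locally before the bot touches the wiki. The sketch below assumes only what the example above shows (the revision exposes its wikitext as rev.content); the function name and the sample revision object are illustrative.

```javascript
// Transform function with the same shape as the callback in the example above:
// takes the current revision and returns the edit parameters.
function replaceFooWithBar(rev) {
  return {
    text: rev.content.replace(/foo/g, 'bar'),
    summary: 'replacing foo with bar',
    minor: true
  };
}

// Try it on a hand-built revision object (illustrative data, not API output)
const sample = replaceFooWithBar({ content: 'foo and more foo' });
console.log(sample.text); // prints "bar and more bar"
```

The same function could then be passed as the second argument: bot.edit('Page title', replaceFooWithBar).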
Save a page with the given content without loading it first. Simpler version of edit. Does not offer any edit conflict detection.
bot.save('Page title', 'Page content', 'Edit summary');
Create a new page.
bot.create('Page title', 'Page content', 'Edit summary');
Post a new section to a talk page:
bot.newSection('Page title', 'New section header', 'Section content', additionalOptions);
Read the contents of a page:
bot.read('Page title');
Read a page along with metadata:
bot.read('Page title', {
rvprop: ['content', 'timestamp', 'user', 'comment']
});
Read multiple pages using a single API call:
bot.read(['Page 1', 'Page 2', 'Page 3']).then(pages => {
});
Delete a page:
bot.delete('Page title', 'deletion log summary', additionalOptions);
Restore all deleted versions:
bot.undelete('Page title', 'log summary', additionalOptions);
Move a page along with its subpages:
bot.move('Old page title', 'New page title', 'move summary', {
movesubpages: true,
movetalk: true
});
Parse wikitext (see API:Parse for additionalOptions)
bot.parseWikitext('Input wikitext', additionalOptions);
Parse the contents of a given page
bot.parseTitle('Page name', additionalOptions);
Rollback a user:
bot.rollback('Page title', 'user', additionalOptions);
Upload a file from your system to the wiki:
bot.upload('File title', '/path/to/file', 'comment', customParams);
Direct calls
request(query)
Directly query the API. See mw:API for options. You can create and test your queries in the API sandbox.
Example: get all images used on the article Foo
bot.request({
"action": "query",
"prop": "images",
"titles": "Foo"
}).then(data => {
return data.query.pages[0].images.map(im => im.title);
});
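The pages[0] access in the example above relies on formatversion 2, where query.pages is an array. With formatversion 1, query.pages is an object keyed by page ID. The sketch below illustrates the difference with hand-written sample responses (not real API output); the helper imageTitles is hypothetical and not part of mwn.

```javascript
// Illustrative responses for prop=images on the page "Foo" (sample data only)
const v2Response = {
  query: { pages: [{ pageid: 1, title: 'Foo', images: [{ ns: 6, title: 'File:A.png' }] }] }
};
const v1Response = {
  query: { pages: { '1': { pageid: 1, title: 'Foo', images: [{ ns: 6, title: 'File:A.png' }] } } }
};

// Hypothetical helper: image titles of the first page, for either format
function imageTitles(data) {
  const pages = data.query.pages;
  // formatversion 2 gives an array; formatversion 1 gives an object keyed by page ID
  const first = Array.isArray(pages) ? pages[0] : Object.values(pages)[0];
  // Pages without images have no `images` property
  return (first.images || []).map(im => im.title);
}

console.log(imageTitles(v2Response)); // prints [ 'File:A.png' ]
console.log(imageTitles(v1Response)); // prints [ 'File:A.png' ]
```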
Bulk processing methods
continuedQuery(query, maxCallsLimit)
Send an API query, and keep re-sending it with the continue parameters received in each response until there are no more results (or until the maxCalls limit is reached). The return value is a promise resolved with the array of responses to the individual API calls.
bot.continuedQuery(apiQueryObject, maxCalls=10)
Example: get a list of all active users on the wiki using continuedQuery (using API:Allusers):
bot.continuedQuery({
"action": "query",
"list": "allusers",
"auactiveusers": 1,
"aulimit": "max"
}, 40).then(jsons => {
return jsons.reduce((activeusers, json) => {
return activeusers.concat(json.query.allusers.map(user => user.name));
}, []);
});
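The continuation mechanism described above can be sketched as a simple loop. This is an illustration under stated assumptions rather than mwn's source: continuedQuerySketch is a hypothetical name, and request stands for any function that takes a query object and returns a promise of the API response. It relies on the MediaWiki convention that a response with more results carries a continue object whose fields are merged into the next request.

```javascript
// Hypothetical sketch, not mwn's implementation.
// `request` is any function: (queryObject) => Promise<apiResponse>
async function continuedQuerySketch(request, query, maxCalls = 10) {
  const responses = [];
  let params = query;
  for (let call = 0; call < maxCalls; call++) {
    const response = await request(params);
    responses.push(response);
    if (!response.continue) {
      break; // no continuation data: all results have been received
    }
    // Merge the continuation parameters into the original query
    params = Object.assign({}, query, response.continue);
  }
  return responses;
}
```

Each element of the resolved array is one full API response, which is why the example above flattens them with reduce.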
massQuery(query, nameOfBatchField, hasApiHighLimit)
MediaWiki sets a limit of 500 (50 for non-bots) on the number of pages that can be queried in a single API call. To query more than that, use the massQuery function, which splits the page list into batches of 500, sends an individual query for each batch, and returns a promise resolved with the array of all individual API call responses.
Example: get the protection status of a large number of pages:
bot.massQuery({
"action": "query",
"format": "json",
"prop": "info",
"titles": ['Page1', 'Page2', ... , 'Page1300'],
"inprop": "protection"
})
.then(jsons => {
});
The 3rd parameter hasApiHighLimit is set to true by default. If you get the API error 'toomanyvalues' (or similar), your account doesn't have the required user right, so set the parameter to false.
Any errors in the individual API calls will not cause the entire massQuery to fail, but the data at the array index corresponding to that API call will be an error object.
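The batching step can be illustrated with a short sketch. It is not mwn's implementation: both function names are hypothetical, and the default batch field of 'titles' is an assumption made for the example. Multi-value API parameters are joined with the pipe character, as the MediaWiki API expects.

```javascript
// Split an array into consecutive slices of at most batchSize items
function splitIntoBatches(items, batchSize) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Hypothetical: build one query object per batch.
// Assumption: the batch field defaults to 'titles' and holds an array.
function buildBatchQueries(query, batchField = 'titles', hasApiHighLimit = true) {
  const batchSize = hasApiHighLimit ? 500 : 50;
  return splitIntoBatches(query[batchField], batchSize).map(batch =>
    Object.assign({}, query, { [batchField]: batch.join('|') })
  );
}

// 1300 generated titles are split into batches of 500, 500 and 300
const titles = Array.from({ length: 1300 }, (_, i) => 'Page' + (i + 1));
const queries = buildBatchQueries({ action: 'query', prop: 'info', inprop: 'protection', titles });
console.log(queries.length); // prints 3
```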
batchOperation(pageList, workerFunction, concurrency)
Perform asynchronous tasks (involving API usage) over a number of pages (or other arbitrary items). batchOperation uses a default concurrency of 5. Customise this according to how expensive the API operation is. Higher concurrency limits could lead to more frequent API errors.
The workerFunction must return a promise.
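bot.batchOperation(pageList, (page, idx) => {
// do something with each page; idx is its index in pageList
// return a promise in the end
}, /* concurrency */ 5, /* retries */ 2);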
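To show what a concurrency limit means in practice, here is a minimal worker-pool sketch. It is not mwn's implementation: runWithConcurrency is a hypothetical name, the shape of the returned summary is an assumption, and retries are left out for brevity.

```javascript
// Hypothetical sketch: run worker(item, idx) over all items,
// with at most `concurrency` workers in flight at any time.
async function runWithConcurrency(items, worker, concurrency = 5) {
  const results = { successes: 0, failures: [] };
  let next = 0; // index of the next item to be picked up

  async function runner() {
    while (next < items.length) {
      const idx = next++;
      try {
        await worker(items[idx], idx);
        results.successes++;
      } catch (err) {
        // Record the failed item and carry on with the rest
        results.failures.push(items[idx]);
      }
    }
  }

  // Start the runners; each one pulls items until the list is exhausted
  const runners = Array.from({ length: Math.min(concurrency, items.length) }, runner);
  await Promise.all(runners);
  return results;
}
```

Each runner picks up a new item only after its previous task has settled, so the number of simultaneous API requests never exceeds the concurrency value.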
seriesBatchOperation(pageList, workerFunction, sleepDuration)
Perform asynchronous tasks (involving API usage) over a number of pages one after the other, with a sleep duration between each task (default 5 seconds).
The workerFunction must return a promise.
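bot.seriesBatchOperation(pageList, (page, idx) => {
// do something with each page; idx is its index in pageList
// return a promise in the end
}, 5000, 2); // delay of 5 seconds between tasks, and a maximum of 2 retries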
Note that seriesBatchOperation with delay=0 is the same as batchOperation with concurrency=1.
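The sequential behaviour can be sketched in a few lines. As with the other sketches, this is an illustration and not mwn's code: runInSeries is a hypothetical name, it returns only the list of failed items, and it omits retries.

```javascript
const pause = ms => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical sketch: run worker(item, idx) on one item at a time,
// sleeping for sleepDuration milliseconds between consecutive tasks.
async function runInSeries(items, worker, sleepDuration = 5000) {
  const failures = [];
  for (let idx = 0; idx < items.length; idx++) {
    try {
      await worker(items[idx], idx);
    } catch (err) {
      failures.push(items[idx]); // keep going after a failed task
    }
    // No need to sleep after the last task
    if (idx < items.length - 1) {
      await pause(sleepDuration);
    }
  }
  return failures;
}
```

With sleepDuration set to 0 the loop simply runs one task after another, which matches the note above about concurrency=1.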