This NodeJS library scrapes YouTube comments from the HTML comment data that YouTube itself serves, without using the official API, ordered by date descending (most recent first). It is developed for and tailored towards easy use with FreeTube, but it can be used with any other project as well.
Because it does not use the API, this library needs no API keys and is not subject to the maximum quotas attached to them. In exchange, it might take longer to retrieve the required data.
The library works as long as YouTube keeps its web page layout the same. Therefore, there is no guarantee that this library will work at all times.
If this library stops working at some point, please create an issue so that I can look into it. Pull requests are also welcome in this case.
Installation
npm install @freetube/yt-comment-scraper --save
Usage
Create your instance with the following syntax. Use the second line instead if you're using ES modules / TypeScript.
const ytcm = require("@freetube/yt-comment-scraper")
import ytcm from '@freetube/yt-comment-scraper'
getComments(payload)
Returns a list of objects containing comments from the next page of the video.
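const payload = {
  videoId: videoId,            // ID of the video to grab comments from
  sortByNewest: sortByNewest,  // true to get the most recent comments first
  continuation: continuation,  // continuation token returned by a previous call
  mustSetCookie: false,        // true if your application does not handle cookies itself
  httpsAgent: agent            // optional agent, e.g. for proxies
}
ytcm.getComments(payload).then((data) => {
  console.log(data);
}).catch((error) => {
  console.log(error);
});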
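Since the library depends on YouTube's page layout, callers may want to treat a failed or malformed response as a normal outcome rather than an exception. The following is a minimal sketch, not part of the library: `safeGetComments` is a hypothetical helper, the scraper function is passed in as an argument, and the response is assumed to have the shape documented under Returned Data.

```javascript
// Hypothetical wrapper (not provided by the library). `getComments` is
// injected so the sketch does not depend on how the library is loaded.
async function safeGetComments(getComments, payload) {
  try {
    const data = await getComments(payload);
    // Guard against an unexpected shape, e.g. after a YouTube layout change.
    if (!data || !Array.isArray(data.comments)) {
      throw new Error("Unexpected response shape from getComments()");
    }
    return {
      ok: true,
      comments: data.comments,
      continuation: data.continuation ?? null,
    };
  } catch (error) {
    // Report the failure as a value so the caller can decide what to do.
    return { ok: false, comments: [], continuation: null, error };
  }
}
```

With the library loaded as `ytcm`, a call could look like `safeGetComments((p) => ytcm.getComments(p), payload)`, after which the caller checks `ok` before using `comments`.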
Returned Data
The data is returned as an object with a list of comment objects and a continuation token (if more comments exist).
{
  comments: [
    {
      commentId: String,
      authorId: String,
      author: String,
      authorThumb: Array [
        {
          width: Number,
          height: Number,
          url: String
        }
      ],
      edited: Boolean,
      text: String,
      likes: String,
      time: String,
      numReplies: Number,
      isOwner: Boolean,
      isHearted: Boolean,
      isPinned: Boolean,
      isVerified: Boolean,
      isOfficialArtist: Boolean,
      hasOwnerReplied: Boolean,
      isMember: Boolean,
      memberIconUrl: String | null,
      customEmojis: Array [
        {
          text: String,
          emojiThumbnails: Array [
            {
              width: Number,
              height: Number,
              url: String
            }
          ]
        }
      ],
      replyToken: String
    }
  ],
  continuation: String | null
}
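The continuation token in the returned object can be fed back into the next call to walk through several pages. The sketch below shows one way to do that. `fetchAllComments` and `maxPages` are hypothetical names, the scraper function is passed in as an argument, and it is an assumption that the first page is requested with `continuation` set to `null`.

```javascript
// Hypothetical helper (not provided by the library): collects comments page
// by page until no continuation token is returned or maxPages is reached.
async function fetchAllComments(getComments, basePayload, maxPages = 5) {
  const all = [];
  // Assumption: a null continuation requests the first page.
  let continuation = basePayload.continuation ?? null;
  for (let page = 0; page < maxPages; page++) {
    const data = await getComments({ ...basePayload, continuation });
    all.push(...data.comments);
    continuation = data.continuation ?? null;
    if (!continuation) break; // no more pages
  }
  // The last token is returned so the caller can resume later.
  return { comments: all, continuation };
}
```

With the library loaded as `ytcm`, a call could look like `fetchAllComments((p) => ytcm.getComments(p), { videoId: 'someId', sortByNewest: true })`. Capping the number of pages keeps the number of requests bounded for videos with very many comments.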
getCommentReplies(payload)
Returns a list of objects containing replies from a given comment.
- payload (Object) (Required) - An object containing the various options
- videoId (String) (Required) - The video ID to grab comments from
- replyToken (String) (Required) - The reply token from a comment object of getComments() or the continuation string from a previous call to getCommentReplies()
- mustSetCookie (Boolean) (Optional) - Set this flag to true when cookies are not already handled by your application (e.g. Electron)
- httpsAgent (Object) (Optional) - Lets you supply a custom HTTPS agent (see the NodeJS documentation or 3rd party packages like node-https-proxy-agent for options such as proxies)
const parameters = {videoId: 'someId', replyToken: 'HSDcjasgdajwSdhAsd', mustSetCookie: true, httpsAgent: null};
ytcm.getCommentReplies(parameters).then((data) => {
  console.log(data);
}).catch((error) => {
  console.log(error);
});
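The example above passes `null` for `httpsAgent`. As a sketch of what a non-null value might look like, the block below builds an agent with Node's built-in `https.Agent`. It is an assumption that the library accepts a standard agent instance as-is, and `buildRepliesPayload` is a hypothetical helper name.

```javascript
const https = require("https");

// keepAlive reuses sockets across repeated requests; maxSockets limits
// how many connections are open at once.
const agent = new https.Agent({ keepAlive: true, maxSockets: 4 });

// Hypothetical helper: builds a getCommentReplies() payload that carries
// the shared agent.
function buildRepliesPayload(videoId, replyToken) {
  return { videoId, replyToken, mustSetCookie: true, httpsAgent: agent };
}
```

A proxy agent from a 3rd party package would be passed in the same position.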
Returned Data
The data is returned as an object with a list of reply objects and a continuation token (if more replies exist), as seen below.
{
  comments: [
    {
      commentId: String,
      authorId: String,
      author: String,
      authorThumb: Array [
        {
          width: Number,
          height: Number,
          url: String
        }
      ],
      edited: Boolean,
      text: String,
      likes: String,
      time: String,
      numReplies: Number,
      isOwner: Boolean,
      isHearted: Boolean,
      isPinned: false,
      isVerified: Boolean,
      isOfficialArtist: Boolean,
      hasOwnerReplied: false,
      isMember: Boolean,
      memberIconUrl: String | null,
      customEmojis: Array [
        {
          text: String,
          emojiThumbnails: Array [
            {
              width: Number,
              height: Number,
              url: String
            }
          ]
        }
      ],
      replyToken: null
    }
  ],
  continuation: String | null
}
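Putting the two calls together, the replies of one comment can be collected by starting from the comment's `replyToken` and then passing each returned `continuation` as the next `replyToken`, as described for the payload. The sketch below illustrates this. `fetchAllReplies` and `maxPages` are hypothetical names, and the scraper function is passed in as an argument.

```javascript
// Hypothetical helper (not provided by the library): collects the replies
// of one comment object returned by getComments().
async function fetchAllReplies(getCommentReplies, videoId, comment, maxPages = 5) {
  const replies = [];
  // Start from the comment's reply token; a missing token means no replies.
  let token = comment.replyToken;
  for (let page = 0; token && page < maxPages; page++) {
    const data = await getCommentReplies({ videoId, replyToken: token });
    replies.push(...data.comments);
    // The continuation of this page is the reply token for the next one.
    token = data.continuation;
  }
  return replies;
}
```

With the library loaded as `ytcm`, a call could look like `fetchAllReplies((p) => ytcm.getCommentReplies(p), 'someId', comment)`, where `comment` is one entry of the `comments` list from getComments().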
Credits
Thanks to egbertbouman for their Python project, which guided this project through the difficult HTTP calls.