NJSParser
A powerful parser and explorer for any website built with NextJS.
- Parses flight data (from the self.__next_f.push scripts).
- Parses next data from the __NEXT_DATA__ script.
- Parses build manifests.
- Searches for build id.
- Many other things ...
It uses only lxml, orjson, and pydantic to guarantee fast and efficient data parsing and processing.
Installation:
pip install njsparser
Use
CLI
You can run the CLI with any of these 3 commands:
njsp
njsparser
python3 -m njsparser.cli
It has a single feature: displaying information about the website.
For more information, pass the --help argument to the command.
Parsing __next_f
The data you find in __next_f is called flight data, and contains data in React format. You can parse it easily with njsparser as follows.
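To get a feel for what flight data looks like before it is parsed, here is a minimal standard-library sketch that pulls the raw chunks out of the self.__next_f.push scripts. It is not how njsparser works internally. The names PUSH_RE, extract_flight_chunks and SAMPLE_HTML are hypothetical, and the sketch assumes that text chunks are pushed as a two-item array of the form [1, "<text>"].

```python
import json
import re
from typing import List

# Matches the array literal passed to each self.__next_f.push(...) call.
PUSH_RE = re.compile(r"self\.__next_f\.push\((\[.*?\])\)\s*</script>", re.DOTALL)


def extract_flight_chunks(html: str) -> List[str]:
    """Return the raw text chunks pushed to self.__next_f, in page order."""
    chunks = []
    for match in PUSH_RE.finditer(html):
        payload = json.loads(match.group(1))
        # Assumption: text chunks look like [1, "<text>"]; anything else is skipped.
        if len(payload) == 2 and isinstance(payload[1], str):
            chunks.append(payload[1])
    return chunks


# Hypothetical page fragment, only here to make the sketch runnable.
SAMPLE_HTML = r'''
<html><body>
<script>self.__next_f.push([0])</script>
<script>self.__next_f.push([1,"1:{\"user\":{\"tagline\":\"hello\"}}\n"])</script>
</body></html>
'''

if __name__ == "__main__":
    for chunk in extract_flight_chunks(SAMPLE_HTML):
        print(repr(chunk))
```

Each chunk is still a raw string at this point. Turning those strings into typed objects is the part njsparser.BeautifulFD takes care of.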
We will build a parser for the flight data example
- In the website you want to parse, make sure you see self.__next_f.push at the beginning of the script that contains the data you are looking for. Here I am searching for the description "I should really have a better hobby, but this is it..." (in blue) in my page, and I can also see the self.__next_f.push (in green).
- Then I write this simple script to parse and dump the flight data of my website, so I can see which objects I am looking for:
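import requests
import njsparser
import json

# Download the page HTML and parse its flight data.
response = requests.get("https://mediux.pro/user/r3draid3r04").text
fd = njsparser.BeautifulFD(response)
# Dump the parsed flight data to a file so it can be searched by hand.
with open("fd.json", "w") as write:
    json.dump(fd, write, indent=4, default=njsparser.default)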
- In my dumped flight data, I will search for the same string:

- Then I go up to the closest "value" root above my found string, and look at the value of "cls". Here it is "Data":
- Now that I know the "cls" (class) of the object my data is contained in, I can search for it in my BeautifulFD object:
import requests
import njsparser
import json

response = requests.get("https://mediux.pro/user/r3draid3r04").text
fd = njsparser.BeautifulFD(response)
# Look through every "Data" object until one holds the "user" key.
for data in fd.find_iter([njsparser.T.Data]):
    if data.content is not None and "user" in data.content:
        break
else:
    # The loop finished without a break: no matching object was found.
    raise ValueError
user = data.content["user"]
print(user["tagline"])
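The manual search through the dumped file (find the string, then walk up to the closest object and read its "cls") can also be scripted. Below is a rough sketch that assumes each dumped object is a dict carrying a "cls" key next to its "value" key, as described in the steps above. The helper find_enclosing_cls and the SAMPLE_DUMP structure are hypothetical, and your real fd.json may be laid out differently.

```python
from typing import Any, Optional


def find_enclosing_cls(node: Any, needle: str, current_cls: Optional[str] = None) -> Optional[str]:
    """Return the "cls" of the closest enclosing object whose data contains needle."""
    if isinstance(node, dict):
        # A dict with a "cls" key becomes the closest known class from here down.
        current_cls = node.get("cls", current_cls)
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        # Leaf value: only strings can contain the text we are looking for.
        if isinstance(node, str) and needle in node:
            return current_cls
        return None
    for child in children:
        found = find_enclosing_cls(child, needle, current_cls)
        if found is not None:
            return found
    return None


# Hypothetical excerpt of a dumped fd.json, only here to make the sketch runnable.
SAMPLE_DUMP = {
    "1": {
        "cls": "DataContainer",
        "value": [
            {
                "cls": "Data",
                "value": ["$", "$L17", None, {"user": {"tagline": "I should really have a better hobby, but this is it..."}}],
            }
        ],
    }
}

if __name__ == "__main__":
    print(find_enclosing_cls(SAMPLE_DUMP, "better hobby"))
```

With a real dump, you would load fd.json with json.load and pass the result in place of SAMPLE_DUMP.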
More information:
- If your object is inside another object (e.g. "Data" in a "DataParent", or in a "DataContainer"), .find_iter will also find it recursively (unless you set recursive=False).
- Make sure you use the correct attributes of the flight data classes when fetching their data. The class "Data" has a .content attribute. If you use .value, you will end up with the raw value and will have to parse it yourself. If you work with a "DataParent" object, instead of using .value (which gives you ["$", "$L16", None, {"children": ["$", "$L17", None, {"profile": {}}]}]), use .children (which gives you a "Data" object with a .content of {"profile": {}}). Check the type file to see which classes you're interested in, and their attributes.
- You can also use .find on BeautifulFD to return only the first occurrence of your query, or None if not found.
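The difference between .find_iter, .find and recursive=False is easier to see on a toy model. The sketch below is not njsparser code: Node, find_iter, find and SAMPLE_TREE are hypothetical stand-ins that only mimic the behaviour described in the notes above (recursive lookup by class, first match or None).

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional, Set


@dataclass
class Node:
    """Toy stand-in for a flight data object: a class name, content, and nested objects."""
    cls: str
    content: Optional[dict] = None
    children: List["Node"] = field(default_factory=list)


def find_iter(nodes: Iterable[Node], class_names: Set[str], recursive: bool = True) -> Iterator[Node]:
    """Yield every node whose class is in class_names, descending into children if recursive."""
    for node in nodes:
        if node.cls in class_names:
            yield node
        if recursive:
            yield from find_iter(node.children, class_names, recursive)


def find(nodes: Iterable[Node], class_names: Set[str], recursive: bool = True) -> Optional[Node]:
    """Return only the first match, or None if nothing matches."""
    return next(find_iter(nodes, class_names, recursive), None)


# A "Data" nested in a "DataParent", itself nested in a "DataContainer".
SAMPLE_TREE = [
    Node("DataContainer", children=[
        Node("DataParent", children=[Node("Data", content={"profile": {}})]),
    ]),
]

if __name__ == "__main__":
    print(find(SAMPLE_TREE, {"Data"}))                   # found through recursion
    print(find(SAMPLE_TREE, {"Data"}, recursive=False))  # top level only, so nothing is found
```

In this model, turning recursion off only inspects the top-level objects, which is why the nested "Data" is no longer returned.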
Parsing <script id='__NEXT_DATA__'>
Just do:
import njsparser
html_text = ...
data = njsparser.get_next_data(html_text)
If the page contains a <script id='__NEXT_DATA__'> script, it returns the JSON-loaded data; otherwise it returns None.
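For illustration only, here is a standard-library sketch of the same idea: locate the <script id='__NEXT_DATA__'> element, load its JSON, and return None when it is missing. njsparser itself relies on lxml and orjson, so this is not its implementation. The names _NextDataFinder and sketch_get_next_data, and the sample page, are hypothetical.

```python
import json
from html.parser import HTMLParser
from typing import Any, Optional


class _NextDataFinder(HTMLParser):
    """Collects the text inside <script id="__NEXT_DATA__">, if the page has one."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self.raw: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True
            self.raw = ""

    def handle_data(self, data):
        # Script content may arrive in several pieces, so accumulate it.
        if self._inside:
            self.raw += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False


def sketch_get_next_data(html: str) -> Optional[Any]:
    """Return the JSON-loaded __NEXT_DATA__ content, or None if the script is absent."""
    finder = _NextDataFinder()
    finder.feed(html)
    finder.close()
    return None if finder.raw is None else json.loads(finder.raw)


# Hypothetical page, only here to make the sketch runnable.
SAMPLE_PAGE = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {}}, "page": "/"}'
    "</script></body></html>"
)

if __name__ == "__main__":
    print(sketch_get_next_data(SAMPLE_PAGE))
    print(sketch_get_next_data("<html><body>no next data here</body></html>"))
```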