
A powerful parser and explorer for any website built with NextJS.
It parses flight data (from the self.__next_f.push scripts) and the data from the __NEXT_DATA__ script.
It uses only lxml, orjson, and pydantic to guarantee fast and efficient data parsing and processing.
pip install njsparser
You can run the CLI with any of three commands:
njsp
njsparser
python3 -m njsparser.cli
It has a single function: displaying information about the website. Pass the --help argument with the command for more details.

__next_f

The data you find in __next_f is called flight data, and contains data in the React format. You can parse it easily with njsparser.
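For orientation, here is a minimal standard-library sketch of what the raw pushes look like before any parsing. It is not how njsparser works internally. The sample HTML, ScriptCollector, and extract_push_payloads are made up for illustration, and the sketch assumes each script holds a single self.__next_f.push([...]) call whose argument is valid JSON.

```python
import json
import re
from html.parser import HTMLParser

# Captures the JSON array passed to self.__next_f.push(...).
PUSH_RE = re.compile(r"self\.__next_f\.push\((\[.*\])\)", re.DOTALL)


class ScriptCollector(HTMLParser):
    """Collects the text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def extract_push_payloads(html):
    """Return the decoded argument of each self.__next_f.push call found."""
    collector = ScriptCollector()
    collector.feed(html)
    payloads = []
    for script in collector.scripts:
        match = PUSH_RE.search(script)
        if match:
            payloads.append(json.loads(match.group(1)))
    return payloads


# Illustrative page, not taken from a real website.
sample_html = (
    "<html><body>"
    '<script>self.__next_f.push([1,"1:{\\"user\\":{\\"tagline\\":\\"hello\\"}}\\n"])</script>'
    "</body></html>"
)

payloads = extract_push_payloads(sample_html)
print(payloads)
```

Each payload is still a raw chunk of text at this point. Turning those chunks into typed objects is the part njsparser handles for you.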
We will build a parser for the flight data example.
First, check that a self.__next_f.push call at the beginning of a script contains the data you are searching for. Here I am searching for the description "I should really have a better hobby, but this is it..." in my page, and I can see that it sits inside a self.__next_f.push call.
import requests
import njsparser
import json
# Here I get my page's html
response = requests.get("https://mediux.pro/user/r3draid3r04").text
# Then I parse it with njsparser
fd = njsparser.BeautifulFD(response)
# Then I will write to json the content of the flight data
with open("fd.json", "w") as write:
    # I use the njsparser.default function to support the dump of the flight data objects.
    json.dump(fd, write, indent=4, default=njsparser.default)

Then, in fd.json, I follow the path from the "value" root to my found string, and look at the value of "cls". Here it is "Data". Now that I know the "cls" (class) of the object my data is contained in, I can search for it in my BeautifulFD object:
import requests
import njsparser
import json
# Here I get my page's html
response = requests.get("https://mediux.pro/user/r3draid3r04").text
# Then I parse it with njsparser
fd = njsparser.BeautifulFD(response)
# Then I iterate over the different classes `Data` in my flight data.
for data in fd.find_iter([njsparser.T.Data]):
    # Then I make sure that the content of my data is not None, and
    # check if the key `"user"` is in the data's content. If it is,
    # then I break out of the search loop.
    if data.content is not None and "user" in data.content:
        break
else:
    # The `else` of a `for` loop runs only when the loop ended without `break`,
    # so if I didn't find it, I raise an error.
    raise ValueError("no Data object with a 'user' key was found")
# Now I have the data of my user
user = data.content["user"]
# And I can print the string I was searching for before
print(user["tagline"])
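The manual step in this walkthrough, locating the string in fd.json and reading the "cls" next to its "value" root, can be tedious on a large dump. Below is a small standard-library sketch of a helper that lists the path to every string containing a given fragment. find_paths and sample_dump are hypothetical names; the sample only imitates the "value"/"cls" layout described above, and the real dump may be shaped differently.

```python
def find_paths(node, needle, path=()):
    """Yield the path (tuple of keys and indexes) to each string containing `needle`."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from find_paths(child, needle, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_paths(child, needle, path + (index,))
    elif isinstance(node, str) and needle in node:
        yield path


# Hypothetical stand-in for the contents of fd.json.
sample_dump = {
    "1": {
        "value": {
            "user": {
                "tagline": "I should really have a better hobby, but this is it..."
            }
        },
        "cls": "Data",
    },
    "2": {"value": ["unrelated"], "cls": "Data"},
}

paths = list(find_paths(sample_dump, "better hobby"))
print(paths)

# The entry that owns the string is the first element of the path,
# and its "cls" tells which class to search for.
owner = sample_dump[paths[0][0]]
print(owner["cls"])
```

To use it on a real dump, load fd.json with json.load and pass the result to find_paths in place of sample_dump.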
More information:
If the object you are looking for is nested inside another one (for example a "Data" in a "DataParent", or in a "DataContainer"), .find_iter will also find it recursively (unless you set recursive=False).
A "Data" object has a .content attribute. If you use .value, you will end up with the raw value and will have to parse it yourself. If you work with a "DataParent" object, instead of using .value (which gives you ["$", "$L16", None, {"children": ["$", "$L17", None, {"profile": {}}]}]), use .children (which gives you a "Data" object with a .content of {"profile": {}}).
Check the type file to see which classes you are interested in, and their attributes.
You can also use .find on BeautifulFD to return only the first occurrence of your query, or None if not found.

<script id='__NEXT_DATA__'>
Just do:
import njsparser
html_text = ...
data = njsparser.get_next_data(html_text)
If the page contains a <script id='__NEXT_DATA__'> script, it returns the JSON-loaded data; otherwise it returns None.
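To make that behaviour concrete, here is a rough standard-library sketch of the same idea: find the script whose id is __NEXT_DATA__, decode its JSON, and return None when no such script exists. It is only an illustration and not njsparser's implementation (which, per the description above, relies on lxml and orjson). extract_next_data and the sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class NextDataCollector(HTMLParser):
    """Captures the text of the <script id="__NEXT_DATA__"> element, if any."""

    def __init__(self):
        super().__init__()
        self._capturing = False
        self.found = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._capturing = True
            self.found = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._capturing = False

    def handle_data(self, data):
        if self._capturing:
            self.chunks.append(data)


def extract_next_data(html_text):
    """Return the decoded __NEXT_DATA__ JSON, or None when the script is absent."""
    collector = NextDataCollector()
    collector.feed(html_text)
    if not collector.found:
        return None
    return json.loads("".join(collector.chunks))


# Illustrative page, not taken from a real website.
sample_page = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"user": "demo"}}, "page": "/"}'
    "</script></body></html>"
)

next_data = extract_next_data(sample_page)
print(next_data["props"]["pageProps"])
print(extract_next_data("<html><body></body></html>"))
```

Returning None instead of raising keeps the call safe on pages that use only flight data, so the caller can fall back to the BeautifulFD route.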