
Reffy is your W3C spec dependencies exploration companion. It features a short set of tools to study spec references as well as WebIDL term definitions and references found in W3C specifications.
See published reports for human-readable examples of reports generated by Reffy.
To launch the crawler and the report study tool, follow these steps:

1. Clone the repository and install dependencies:
   git clone git@github.com:tidoust/reffy.git
   cd reffy
   npm install
2. Create a config.json file, initialized with { "w3cApiKey": [API key] }, so that the tools can interact with the W3C API (a minimal example appears after this list).
3. Run npm run w3c to launch the crawler and the report study tool on W3C specifications. To use the latest published versions in ./TR/ instead of Editor's Drafts, run npm run w3c-tr. To launch the tools on WHATWG specifications, run npm run whatwg.
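For instance, a minimal config.json only contains the API key (the value below is a placeholder):

```json
{
  "w3cApiKey": "your-w3c-api-key"
}
```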
Under the hood, these commands run the following steps (and related commands) in turn; a combined example follows the list:

1. node crawl-specs.js ./specs-w3c.json ./reports/w3c [tr]. Add tr to tell the crawler to load the latest published version of TR specifications instead of the latest Editor's Draft.
2. node study-crawl.js ./reports/w3c/crawl.json [url]. When the url parameter is given, the resulting analysis will only contain the results for the spec at that URL (multiple URLs may be given as a comma-separated value list without spaces). You will probably want to redirect the output to a file, e.g. using node study-crawl.js ./reports/w3c/crawl.json > reports/w3c/study.json.
3. node generate-report.js ./reports/w3c/study.json [perspec|dep]. By default, the tool generates a report per anomaly; pass perspec to create a report per specification and dep to generate a dependencies report. You will probably want to redirect the output to a file, e.g. using node generate-report.js ./reports/w3c/study.json > reports/w3c/index.md.
4. pandoc reports/w3c/index.md -f markdown -t html5 --section-divs -s --template report-template.html -o reports/w3c/index.html (where reports/w3c/index.md is the Markdown report generated in the previous step).
5. node generate-report.js ./reports/w3c/crawl.json diff https://tidoust.github.io/reffy-reports/w3c/crawl.json to generate a diff report against a reference crawl report.
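Putting these steps together, a complete run against W3C specs, with outputs redirected as suggested above, could look like:

```sh
node crawl-specs.js ./specs-w3c.json ./reports/w3c
node study-crawl.js ./reports/w3c/crawl.json > reports/w3c/study.json
node generate-report.js ./reports/w3c/study.json > reports/w3c/index.md
pandoc reports/w3c/index.md -f markdown -t html5 --section-divs -s --template report-template.html -o reports/w3c/index.html
```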
Some notes:

- The crawler keeps a local cache of the files it downloads, in the cache subfolder in particular.
- The ./ prefix is needed to point the crawler and study tools at local files for the time being (one of the many things to improve in the code!).

Reffy's crawler takes an initial list of spec URLs as input and generates a machine-readable report with facts about each spec, such as the normative and informative references it contains and the WebIDL terms it defines and references.
Reffy's report study tool takes the machine-readable report generated by the crawler and creates a study report of potential anomalies found in that report. The study report can then easily be converted to a human-readable Markdown report. Reported potential anomalies include, for instance, the use of obsolete WebIDL constructs (e.g. [] instead of FrozenArray).

See the related WebIDLPedia project and its repo.
Some of the tools that compose Reffy may also be used directly.
The references parser takes the URL of a spec as input and generates a JSON structure that lists the normative and informative references found in the spec. To run the references parser: node parse-references.js [url]
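The exact shape of the generated JSON is not documented here; purely as an illustration (the property names and entries below are hypothetical, not the tool's actual output), the structure separates the two kinds of references:

```json
{
  "normative": [ { "name": "WEBIDL", "url": "https://heycam.github.io/webidl/" } ],
  "informative": [ { "name": "CSS", "url": "https://www.w3.org/TR/CSS/" } ]
}
```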
The WebIDL extractor takes the URL of a spec as input and outputs the IDL definitions found in the spec as one block of text. To run the extractor: node extract-webidl.js [url]
The WebIDL parser takes the URL of a spec as input and generates a JSON structure that describes WebIDL term definitions and references that the spec contains. The parser uses WebIDL2 to parse the WebIDL content found in the spec. To run the WebIDL parser: node parse-webidl.js [url]
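As a minimal sketch of the underlying parsing step (this shows the WebIDL2 library call, not Reffy's own output format):

```js
// webidl2 exposes a parse() function that turns WebIDL text into an AST.
const { parse } = require("webidl2");

const ast = parse("interface Dahut { attribute DOMString type; };");
console.log(ast[0].name);                      // "Dahut"
console.log(ast[0].members.map(m => m.name));  // [ "type" ]
```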
The Spec finder takes a JSON crawl report as input and checks a couple of sites that list Web specifications to detect new specifications that are not yet part of the crawl. To run the spec finder: node find-spec.js ./results.json
The crawl results merger merges a new JSON crawl report into a reference one. This tool is typically useful to replace the crawl results of a given specification with the results of a new run of the crawler on that specification. To run the crawl results merger: node merge-crawl-results.js [new crawl report] [reference crawl report] [crawl report to create]
The spec checker takes the URL of a spec, a reference crawl report and the name of the study report to create as inputs. It crawls and studies the given spec against the reference crawl report. Essentially, it applies the crawler, the merger and the study tool in order to produce the anomalies report for the given spec. Note that it can check multiple specs at once, provided the URLs are passed as a comma-separated value list without spaces. To run the spec checker: node check-specs.js [url] [reference crawl report] [study report to create]
For instance:
node parse-references.js https://w3c.github.io/presentation-api/
node extract-webidl.js https://www.w3.org/TR/webrtc/
node parse-webidl.js https://fetch.spec.whatwg.org/
node check-specs.js https://www.w3.org/TR/webstorage/ ./reports/w3c/crawl.json ./reports/study-webstorage.json
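A run of the crawl results merger follows the same pattern (the file names here are illustrative):

```sh
node merge-crawl-results.js ./reports/w3c/new-crawl.json ./reports/w3c/crawl.json ./reports/w3c/merged.json
```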
Reffy is still at an early stage of development. It may crash from time to time.
Reffy should be able to parse most of the W3C/WHATWG specifications that define WebIDL terms (both published versions and Editor's Drafts). The tool may work with other types of specs, but has not been tested with any of them.
The recommended lists appear in specs-w3c.json and spec-whatwg.json. Both files reference a common list in specs-common.json. These lists were built out of the JavaScript APIs TR bucket, semi-manually completed to create a more comprehensive list.
It should be possible to crawl other specs, but note Reffy has not yet been tested with specs that do not define any WebIDL term, and would need to be adjusted to return "interesting" information. Feel free to try out other specs and report any issue!
Given the URL of a spec, the crawler basically goes through the following steps:

1. If the URL looks like http(s)://www.w3.org/TR/[something], the crawler extracts the shortname of the specification, and sends a couple of requests to the W3C API to retrieve the URL of the Editor's Draft, or the URL of the latest published version if the URL of the Editor's Draft could not be found. This new URL replaces the given one.
2. The crawler then loads the spec in a browser-like environment that exposes a Window object, and extracts the information it needs from the resulting document.

The crawler processes 10 specifications at a time. Network and parsing errors should be reported in the crawl results.
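As a minimal sketch (illustrative, not Reffy's actual code), processing 10 specifications at a time while capturing per-spec errors could be written as:

```js
// Crawl specs in batches of 10; record errors instead of aborting the crawl.
async function crawlAll(urls, crawlSpec) {
  const results = [];
  for (let i = 0; i < urls.length; i += 10) {
    const batch = urls.slice(i, i + 10);
    const settled = await Promise.allSettled(batch.map(crawlSpec));
    settled.forEach((outcome, j) => {
      results.push(outcome.status === "fulfilled"
        ? outcome.value
        : { url: batch[j], error: String(outcome.reason) });
    });
  }
  return results;
}
```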
The crawler reads parameters from the config.json file. To be able to interact with the W3C API, that file must contain a w3cApiKey entry whose value is a valid W3C API key.
Optional parameters (see the example after this list):

- avoidNetworkRequests: set this flag to true to tell the crawler to use the cache entry for a URL directly, instead of sending a conditional HTTP request to check whether the entry is still valid. This parameter is typically useful when developing Reffy's code to work offline.
- resetCache: set this flag to true to tell the crawler to reset the contents of the local cache when it starts.
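Putting it together, a config.json with the optional flags explicitly set might look like this (the API key value is a placeholder):

```json
{
  "w3cApiKey": "your-w3c-api-key",
  "avoidNetworkRequests": true,
  "resetCache": false
}
```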
Some rules or exceptions to the rule are hardcoded. In particular, per-spec exceptions live in the completeWithInfoFromW3CApi method in crawl-specs.js (the crawler loads the latest published version for these specs), the loadSpecificationFromHtml function in util.js hardcodes spec-processing rules that may need to be extended to support other cases, and further rules appear in addKnownVersions in util.js.

Authors so far are François Daoust and Dominique Hazaël-Massieux.
Additional ideas, bugs and/or code contributions are most welcome. Create issues on GitHub as needed!
The code is available under an MIT license.
v11.0.0 - 2022-11-28
This new major version modifies and completes the CSS extraction logic. See #1117 for details.
No other change was made, meaning breaking and non-breaking changes only affect CSS extracts.
- Functions are now referenced by their name without enclosing < and > because they are not defined in specs with these characters (as opposed to types). Beware though, references to functions in value syntax do use enclosing < and > characters.
- The valuespaces property at the root level is now named values. An array is used there as well. The values property lists both function and type definitions that are not namespaced to anything in particular (it used to also contain namespaced definitions).
- Selectors now appear in a selectors property at the root level.
- Definitions that are namespaced to another construct now appear in a values property directly within that definition, i.e. among the values of that definition.
- The extraction now reports potential anomalies in a warnings property at the root of the extract. Four types of anomalies are reported, including cases where a CSS construct is not defined with a <dfn> (or where that <dfn> does not have a data-dfn-type attribute that identifies a CSS construct), and cases where the spec defines a value (value, function or type) for something and that something cannot be found in the spec.
- To distinguish between the function, type and value definitions listed in a values property, definitions that appear in a values property have a type property.
- Only values that are namespaced to a definition appear in its values property. Non-namespaced values are not. For instance, <quote> is not listed as a value of the <content-list> type, even though its value syntax references it. This is to avoid duplicating constructs in the extracts.
- Similarly, open-quote is only listed as a value of <quote> but neither as a value of the <content-list> type that references <quote> nor as a value of the content property that references <content-list>. This is also to avoid duplicating constructs in the extracts.
- A spec may define a <custom-ident> value construct whose actual value is the <custom-ident> type construct defined in CSS Values. Having both a namespaced value and a non-namespaced <type> is somewhat common in CSS specs.
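As a rough illustration only (the entries below are simplified and hypothetical; the authoritative structure is defined by Reffy's CSS extraction code), an extract in the new format might look like this:

```json
{
  "values": [
    {
      "name": "<quote>",
      "type": "type",
      "value": "open-quote | close-quote | no-open-quote | no-close-quote",
      "values": [
        { "name": "open-quote", "type": "value", "value": "open-quote" }
      ]
    }
  ],
  "selectors": [],
  "warnings": []
}
```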