@rubensworks/process-sparql-benchmark-results
Personal tools to post-process query results from SPARQL benchmarks
This package contains a set of tools to post-process query results from SPARQL benchmarks.
Be aware that this package is primarily intended for my personal use, so I may make assumptions here and there that are not valuable for everyone. Furthermore, code in this package is minimally tested, and breaking changes may occur at any time.
Most of these tools work on output produced by jbr experiments.
$ npm install -g @rubensworks/process-sparql-benchmark-results
or
$ yarn global add @rubensworks/process-sparql-benchmark-results
After installing this package, the psbr CLI tool becomes available. It consists of a number of sub-commands, explained below.
Full usage:
psbr <command>
Commands:
psbr csv Creates CSV files
psbr stats Derive statistics from experiments
psbr tex Creates a LaTeX TikZ plot file
Options:
--version Show version number [boolean]
--cwd The current working directory [string] [default: .]
-v, --verbose If more logging output should be generated [boolean]
--help Show help
Subcommands of psbr tex create LaTeX TikZ files.
Invoking psbr tex query with any number of experiment directories creates the plot_queries_data.tex and plot_queries_data.csv files.
The TeX file contains a TikZ figure, which can be used to generate a vector plot from the CSV data.
By default, it looks for the query-times.csv file within each experiment directory.
This file is expected to look as follows:
name;id;results;time;timestamps
interactive-short-4;0;0;4;
interactive-short-4;1;0;1;
interactive-short-5;0;0;0;
interactive-short-5;1;0;0;
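As an illustration of the input layout above, the following minimal Python sketch reads a semicolon-delimited query-times.csv and averages the time column per query name. It is not how psbr itself processes the file; the function name mean_time_per_query and the choice of the mean as aggregate are assumptions made for this example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Sample taken from the expected query-times.csv layout shown above.
SAMPLE = """name;id;results;time;timestamps
interactive-short-4;0;0;4;
interactive-short-4;1;0;1;
interactive-short-5;0;0;0;
interactive-short-5;1;0;0;
"""


def mean_time_per_query(handle, delimiter=";"):
    """Group rows by the `name` column and average the `time` column.

    Hypothetical helper for illustration; psbr may aggregate differently.
    """
    times = defaultdict(list)
    for row in csv.DictReader(handle, delimiter=delimiter):
        times[row["name"]].append(float(row["time"]))
    return {name: mean(values) for name, values in times.items()}


print(mean_time_per_query(io.StringIO(SAMPLE)))
```

To read a real file instead of the inline sample, pass an open file handle, for example open("query-times.csv", newline="").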
SVG output:
Optionally, an SVG file can also be created using the --svg flag.
Override labels:
If you want to override the labels of experiments or queries, you can do this as follows:
$ psbr tex query combination_* --overrideCombinationLabels 'a,b,c,d' --overrideQueryLabels 'Q1,Q2'
Full usage:
psbr tex query <experiment-dir...>
Plot the query execution times from the given experiments
Options:
--version Show version number [boolean]
--cwd The current working directory
[string] [default: .]
-v, --verbose If more logging output should be generated
[boolean]
--help Show help [boolean]
-q, --queryRegex Regex for queries to include (before any
label overrides). Examples: '^C', '^[^C]',
... [string]
-n, --name Custom output file name
[string] [default: "plot_queries_data"]
-c, --color Color scheme name from colorbrewer2.org
[string]
--maxY The upper limit of the Y-axis. Defaults to
maximum Y value [number]
--legend If a legend should be included
[boolean] [default: true]
--legendPos The legend position X,Y (anchor north-east)
[string] [default: "1.0,1.0"]
--logY If the Y-Axis must have a log scale
[boolean] [default: false]
--inputName Custom input file name per experiment
[string] [default: "query-times.csv"]
--inputDelimiter Delimiter for the input CSV file
[string] [default: ";"]
--overrideCombinationLabels Comma-separated list of combination labels to
use [string]
--overrideQueryLabels Comma-separated list of query labels to use
[string]
--svg If the tex file should be converted to svg
via the tex2svg command
[boolean] [default: false]
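The two example patterns given for --queryRegex can be tried out in isolation. The sketch below uses Python's re module rather than the JavaScript regex engine the tool runs on (the two agree for these simple patterns), and the query names are hypothetical.

```python
import re


def filter_queries(names, pattern):
    """Keep the names in which `pattern` matches somewhere (re.search)."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Hypothetical query names, not taken from any real benchmark.
QUERIES = ["C1", "C2", "S1", "F3"]

print(filter_queries(QUERIES, "^C"))     # names starting with "C"
print(filter_queries(QUERIES, "^[^C]"))  # names not starting with "C"
```

Since the help text states that the regex applies before any label overrides, patterns should target the original query names rather than the labels passed via --overrideQueryLabels.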
This command creates a vector, CSV-based (LaTeX/TikZ) plot that compares the query result arrival times over all given experiments. This is useful for comparing the impact of different approaches on query result arrival times.
Invoking psbr tex queryTimes with any number of experiment directories creates the query_times.tex and query_times.csv files.
The TeX file contains a TikZ figure, which can be used to generate a vector plot from the CSV data.
By default, it looks for the query-times.csv file within each experiment directory.
This file is expected to look as follows:
name;id;results;time;timestamps
interactive-short-4;0;0;4;
interactive-short-4;1;0;1;
interactive-short-5;0;0;0;
interactive-short-5;1;0;0;
Optionally, another index of the query can be selected by suffixing the query name with .index, such as psbr tex queryTimes L2.3.
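The .index suffix convention can be sketched as a small parsing helper. This is only an illustration of the naming scheme; how psbr actually parses the argument, and which index it uses when no suffix is given, is not documented here, so the helper returns None in that case. The name split_query_spec is hypothetical.

```python
def split_query_spec(spec):
    """Split 'L2.3' into ('L2', 3); return (spec, None) without a numeric suffix.

    Hypothetical helper: psbr's own parsing may differ.
    """
    name, sep, suffix = spec.rpartition(".")
    if sep and suffix.isdigit():
        return name, int(suffix)
    return spec, None


print(split_query_spec("L2.3"))
print(split_query_spec("L2"))
```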
Full usage:
psbr tex queryTimes <query> <experiment-dir...>
Plot the query result arrival times from the given experiments
Options:
--version Show version number [boolean]
--cwd The current working directory
[string] [default: .]
-v, --verbose If more logging output should be generated
[boolean]
--help Show help [boolean]
-n, --name Custom output file name
[string] [default: "query_times"]
-c, --color Color scheme name from colorbrewer2.org
[string]
--maxY The upper limit of the Y-axis. Defaults to
maximum Y value [number]
--legend If a legend should be included
[boolean] [default: true]
--legendPos The legend position X,Y (anchor north-east)
[string] [default: "1.0,1.0"]
--logY If the Y-Axis must have a log scale
[boolean] [default: false]
--inputName Custom input file name per experiment
[string] [default: "query-times.csv"]
--inputDelimiter Delimiter for the input CSV file
[string] [default: ";"]
--overrideCombinationLabels Comma-separated list of combination labels to
use [string]
--svg If the tex file should be converted to svg
via the tex2svg command
[boolean] [default: false]
Subcommands of psbr csv create CSV files.
Invoking psbr csv query with any number of experiment directories creates the data_all.csv file.
By default, it looks for the query-times.csv file within each experiment directory.
This file is expected to look as follows:
name;id;results;time;timestamps
interactive-short-4;0;0;4;
interactive-short-4;1;0;1;
interactive-short-5;0;0;0;
interactive-short-5;1;0;0;
Concretely, it outputs a data_all.csv file that looks as follows:
combination;time
output/combination_0;10963
output/combination_0;10849
output/combination_0;11912
output/combination_1;16320
output/combination_1;12389
output/combination_1;11944
You can, for example, use this data to test for a statistical difference between two combinations in R as follows:
data <- read.csv('./data_all.csv', sep = ';')
# Calculate medians
aggregate(data$time, list(data$combination), median)
# Compare combinations with the Kruskal-Wallis test (nonparametric, suitable for non-normal distributions)
kruskal.test(time ~ combination, data = data[which(data$combination=='output/combination_0' | data$combination=='output/combination_1'),])
# If p < 0.05, the combinations differ significantly at the 95% confidence level.
# If p >= 0.05, no significant difference between the combinations can be concluded.
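For readers without R, the median-per-combination step can be reproduced with the Python standard library. This is a minimal sketch over the data_all.csv sample shown earlier; it covers only the aggregation, not the Kruskal-Wallis test, and the function name median_time_per_combination is an assumption of this example.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Sample taken from the data_all.csv layout shown above.
SAMPLE = """combination;time
output/combination_0;10963
output/combination_0;10849
output/combination_0;11912
output/combination_1;16320
output/combination_1;12389
output/combination_1;11944
"""


def median_time_per_combination(handle, delimiter=";"):
    """Group rows by `combination` and take the median of `time`."""
    times = defaultdict(list)
    for row in csv.DictReader(handle, delimiter=delimiter):
        times[row["combination"]].append(float(row["time"]))
    return {combo: median(values) for combo, values in times.items()}


print(median_time_per_combination(io.StringIO(SAMPLE)))
```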
Full usage:
psbr csv query <experiment-dir...>
Summarize all query execution times from the given experiments
Options:
--version Show version number [boolean]
--cwd The current working directory [string] [default: .]
-v, --verbose If more logging output should be generated [boolean]
--help Show help [boolean]
-q, --queryRegex Regex for queries to include (before any label
overrides). Examples: '^C', '^[^C]', ... [string]
-n, --name Custom output file name
[string] [default: "data_all.csv"]
--inputName Custom input file name per experiment
[string] [default: "query-times.csv"]
--inputDelimiter Delimiter for the input CSV file [string] [default: ";"]
Subcommands of psbr stats derive statistics.
Invoking psbr stats docker with a Docker experimental result file prints a summary of its contents.
For instance, a Docker-based experiment may produce a file such as stats-server.csv, which is expected to look as follows:
cpu_percentage,memory,memory_percentage,received,transmitted
0.012520939947780679,122277888,5.856675939622464,882,0
0,122228736,5.8543217336372875,882,0
Based on this, the following summary will be printed:
CPU: 0.01 %
Memory relative: 5.86 %
Memory absolute: 116.59 MB
Received: 0.00 MB
Transmitted: 0.00 MB
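The sample numbers above are consistent with averaging each column over all rows and converting byte counts to MB with a factor of 1024 * 1024. The sketch below reproduces the figures under those two assumptions; the tool may aggregate differently (for instance, for the network columns), and the function name and dictionary keys are hypothetical.

```python
import csv
import io
from statistics import mean

# Sample taken from the stats-server.csv layout shown above.
SAMPLE = """cpu_percentage,memory,memory_percentage,received,transmitted
0.012520939947780679,122277888,5.856675939622464,882,0
0,122228736,5.8543217336372875,882,0
"""

BYTES_PER_MB = 1024 * 1024  # assumption: matches the 116.59 MB in the sample


def summarize_docker_stats(handle, digits=2):
    """Average every column and format it with `digits` decimals.

    Averaging is an assumption that happens to match the sample output.
    """
    rows = list(csv.DictReader(handle))

    def avg(column, divisor=1):
        value = mean(float(row[column]) for row in rows) / divisor
        return f"{value:.{digits}f}"

    return {
        "cpu_percent": avg("cpu_percentage"),
        "memory_mb": avg("memory", BYTES_PER_MB),
        "memory_percent": avg("memory_percentage"),
        "received_mb": avg("received", BYTES_PER_MB),
        "transmitted_mb": avg("transmitted", BYTES_PER_MB),
    }


print(summarize_docker_stats(io.StringIO(SAMPLE)))
```

The digits parameter mirrors the --digits option of psbr stats docker, which defaults to 2.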
Full usage:
psbr stats docker <docker-csv-file>
Show the stats of a Docker CSV file from an experiment
Options:
--version Show version number [boolean]
--cwd The current working directory [string] [default: .]
-v, --verbose If more logging output should be generated [boolean]
--help Show help [boolean]
-d, --digits The precision of output numbers [number] [default: 2]
This code is copyrighted by Ghent University – imec and released under the MIT license.