maplib is a knowledge graph construction library for building RDF knowledge graphs using template expansion (OTTR Templates). maplib features SPARQL and SHACL engines that are available as the graph is being constructed, allowing enrichment and validation. It can construct and validate knowledge graphs with millions of nodes in seconds.
maplib allows you to leverage your existing skills with Pandas or Polars to extract and wrangle data from existing databases and spreadsheets, before applying simple templates to them to build a knowledge graph.
Template expansion is typically zero-copy and nearly instantaneous, and the built-in SPARQL and SHACL engines mean you can query, inspect, enrich and validate the knowledge graph immediately.
maplib is written in Rust. It is built on Apache Arrow using Pola.rs, and uses libraries from Oxigraph for handling linked data as well as parsing SPARQL queries.
The package is published on PyPI and the API is documented online. Install it with pip:
pip install maplib
Please send us a message, e.g. on LinkedIn (search for Data Treehouse) or on our webpage if you want to try out SHACL.
We can easily map DataFrames to RDF-graphs using the Python library. Below is a reproduction of the example in the paper [1]. Assume that we have a DataFrame given by:
import polars as pl
pl.Config.set_fmt_str_lengths(150)
pi = "https://github.com/DataTreehouse/maplib/pizza#"
df = pl.DataFrame({
    "p": [pi + "Hawaiian", pi + "Grandiosa"],
    "c": [pi + "CAN", pi + "NOR"],
    "ings": [[pi + "Pineapple", pi + "Ham"],
             [pi + "Pepper", pi + "Meat"]]
})
print(df)
That is, our DataFrame is:
p | c | ings |
---|---|---|
str | str | list[str] |
"https://.../pizza#Hawaiian" | "https://.../maplib/pizza#CAN" | [".../pizza#Pineapple", ".../pizza#Ham"] |
"https://.../pizza#Grandiosa" | "https://.../maplib/pizza#NOR" | [".../pizza#Pepper", ".../pizza#Meat"] |
Then we can define an OTTR template, and create our knowledge graph by expanding this template with our DataFrame as input:
from maplib import Mapping, Prefix, Template, Argument, Parameter, Variable, RDFType, Triple, a
# Rebind pi from the namespace string to a Prefix object
pi = Prefix("pi", pi)
p_var = Variable("p")
c_var = Variable("c")
ings_var = Variable("ings")
template = Template(
    iri=pi.suf("PizzaTemplate"),
    parameters=[
        Parameter(variable=p_var, rdf_type=RDFType.IRI()),
        Parameter(variable=c_var, rdf_type=RDFType.IRI()),
        Parameter(variable=ings_var, rdf_type=RDFType.Nested(RDFType.IRI()))
    ],
    instances=[
        Triple(p_var, a(), pi.suf("Pizza")),
        Triple(p_var, pi.suf("fromCountry"), c_var),
        # Expand the list column: one hasIngredient triple per list element
        Triple(
            p_var,
            pi.suf("hasIngredient"),
            Argument(term=ings_var, list_expand=True),
            list_expander="cross")
    ]
)
m = Mapping()
m.expand(template, df)
# Insert the triples produced by a CONSTRUCT query into the mapping
hpizzas = """
PREFIX pi:<https://github.com/DataTreehouse/maplib/pizza#>
CONSTRUCT { ?p a pi:HeterodoxPizza }
WHERE {
    ?p a pi:Pizza .
    ?p pi:hasIngredient pi:Pineapple .
}"""
m.insert(hpizzas)
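To make the effect of list expansion concrete, the following plain-Python sketch mimics which triples the template above yields for the two rows of the DataFrame. It is a conceptual illustration only and not how maplib works internally; the helper `expand_pizza_rows` and the tuple representation of triples are assumptions made for this example.

```python
# Conceptual sketch only: this is NOT maplib's implementation.
# expand_pizza_rows is a hypothetical helper introduced for illustration.
PI = "https://github.com/DataTreehouse/maplib/pizza#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

rows = [
    {"p": PI + "Hawaiian", "c": PI + "CAN",
     "ings": [PI + "Pineapple", PI + "Ham"]},
    {"p": PI + "Grandiosa", "c": PI + "NOR",
     "ings": [PI + "Pepper", PI + "Meat"]},
]


def expand_pizza_rows(rows):
    """Return the set of (subject, verb, object) triples for the pizza template."""
    triples = set()
    for row in rows:
        triples.add((row["p"], RDF_TYPE, PI + "Pizza"))
        triples.add((row["p"], PI + "fromCountry", row["c"]))
        # List expansion over a single list argument:
        # one hasIngredient triple per element of the list.
        for ingredient in row["ings"]:
            triples.add((row["p"], PI + "hasIngredient", ingredient))
    return triples


triples = expand_pizza_rows(rows)
for s, v, o in sorted(triples):
    print(s, v, o)
```

Each row contributes one type triple, one fromCountry triple, and one hasIngredient triple per list element.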
We can immediately query the mapped knowledge graph:
m.query("""
PREFIX pi:<https://github.com/DataTreehouse/maplib/pizza#>
SELECT ?p ?i WHERE {
?p a pi:Pizza .
?p pi:hasIngredient ?i .
}
""")
The query returns its result as a DataFrame, with one row per pizza and ingredient pair.
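To see which (?p, ?i) bindings a pattern like this matches, the join can be mimicked in plain Python over a set of triples. This is a rough sketch under stated assumptions, not how maplib evaluates SPARQL: the `graph` set is a hand-written stand-in for the mapped triples, and `select_pizza_ingredients` is a hypothetical helper.

```python
# Conceptual sketch only: this is NOT how maplib evaluates SPARQL.
PI = "https://github.com/DataTreehouse/maplib/pizza#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hand-written stand-in for the triples created by the template expansion.
graph = {
    (PI + "Hawaiian", RDF_TYPE, PI + "Pizza"),
    (PI + "Hawaiian", PI + "fromCountry", PI + "CAN"),
    (PI + "Hawaiian", PI + "hasIngredient", PI + "Pineapple"),
    (PI + "Hawaiian", PI + "hasIngredient", PI + "Ham"),
    (PI + "Grandiosa", RDF_TYPE, PI + "Pizza"),
    (PI + "Grandiosa", PI + "fromCountry", PI + "NOR"),
    (PI + "Grandiosa", PI + "hasIngredient", PI + "Pepper"),
    (PI + "Grandiosa", PI + "hasIngredient", PI + "Meat"),
}


def select_pizza_ingredients(graph):
    """Mimic: SELECT ?p ?i WHERE { ?p a pi:Pizza . ?p pi:hasIngredient ?i }."""
    # First pattern: every subject typed as pi:Pizza.
    pizzas = {s for (s, v, o) in graph if v == RDF_TYPE and o == PI + "Pizza"}
    # Second pattern, joined on ?p: ingredients of those subjects.
    return sorted(
        (s, o) for (s, v, o) in graph
        if v == PI + "hasIngredient" and s in pizzas
    )


pairs = select_pizza_ingredients(graph)
for p, i in pairs:
    # Strip the namespace to keep the printed rows short.
    print(p[len(PI):], i[len(PI):])
```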
Next, we are able to perform a construct query, which creates new triples but does not insert them.
hpizzas = """
PREFIX pi:<https://github.com/DataTreehouse/maplib/pizza#>
CONSTRUCT { ?p a pi:UnorthodoxPizza }
WHERE {
?p a pi:Pizza .
?p pi:hasIngredient pi:Pineapple .
}"""
res = m.query(hpizzas)
res[0]
The resulting triples are given below:
subject | verb | object |
---|---|---|
str | str | str |
"https://.../pizza#Hawaiian" | "http://.../22-rdf-syntax-ns#type" | "https://.../pizza#UnorthodoxPizza" |
If we are happy with the output of this construct-query, we can insert it in the mapping state. Afterwards we check that the triple is added with a query.
m.insert(hpizzas)
m.query("""
PREFIX pi:<https://github.com/DataTreehouse/maplib/pizza#>
SELECT ?p WHERE {
?p a pi:UnorthodoxPizza
}
""")
Indeed, we have added the triple:
p |
---|
str |
"https://github.com/DataTreehouse/maplib/pizza#Hawaiian" |
The API is simple and is centered on a single class, Mapping, with a few methods such as expand, query and insert, all of which are used above. The API documentation is available online.
An associated paper [1] includes benchmarks showing superior performance and scalability. OTTR is described in [2].
[1] M. Bakken, "maplib: Interactive, literal RDF model mapping for industry," in IEEE Access, doi: 10.1109/ACCESS.2023.3269093.
[2] M. G. Skjæveland, D. P. Lupp, L. H. Karlsen, and J. W. Klüwer, "OTTR: Formal templates for pattern-based ontology engineering," in WOP (Book), 2021, pp. 349–377.
All code produced since August 1st, 2023 is copyrighted to Data Treehouse AS with an Apache 2.0 license unless otherwise noted.
All code produced before August 1st, 2023 is copyrighted to Prediktor AS with an Apache 2.0 license unless otherwise noted, and has been financed by The Research Council of Norway (grant no. 316656) and Prediktor AS as part of a PhD degree. The code at this state is archived in the repository at https://github.com/magbak/maplib.