
Security News
MCP Community Begins Work on Official MCP Metaregistry
The MCP community is launching an official registry to standardize AI tool discovery and let agents dynamically find and install MCP servers.
This project presents the SDM-RDFizer, an interpreter of mapping rules that allows the transformation of (un)structured data into RDF knowledge graphs. The current version of the SDM-RDFizer assumes mapping rules are defined in the RDF Mapping Language (RML) by Dimou et al.
This project presents the SDM-RDFizer, an interpreter of mapping rules that allows the transformation of (un)structured data into RDF knowledge graphs. The current version of the SDM-RDFizer assumes mapping rules are defined in the RDF Mapping Language (RML) by Dimou et al. The SDM-RDFizer implements optimized data structures and relational algebra operators that enable an efficient execution of RML triple maps even in the presence of Big data. SDM-RDFizer is able to process data from heterogeneous data sources (CSV, JSON, RDB, XML) processing each set of RML rules (TriplesMap) in a multi-thread safe procedure.
In version 4.0 of SDM-RDFizer, we have addressed the problem of efficiency in KG creation in terms of memory storage. SDM-RDFizer version4.0 includes a new module called "TriplesMap Planning" a.k.a. TMP which defines an optimized evaluation plan for the execution of triples maps. Additionally, version 4.0 extends the previously included module (i.e. TriplesMap Execution a.k.a. TME) by introducing a new operator for compressing data stored in the data structures. These new features can be configured using two new parameters added to the configuration file, named "large_file" and "ordered".
We have performed extensive empirical evaluation on SDM-RDFizer version4.0 in terms of execution time and memory usage. The experiments are set up to empirically compare the impact of data duplicate rates, data size, and the complexity and the execution order of the triples maps on two versions of SDM-RDFizer (i.e. version4.0 and version3.6) and other existing engines including RMLMapper v4.7 and RocketRML ), in terms of execution time and memory usage. The experiments are performed on two different benchmarks:
The results of explained experiments can be summarized as the following:
As observed in the figures above, both versions of SDM-RDFizer completed all the testbeds successfully while the other two engines have cases of timeout. SDM-RDFizer version3.6 and RocketRML version 1.7.0 are competitive in simple testbeds, however, SDM-RDFizer version4.0 shows the best performance in all the testbeds.
As illustrated in the figures above, SDM-RDFizer version4.0 has the smallest peak in memory usage compared to the previous version of SDM-RDFizer.
The results of the execution of SDM-RDFizer has been described in the following research reports:
Enrique Iglesias, Samaneh Jozashoori, David Chaves-Fraga, Diego Collarana, and Maria-Esther Vidal. 2020. SDM-RDFizer: An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs. The 29th ACM International Conference on Information and Knowledge Management (CIKM ’20).
Samaneh Jozashoori, David Chaves-Fraga, Enrique Iglesias, Oscar Corcho, and Maria-Esther Vidal. 2020. FunMap: Efficient Execution of Functional Mappings for Knowledge Graph Creation. The 19th International Semantic Web Conference - Research Track (ISWC 2020).
Samaneh Jozashoori and Maria-Esther Vidal. MapSDI: A Scaled-up Semantic Data Integration Framework for Knowledge Graph Creation. The 27th International Conference on Cooperative Information Systems (CoopIS 2019).
David Chaves-Fraga, Kemele M. Endris, Enrique Iglesias, Oscar Corcho, and Maria-Esther Vidal. What are the Parameters that Affect the Construction of a Knowledge Graph?. The 18th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE 2019).
David Chaves-Fraga, Antón Adolfo, Jhon Toledo, and Oscar Corcho. ONETT: Systematic Knowledge Graph Generation for National Access Points. The 1st International Workshop on Semantics for Transport co-located with SEMANTiCS 2019.
David Chaves-Fraga, Freddy Priyatna, Andrea Cimmino, Jhon Toledo, Edna Ruckhaus, and Oscar Corcho. GTFS-Madrid-Bench: A benchmark for virtual knowledge graph access in the transport domain. Journal of Web Semantics, 2020.
Additional References:
The SDM-RDFizer is used in the creation of the knowledge graphs of EU H2020 projects and national projects where the Scientific Data Management group participates. These projects include:
The SDM-RDFizer is also used in EU H2020, EIT-Digital and Spanish national projects where the Ontology Engineering Group (Technical University of Madrid) participates. These projects, mainly focused on the domain of transportation and smart cities, include:
Other projects were the SDM-RDFizer is also used:
From PyPI (https://pypi.org/project/rdfizer/):
python3 -m pip install rdfizer
python3 -m rdfizer -c /path/to/config/file
From Github/Docker: Visit the wiki of the repository to learn how to install and run the SDM-RDFizer. You can also take a look to our demo at: https://www.youtube.com/watch?v=DpH_57M1uOE
You can easily customize your own configurations from the set of features that SDM-RDFizer offers by changing the values of the parameters in the config file. The descriptions of each parameter and the possible values are provided here; "ordered" and "large_file" are the new features provided by SDM-RDFizer version 4.0.
4.7.5.1
See the results of the SDM-RDFizer over the RML test-cases at the RML Implementation Report. SDM-RDFizer version 4.0 is tested over the latest published test cases before the release.
See the results of the experimental evaluations of SDM-RDFizer version 3.* at SDM-RDFizer-Experiments repository
This work is licensed under Apache 2.0
1. Conference paper published as a resource at CIKM2020
2. Journal paper in Semantic Web Journal
What are the parameters that affect the construction of a knowledge graph?
FunMap: Efficient Execution of Functional Mappings for Knowledge Graph Creation
Scaling up knowledge graph creation to large and heterogeneous data sources
EABlock: a declarative entity alignment block for knowledge graph creation pipelines
InterpretME: A tool for interpretations of machine learning models over knowledge graphs
FlexRML: A Flexible and Memory Efficient Knowledge Graph Materializer
RMLStreamer-SISO: An RDF Stream Generator from Streaming Heterogeneous Data
Morph-KGC: Scalable knowledge graph materialization with mapping partitions
The SDM-RDFizer has been developed by members of the Scientific Data Management Group at TIB, as an ongoing research effort. The development is coordinated and supervised by Maria-Esther Vidal (maria.vidal@tib.eu). We strongly encourage you to please report any issues you have with the SDM-RDFizer. You can do that over our contact email or creating a new issue here on GitHub. The SDM-RDFizer has been implemented by Enrique Iglesias (current version, iglesias@l3s.de) and Guillermo Betancourt (version 0.1, guillermojbetancourt@gmail.com) under the supervision of Samaneh Jozashoori (samaneh.jozashoori@tib.eu), David Chaves-Fraga (dchaves@fi.upm.es), and Kemele Endris (kemele.endris@tib.eu)
FAQs
This project presents the SDM-RDFizer, an interpreter of mapping rules that allows the transformation of (un)structured data into RDF knowledge graphs. The current version of the SDM-RDFizer assumes mapping rules are defined in the RDF Mapping Language (RML) by Dimou et al.
We found that rdfizer demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 2 open source maintainers collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
The MCP community is launching an official registry to standardize AI tool discovery and let agents dynamically find and install MCP servers.
Research
Security News
Socket uncovers an npm Trojan stealing crypto wallets and BullX credentials via obfuscated code and Telegram exfiltration.
Research
Security News
Malicious npm packages posing as developer tools target macOS Cursor IDE users, stealing credentials and modifying files to gain persistent backdoor access.