Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

semsql

Package Overview
Dependencies
Maintainers
3
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

semsql

  • 0.3.3
  • PyPI
  • Socket score

Maintainers
3

SemSQL: standard SQL views for RDF/OWL ontologies

PyPI version

This project provides a standard collection of SQL tables/views for ontologies, such that you can make queries like this, to find all terms starting with Abnormality in HPO.

$ sqlite db/hp.db
sqlite> SELECT * FROM rdfs_label_statement WHERE value LIKE 'Abnormality of %';
stanzasubjectpredicateobjectvaluedatatypelanguage
HP:0000002HP:0000002rdfs:labelAbnormality of body heightxsd:string
HP:0000014HP:0000014rdfs:labelAbnormality of the bladderxsd:string
HP:0000022HP:0000022rdfs:labelAbnormality of male internal genitaliaxsd:string
HP:0000032HP:0000032rdfs:labelAbnormality of male external genitaliaxsd:string

Ready-made SQLite3 builds can also be downloaded for any ontology in OBO, using URLs such as https://s3.amazonaws.com/bbop-sqlite/hp.db

relation-graph is used to pre-generate tables of entailed edges. For example, all is-a and part-of ancestors of finger in Uberon:

$ sqlite db/uberon.db
sqlite> SELECT * FROM entailed_edge WHERE subject='UBERON:0002389' and predicate IN ('rdfs:subClassOf', 'BFO:0000050');
subject, predicate, object
UBERON:0002389, BFO:0000050, UBERON:0015212
UBERON:0002389, BFO:0000050, UBERON:5002389
UBERON:0002389, BFO:0000050, UBERON:5002544
UBERON:0002389, rdfs:subClassOf, UBERON:0000061
UBERON:0002389, rdfs:subClassOf, UBERON:0000465
UBERON:0002389, rdfs:subClassOf, UBERON:0000475

SQLite provides many advantages

  • files can be downloaded and subsequently queried without network latency
  • compared to querying a static rdf, owl, or obo file, there is no startup/parse delay
  • robust and performant
  • excellent support in many languages

Although the focus is on SQLite, this library can also be used for other DBMSs like PostgreSQL, MySQL, Oracle, etc

Tutorials

Installation

SemSQL comes with a helper Python library. Use of this is optional. To install:

pip install semsql

Download ready-made SQLite databases

Pre-generated SQLite database are created weekly for all OBO ontologies and a selection of others (see ontologies.yaml)

To download:

semsql download obi -o obi.db

Or simply download using URL of the form:

Attaching databases

If you are using sqlite3, then databases can be attached to facilitate cross-database joins.

For example, many ontologies use ORCID URIs as the object of dcterms:contributor and dcterms:creator statements, but these are left "dangling". Metadata about these orcids are available in the semsql orcid database instance (derived from wikidata-orcid-ontology), in the Orcid table.

You can use ATTACH DATABASE to connect two databases, for example:

$ sqlite3 db/cl.dl
sqlite> attach 'db/orcid.db' as orcid_db;
sqlite> select * from contributor inner join orcid_db.orcid on (orcid.id=contributor.object) where orcid.label like 'Chris%';
obo:cl.owl|obo:cl.owl|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
CL:0010001|CL:0010001|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
CL:0010002|CL:0010002|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
CL:0010003|CL:0010003|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
CL:0010004|CL:0010004|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
UBERON:0000093|UBERON:0000093|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
UBERON:0000094|UBERON:0000094|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
UBERON:0000095|UBERON:0000095|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
UBERON:0000179|UBERON:0000179|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
UBERON:0000201|UBERON:0000201|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
UBERON:0000202|UBERON:0000202|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
UBERON:0000203|UBERON:0000203|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall
UBERON:0000204|UBERON:0000204|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall

Creating a SQLite database from an OWL file

There are two protocols for doing this:

  1. install build dependencies
  2. use Docker

In either case:

  • The input MUST be in RDF/XML serialization and have the suffix .owl:
  • use robot to convert if format is different

We are planning to simplify this process in future.

1. Build a SQLite database directly

This requires some basic technical knowledge about how to install things on your machine and how to put things in your PATH. It does not require Docker.

Requirements:

After installing these and putting both relation-graph and rdftab.rs in your path:

semsql make foo.db

This assumes foo.owl is in the same folder

2. Use Docker

There are two docker images that can be used:

The ODK image may lag behind

docker run  -v $PWD:/work -w /work -ti linkml/semantic-sql semsql make foo.db

Schema

See Schema Documentation

The source schema is in LinkML - this is then compiled down to SQL Tables and Views

The basic idea is as follows:

There are a small number of "base tables":

All other tables are actually views (derived tables), and are provided for convenience.

ORM Layer

A SemSQL relational database can be accessed in exactly the same way as any other SQLdb

For convenience, we provide a Python Object-Relational Mapping (ORM) layer using SQL Alchemy. This allows for code uchlike the following, which joins RdfsSubclassOfStatement and existential restrictions:

engine = create_engine(f"sqlite:////path/to/go.db")
SessionClass = sessionmaker(bind=engine)
session = SessionClass()
q = session.query(RdfsSubclassOfStatement)
q = q.add_entity(OwlSomeValuesFrom)
q = q.join(OwlSomeValuesFrom, RdfsSubclassOfStatement.object == OwlSomeValuesFrom.id)

lines = []
for ax, ex in q.all():
    line = f'{ax.subject} subClassOf {ex.on_property} SOME {ex.filler}'
    logging.info(line)
    lines.append(line)

(this example is just for illustration - to do the same thing there is a simpler Edge relation)

Applications

The semsql python library is intentionally low level - we recommend using the ontology-access-kit

For example:

runoak -i db/envo.db search t~biome

You can also pass in an OWL file and have the sqlite be made on the fly

runoak -i sqlite:envo.owl search t~biome

Even if using OAK, it can be useful to access SQL tables directly to do complex multi-join queries in a performant way.

Optimization

poetry run semsql view2table edge --full-index | sqlite3 $db/mydb.db

See indexes for some ready-made indexes

FAQs


Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc