obonet: load OBO-formatted ontologies into networkx




Read OBO-formatted ontologies in Python.
obonet is
- user friendly
- succinct
- pythonic
- modern
- simple and tested
- lightweight
- networkx leveraging
This Python package loads OBO-serialized ontologies into networks.
The function obonet.read_obo() takes an .obo file and returns a networkx.MultiDiGraph representation of the ontology.
The parser was designed for versions 1.2 and 1.4 of the OBO specification.
Usage
See pyproject.toml for the minimum Python version required and the dependencies.
OBO files can be read from a path, URL, or open file handle.
Compression is inferred from the path's extension.
See example usage below:
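import networkx
import obonet

# Read the taxrank ontology from a URL
url = 'https://github.com/dhimmel/obonet/raw/main/tests/data/taxrank.obo'
graph = obonet.read_obo(url)

# Or read the xz-compressed copy; compression is inferred from the .xz extension
url = 'https://github.com/dhimmel/obonet/raw/main/tests/data/taxrank.obo.xz'
graph = obonet.read_obo(url)

# Number of nodes (terms)
len(graph)

# Number of edges (relationships)
graph.number_of_edges()

# Check whether the ontology is a directed acyclic graph
networkx.is_directed_acyclic_graph(graph)

# Map each term ID to its name
id_to_name = {id_: data.get('name') for id_, data in graph.nodes(data=True)}
id_to_name['TAXRANK:0000006']  # TAXRANK:0000006 is species

# Edges point from subterm to superterm, so descendants returns the superterms of species
networkx.descendants(graph, 'TAXRANK:0000006')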
For a more detailed tutorial, see the Gene Ontology example notebook.
Comparison
This package specializes in reading OBO files into a networkx.MultiDiGraph.
A more general ontology-to-NetworkX reader is available in the Python nxontology package via the nxontology.imports.pronto_to_multidigraph function.
This function takes a pronto.Ontology object, which can be loaded from an OBO file, OBO Graphs JSON file, or Ontology Web Language 2 RDF/XML file (OWL).
Using pronto_to_multidigraph allows creating a MultiDiGraph similar to the one created by obonet, with some differences in the amount of metadata retained.
The primary focus of the nxontology package is to provide an NXOntology class for representing ontologies based around a networkx.DiGraph.
NXOntology provides optimized implementations for computing node similarity and other intrinsic ontology metrics.
There are two important differences between a DiGraph for NXOntology and the MultiDiGraph produced by obonet:
- NXOntology is based on a DiGraph that does not allow multiple edges between the same two nodes.
  Multiple edges between the same two nodes must therefore be collapsed.
  By default, it only considers is a / rdfs:subClassOf relationships, but using pronto_to_multidigraph to create the NXOntology allows for retaining additional relationship types, like part of in the case of the Gene Ontology.
- NXOntology reverses the direction of relationships so edges go from superterm to subterm.
  Traditionally in ontologies, is a relationships go from subterm to superterm, but this is confusing because graph functions such as networkx.descendants then return more general terms.
  NXOntology reverses edges so functions such as ancestors refer to more general concepts and descendants refer to more specific concepts.
The nxontology.imports.multidigraph_to_digraph function converts from a MultiDiGraph, like the one produced by obonet, to a DiGraph by filtering to the desired relationship types, reversing edges, and collapsing parallel edges.
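The three steps described above (filter, reverse, collapse) can be sketched with plain networkx. This is an illustration only, not the nxontology implementation: the function name collapse_to_digraph, its rel_types parameter, the rel_types edge attribute, and the toy terms are all hypothetical.

```python
import networkx as nx


def collapse_to_digraph(multigraph, rel_types=("is_a",)):
    """Hypothetical helper: filter edge keys, reverse edges, collapse parallels."""
    digraph = nx.DiGraph()
    digraph.add_nodes_from(multigraph.nodes(data=True))
    for subterm, superterm, key in multigraph.edges(keys=True):
        # Step 1: keep only the requested relationship types.
        if key not in rel_types:
            continue
        # Step 2: reverse the edge so it runs from superterm to subterm.
        # Step 3: parallel edges collapse into one edge that records every type.
        if digraph.has_edge(superterm, subterm):
            digraph[superterm][subterm]["rel_types"].add(key)
        else:
            digraph.add_edge(superterm, subterm, rel_types={key})
    return digraph


# Toy graph in the obonet orientation: subterm -> superterm, keyed by type.
multi = nx.MultiDiGraph()
multi.add_edge("nucleus", "organelle", key="is_a")
multi.add_edge("nucleus", "cell", key="part_of")
multi.add_edge("nucleolus", "nucleus", key="part_of")
multi.add_edge("nucleolus", "organelle", key="is_a")
multi.add_edge("nucleolus", "organelle", key="part_of")  # parallel edge

is_a_only = collapse_to_digraph(multi)
with_part_of = collapse_to_digraph(multi, rel_types=("is_a", "part_of"))

# After reversal, descendants are the more specific terms.
print(sorted(nx.descendants(with_part_of, "organelle")))
```

Recording the collapsed relationship types on the edge is one possible design choice; a converter could equally keep only the first type it encounters.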
Installation
The recommended approach is to install the latest release from PyPI using:
pip install obonet
However, if you'd like to install the most recent version from GitHub, use:
pip install git+https://github.com/dhimmel/obonet.git
Contributing

We welcome feature suggestions and community contributions.
Currently, only reading OBO files is supported.
Develop
Some development commands:
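# create and activate a virtual environment
python3 -m venv ./env
source env/bin/activate

# editable installation for development
pip install --editable ".[dev]"

# install the pre-commit hooks, then run every check
pre-commit install
pre-commit run --all-files

# run the tests
pytest

# list commits since the latest tag, for release notes
git fetch --tags origin main
OLD_TAG=$(git describe --tags --abbrev=0)
git log --oneline --decorate=no --reverse "$OLD_TAG..HEAD"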
Maintainers can make a new release at https://github.com/dhimmel/obonet/releases/new.