biomedical_id_resolver.js
A JavaScript library for resolving biological IDs to their equivalent IDs in batch.
Install
$ npm i biomedical_id_resolver
Usage
const resolve = require('biomedical_id_resolver');
let input = {
"Gene": ["NCBIGene:1017", "NCBIGene:1018", "HGNC:1177"],
"ChemicalSubstance": ["CHEBI:15377"],
"Disease": ["MONDO:0004976"],
"Cell": ["CL:0002372"]
};
(async () => {
  console.log(await resolve(input));
})();
Output Schema
- The output is a JavaScript object.
- The root keys are the CURIEs (e.g. NCBIGene:1017) that were passed in as input.
- The values represent the resolved identifiers.
- Each CURIE has 4 required fields:
  - id: the primary id (selected based on the ranking described in the next section) and its label
  - curies: an array in which each element is a resolved id in CURIE format
  - type: the semantic type of the identifier
  - db_ids: the original ids from the source databases, which may or may not be in CURIE format
- If an ID cannot be resolved using the package, it will have an additional field called "flag", with value equal to "failed".
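Since unresolved IDs are marked with the "flag" field, a caller will often want to separate them from the resolved ones. The following is a minimal sketch of that step. The helper name partitionByFlag and the sample object are hypothetical: the sample is hand-written to match the schema described here, and is not real output from the package.

```javascript
// Hypothetical helper, not part of biomedical_id_resolver.
// Splits a resolver output object into resolved entries and failed CURIEs.
function partitionByFlag(output) {
  const resolved = {};
  const failed = [];
  for (const [curie, entry] of Object.entries(output)) {
    if (entry.flag === 'failed') {
      failed.push(curie);
    } else {
      resolved[curie] = entry;
    }
  }
  return { resolved, failed };
}

// Hand-written sample shaped like the schema (an assumption, not real output).
const sampleOutput = {
  'NCBIGene:1017': {
    id: { label: 'cyclin dependent kinase 2', identifier: 'NCBIGene:1017' },
    curies: ['NCBIGene:1017', 'HGNC:1771'],
    type: 'Gene',
    db_ids: { NCBIGene: ['1017'], HGNC: ['1771'] },
  },
  'Gene:unknown': {
    id: { label: 'Gene:unknown', identifier: 'Gene:unknown' },
    curies: ['Gene:unknown'],
    type: 'Gene',
    db_ids: {},
    flag: 'failed',
  },
};

const { resolved, failed } = partitionByFlag(sampleOutput);
console.log(Object.keys(resolved)); // [ 'NCBIGene:1017' ]
console.log(failed); // [ 'Gene:unknown' ]
```

In real use, the object passed to partitionByFlag would be the value returned by `await resolve(input)`.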
Example Output
{
"NCBIGene:1017": {
"id": {
"label": "cyclin dependent kinase 2",
"identifier": "NCBIGene:1017"
},
"db_ids": {
"NCBIGene": [
"1017"
],
"ENSEMBL": [
"ENSG00000123374"
],
"HGNC": [
"1771"
],
"SYMBOL": [
"CDK2"
],
"UMLS": [
"C1332733",
"C0108855"
],
"name": [
"cyclin dependent kinase 2"
]
},
"type": "Gene",
"curies": [
"NCBIGene:1017",
"ENSEMBL:ENSG00000123374",
"HGNC:1771",
"SYMBOL:CDK2",
"UMLS:C1332733",
"UMLS:C0108855"
]
}
}
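In the example, every entry in "curies" corresponds to a prefix and value pair in "db_ids", with the "name" field left out. The sketch below reproduces that relationship for this one example. It is an illustration inferred from the sample output, not the package's actual implementation, and the helper name dbIdsToCuries is hypothetical.

```javascript
// Hypothetical helper: rebuilds a CURIE list from a db_ids object.
// Assumption: "name" holds labels rather than identifiers, so it is skipped.
function dbIdsToCuries(dbIds) {
  const curies = [];
  for (const [prefix, ids] of Object.entries(dbIds)) {
    if (prefix === 'name') continue;
    for (const id of ids) {
      // Assumption: values that already start with "<prefix>:" are kept as-is.
      curies.push(id.startsWith(prefix + ':') ? id : `${prefix}:${id}`);
    }
  }
  return curies;
}

// db_ids copied from the example output.
const dbIds = {
  NCBIGene: ['1017'],
  ENSEMBL: ['ENSG00000123374'],
  HGNC: ['1771'],
  SYMBOL: ['CDK2'],
  UMLS: ['C1332733', 'C0108855'],
  name: ['cyclin dependent kinase 2'],
};

const rebuiltCuries = dbIdsToCuries(dbIds);
console.log(rebuiltCuries);
```

For this input, the result matches the "curies" array in the example, in the same order.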
Available Semantic Types & prefixes
Gene ID resolution is done through MyGene.info API
- Gene
- NCBIGene
- ENSEMBL
- HGNC
- SYMBOL
- OMIM
- UniProtKB
- UMLS
- name
Variant ID resolution is done through MyVariant.info API
- SequenceVariant
- HGVS
- DBSNP
- MYVARIANT_HG19
- ClinVar
ChemicalSubstance ID resolution is done through MyChem.info API
- ChemicalSubstance
- CHEBI
- CHEMBL.COMPOUND
- DRUGBANK
- PUBCHEM
- MESH
- INCHI
- INCHIKEY
- UNII
- KEGG
- UMLS
- name
- id
Disease ID Resolution is done through MyDisease.info API
- Disease
- MONDO
- DOID
- OMIM
- ORPHANET
- EFO
- UMLS
- MESH
- name
Pathway ID Resolution is done through biothings.ncats.io/geneset API
- Pathway
- Reactome
- KEGG
- PHARMGKB
- WIKIPATHWAYS
- name
MolecularActivity ID Resolution is done through BioThings Gene Ontology Molecular Activity API
- MolecularActivity
- GO
- MetaCyc
- RHEA
- KEGG.REACTION
- Reactome
CellularComponent ID Resolution is done through BioThings Gene Ontology Cellular Component API
- CellularComponent
- GO
- MESH
- UMLS
- NCIT
- SNOMEDCT
- UBERON
- CL
- name
BiologicalProcess ID Resolution is done through BioThings Gene Ontology Biological Process API
- BiologicalProcess
- GO
- MetaCyc
- Reactome
- name
AnatomicalEntity ID Resolution is done through BioThings UBERON API
- AnatomicalEntity
- UBERON
- UMLS
- NCIT
- MESH
- name
PhenotypicFeature ID Resolution is done through BioThings HPO API
- PhenotypicFeature
- UMLS
- SNOMEDCT
- HP
- MEDDRA
- EFO
- NCIT
- MESH
- MP
- name
Cell ID Resolution is done through nodenormalization API
- Cell
- CL
- UMLS
- NCIT
- MESH
- UBERON
- SNOMEDCT
- name
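Because each semantic type accepts only certain prefixes, it can help to check an input object before sending it to the resolver. The following sketch does that for four of the types, with prefixes copied from the lists in this section. The names SUPPORTED_PREFIXES and findUnsupported are hypothetical and not part of the package. How the package itself treats an unsupported prefix is not covered here, so this check is only a client-side convenience.

```javascript
// Prefixes copied from this README for four of the semantic types.
// Hypothetical constant, not exported by biomedical_id_resolver.
const SUPPORTED_PREFIXES = {
  Gene: ['NCBIGene', 'ENSEMBL', 'HGNC', 'SYMBOL', 'OMIM', 'UniProtKB', 'UMLS', 'name'],
  ChemicalSubstance: ['CHEBI', 'CHEMBL.COMPOUND', 'DRUGBANK', 'PUBCHEM', 'MESH',
    'INCHI', 'INCHIKEY', 'UNII', 'KEGG', 'UMLS', 'name', 'id'],
  Disease: ['MONDO', 'DOID', 'OMIM', 'ORPHANET', 'EFO', 'UMLS', 'MESH', 'name'],
  Cell: ['CL', 'UMLS', 'NCIT', 'MESH', 'UBERON', 'SNOMEDCT', 'name'],
};

// Hypothetical helper: lists every CURIE whose semantic type or prefix
// does not appear in SUPPORTED_PREFIXES.
function findUnsupported(input) {
  const problems = [];
  for (const [type, curies] of Object.entries(input)) {
    const allowed = SUPPORTED_PREFIXES[type];
    for (const curie of curies) {
      const prefix = curie.split(':')[0];
      if (!allowed || !allowed.includes(prefix)) {
        problems.push({ type, curie });
      }
    }
  }
  return problems;
}

const candidateInput = {
  Gene: ['NCBIGene:1017', 'FOO:1'],
  Cell: ['CL:0002372'],
  Unknown: ['X:1'],
};

const problems = findUnsupported(candidateInput);
console.log(problems);
// [ { type: 'Gene', curie: 'FOO:1' }, { type: 'Unknown', curie: 'X:1' } ]
```

To cover the remaining semantic types, add their prefixes to the table in the same way.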
Development
- Install Node 12 or later. You can use the package manager of your choice. Tests need to pass in Node 12 and 14.
- Clone this repository.
- Run npm ci to install the dependencies.
- Scripts are stored in the /src folder.
- Add tests to the /__tests__ folder.
- Run npm run release to bump the version and generate the change log.
- Run npx depcheck to check for unused packages in package.json.
CHANGELOG
See CHANGELOG.md