Read and write GFF3 data performantly. This module aims to be a complete implementation of the GFF3 specification.
Supports Parent and Derives_from relationships between features.
$ npm install --save @gmod/gff
const gff = require('@gmod/gff').default
// or in ES6 (recommended)
import gff from '@gmod/gff'
const fs = require('fs')
// parse a file from a file name
// parses only features and sequences by default,
// set options to parse directives and/or comments
fs.createReadStream('path/to/my/file.gff3')
  .pipe(gff.parseStream({ parseAll: true }))
  .on('data', (data) => {
    if (data.directive) {
      console.log('got a directive', data)
    } else if (data.comment) {
      console.log('got a comment', data)
    } else if (data.sequence) {
      console.log('got a sequence from a FASTA section')
    } else {
      console.log('got a feature', data)
    }
  })
// parse a string of gff3 synchronously
const stringOfGFF3 = fs.readFileSync('my_annotations.gff3').toString()
const arrayOfThings = gff.parseStringSync(stringOfGFF3)
// format an array of items to a string
const newStringOfGFF3 = gff.formatSync(arrayOfThings)
// format a stream of things to a stream of text.
// inserts sync marks automatically.
myStreamOfGFF3Objects
  .pipe(gff.formatStream())
  .pipe(fs.createWriteStream('my_new.gff3'))
// format a stream of things and write it to
// a gff3 file. inserts sync marks and a
// '##gff-version 3' header if one is not
// already present
gff.formatFile(
  myStreamOfGFF3Objects,
  fs.createWriteStream('my_new_2.gff3', { encoding: 'utf8' }),
)
In GFF3, features can have more than one location. We parse features
as arrays of all the lines that share that feature's ID.
Values that are . in the GFF3 are null in the output.
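The grouping described above can be pictured with a small sketch. This is not the library's implementation; groupLinesById is a hypothetical helper that takes already-parsed feature lines and collects lines sharing an ID into one array, assuming the ID is the first value of the ID attribute.

```javascript
// Hypothetical helper, not part of @gmod/gff: groups parsed feature
// lines so that lines sharing an ID end up in the same array.
function groupLinesById(lines) {
  const byId = new Map()
  const features = []
  for (const line of lines) {
    const ids = (line.attributes && line.attributes.ID) || []
    const id = ids[0]
    if (id === undefined) {
      // A line without an ID is a feature on its own.
      features.push([line])
    } else if (byId.has(id)) {
      // Another location of a feature we have already seen.
      byId.get(id).push(line)
    } else {
      const feature = [line]
      byId.set(id, feature)
      features.push(feature)
    }
  }
  return features
}

const grouped = groupLinesById([
  { type: 'CDS', start: 1201, end: 1500, attributes: { ID: ['cds00001'] } },
  { type: 'gene', start: 1000, end: 9000, attributes: { ID: ['gene00001'] } },
  { type: 'CDS', start: 3000, end: 3902, attributes: { ID: ['cds00001'] } },
])
console.log(grouped.length) // 2 features: the two-part CDS and the gene
```

Each element of the result is an array of lines, which matches the shape of the parsed features shown in this README.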
A simple feature that's located in just one place:
[
  {
    "seq_id": "ctg123",
    "source": null,
    "type": "gene",
    "start": 1000,
    "end": 9000,
    "score": null,
    "strand": "+",
    "phase": null,
    "attributes": {
      "ID": [
        "gene00001"
      ],
      "Name": [
        "EDEN"
      ]
    },
    "child_features": [],
    "derived_features": []
  }
]
A CDS called cds00001 located in two places:
[
  {
    "seq_id": "ctg123",
    "source": null,
    "type": "CDS",
    "start": 1201,
    "end": 1500,
    "score": null,
    "strand": "+",
    "phase": "0",
    "attributes": {
      "ID": ["cds00001"],
      "Parent": ["mRNA00001"]
    },
    "child_features": [],
    "derived_features": []
  },
  {
    "seq_id": "ctg123",
    "source": null,
    "type": "CDS",
    "start": 3000,
    "end": 3902,
    "score": null,
    "strand": "+",
    "phase": "0",
    "attributes": {
      "ID": ["cds00001"],
      "Parent": ["mRNA00001"]
    },
    "child_features": [],
    "derived_features": []
  }
]
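Because a feature is an array of lines, code that consumes parsed features usually loops over every location. As a minimal sketch (coveredLength is a hypothetical helper, and it assumes the locations do not overlap), the total length of the two-part CDS above can be computed like this. GFF3 coordinates are 1-based and inclusive, hence the + 1.

```javascript
// Hypothetical helper: sums the lengths of all locations of one feature.
// Assumes the locations do not overlap.
function coveredLength(feature) {
  return feature.reduce(
    (total, line) => total + (line.end - line.start + 1),
    0,
  )
}

const cds00001 = [
  { type: 'CDS', start: 1201, end: 1500 },
  { type: 'CDS', start: 3000, end: 3902 },
]
console.log(coveredLength(cds00001)) // 300 + 903 = 1203
```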
parseDirective("##gff-version 3\n")
// returns
{
  "directive": "gff-version",
  "value": "3"
}
parseDirective('##sequence-region ctg123 1 1497228\n')
// returns
{
  "directive": "sequence-region",
  "value": "ctg123 1 1497228",
  "seq_id": "ctg123",
  "start": "1",
  "end": "1497228"
}
parseComment('# hi this is a comment\n')
// returns
{
  "comment": "hi this is a comment"
}
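To make the directive and comment shapes above concrete, here is a simplified sketch of how such lines could be turned into objects. It is an illustration only, not the library's parser: sketchParseMetaLine is a hypothetical name, and returning null for ### sync marks is a choice made for this sketch.

```javascript
// Hypothetical, simplified line parser for directives and comments.
function sketchParseMetaLine(line) {
  const text = line.trim()
  // "###" sync marks carry no data in this sketch.
  if (/^###+$/.test(text)) return null
  const directive = /^##\s*(\S+)\s*(.*)$/.exec(text)
  if (directive) {
    const [, name, value] = directive
    const result = { directive: name, value }
    if (name === 'sequence-region') {
      // Coordinates stay strings, as in the example output above.
      const [seq_id, start, end] = value.split(/\s+/)
      Object.assign(result, { seq_id, start, end })
    }
    return result
  }
  const comment = /^#+\s*(.*)$/.exec(text)
  if (comment) return { comment: comment[1] }
  return null // not a directive or comment line
}

console.log(sketchParseMetaLine('##sequence-region ctg123 1 1497228\n'))
console.log(sketchParseMetaLine('# hi this is a comment\n'))
```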
Sequences come from any embedded ##FASTA section in the GFF3 file.
parseSequences(`##FASTA
>ctgA test contig
ACTGACTAGCTAGCATCAGCGTCGTAGCTATTATATTACGGTAGCCA`)
// returns
[
  {
    "id": "ctgA",
    "description": "test contig",
    "sequence": "ACTGACTAGCTAGCATCAGCGTCGTAGCTATTATATTACGGTAGCCA"
  }
]
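The sequence objects above map directly onto FASTA records. As a rough sketch of that mapping (sketchParseFasta is a hypothetical function, not the library's parseSequences), a FASTA section can be split into id, description, and sequence like this:

```javascript
// Hypothetical, simplified FASTA-section parser.
function sketchParseFasta(text) {
  const sequences = []
  let current = null
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim()
    if (line === '' || line.startsWith('##FASTA')) continue
    if (line.startsWith('>')) {
      // Header: the first word is the id, the rest is the description.
      const match = /^>\s*(\S+)\s*(.*)$/.exec(line)
      if (!match) continue
      current = match[2]
        ? { id: match[1], description: match[2], sequence: '' }
        : { id: match[1], sequence: '' }
      sequences.push(current)
    } else if (current) {
      // Sequence data may span several lines; concatenate them.
      current.sequence += line
    }
  }
  return sequences
}

const parsedFasta = sketchParseFasta(`##FASTA
>ctgA test contig
ACTGACTAGCTAGCATCAGCGTCGTAGCTATTATATTACGGTAGCCA`)
console.log(parsedFasta[0].id) // ctgA
```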
Parser options
encoding
Text encoding of the input GFF3. Default 'utf8'.
Type: BufferEncoding
parseFeatures
Whether to parse features. Default true.
Type: boolean
parseDirectives
Whether to parse directives. Default false.
Type: boolean
parseComments
Whether to parse comments. Default false.
Type: boolean
parseSequences
Whether to parse sequences. Default true.
Type: boolean
parseAll
Parse all features, directives, comments, and sequences. Overrides other parsing options. Default false.
Type: boolean
bufferSize
Maximum number of GFF3 lines to buffer. Default 1000.
Type: number
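The way these defaults combine, and how parseAll overrides the individual switches, can be sketched as follows. This is an illustration under assumptions, not the library's code: resolveParseOptions is a hypothetical helper, and any option name not shown elsewhere in this README (parseFeatures, parseDirectives, parseComments, parseSequences) is inferred from the descriptions above.

```javascript
// Defaults as described above. Option names other than parseAll,
// encoding and bufferSize are assumptions.
const DEFAULT_PARSE_OPTIONS = {
  encoding: 'utf8',
  parseFeatures: true,
  parseDirectives: false,
  parseComments: false,
  parseSequences: true,
  parseAll: false,
  bufferSize: 1000,
}

// Hypothetical helper: merges user options over the defaults and lets
// parseAll switch every item type on.
function resolveParseOptions(options = {}) {
  const merged = { ...DEFAULT_PARSE_OPTIONS, ...options }
  if (merged.parseAll) {
    merged.parseFeatures = true
    merged.parseDirectives = true
    merged.parseComments = true
    merged.parseSequences = true
  }
  return merged
}

console.log(resolveParseOptions({ parseAll: true }).parseComments) // true
console.log(resolveParseOptions().parseDirectives) // false
```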
parseStream
Parse a stream of text data into a stream of feature, directive, comment, and sequence objects.
options
ParseOptions Parsing options (optional, default {})
Returns GFFTransform stream (in objectMode) of parsed items
parseStringSync
Synchronously parse a string containing GFF3 and return an array of the parsed items.
str
string GFF3 string
inputOptions
({encoding: BufferEncoding?, bufferSize: number?} | undefined)? Parsing options
Returns Array<(GFF3Feature | GFF3Sequence)> array of parsed features, directives, comments and/or sequences
formatSync
Format an array of GFF3 items (features, directives, comments) into a string of GFF3. Does not insert synchronization (###) marks.
items
Array<GFF3Item> Array of features, directives, comments and/or sequences
Returns string the formatted GFF3
formatStream
Format a stream of features, directives, comments and/or sequences into a stream of GFF3 text.
Inserts synchronization (###) marks automatically.
options
FormatOptions parser options (optional, default {})
Returns FormattingTransform
formatFile
Format a stream of features, directives, comments and/or sequences into a GFF3 file and write it to the filesystem.
Inserts synchronization (###) marks and a ##gff-version directive automatically (if one is not already present).
stream
Readable the stream to write to the file
writeStream
Writable
options
FormatOptions parser options (optional, default {})
filename
the file path to write to
Returns Promise<null> promise for null that resolves when the stream has been written
util
There is also a util module that contains super-low-level functions for dealing with lines and parts of lines.
// non-ES6
const util = require('@gmod/gff').default.util
// or, with ES6
import gff from '@gmod/gff'
const util = gff.util
const gff3Lines = util.formatItem({
  seq_id: 'ctgA',
  ...
})
unescape
Unescape a string value used in a GFF3 attribute.
stringVal
string Escaped GFF3 string value
Returns string An unescaped string value
escape
Escape a value for use in a GFF3 attribute value.
Returns string An escaped string value
escapeColumn
Escape a value for use in a GFF3 column value.
Returns string An escaped column value
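GFF3 escaping is percent-encoding of a small set of reserved characters. The sketch below illustrates the idea for attribute values; sketchEscape and sketchUnescape are hypothetical names, and the exact character set the library escapes is an assumption here (control characters, %, and the attribute separators ; = & ,).

```javascript
// Hypothetical sketch: percent-encode control characters, "%" and the
// characters that have special meaning inside the attributes column.
function sketchEscape(value) {
  return String(value).replace(
    /[\x00-\x1f\x7f%;=&,]/g,
    (ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0'),
  )
}

// Hypothetical sketch: turn every %XX sequence back into its character.
function sketchUnescape(value) {
  return String(value).replace(/%([0-9A-Fa-f]{2})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16)),
  )
}

const escapedNote = sketchEscape('kinase; binds ATP, Mg=2')
console.log(escapedNote) // kinase%3B binds ATP%2C Mg%3D2
console.log(sketchUnescape(escapedNote)) // kinase; binds ATP, Mg=2
```

Escaping % itself keeps the round trip lossless, since a literal "%3B" in the input becomes "%253B" rather than being mistaken for an escaped semicolon.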
parseAttributes
Parse the 9th column (attributes) of a GFF3 feature line.
attrString
string String of GFF3 9th column
Returns GFF3Attributes Parsed attributes
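As a rough illustration of what parsing the 9th column involves, the sketch below splits tag=value pairs on ";" and multiple values on ",", then unescapes each piece. sketchParseAttributes is a hypothetical function, not the library's implementation; its output shape follows the attributes objects shown earlier in this README.

```javascript
// Hypothetical, simplified parser for the GFF3 attributes column.
function sketchParseAttributes(attrString) {
  const attrs = {}
  const text = (attrString || '').replace(/\r?\n$/, '')
  if (text === '' || text === '.') return attrs
  const decode = (s) =>
    s.replace(/%([0-9A-Fa-f]{2})/g, (_, hex) =>
      String.fromCharCode(parseInt(hex, 16)),
    )
  for (const pair of text.split(';')) {
    const eq = pair.indexOf('=')
    if (eq <= 0) continue // skip empty or malformed pairs
    const tag = decode(pair.slice(0, eq).trim())
    const values = pair
      .slice(eq + 1)
      .split(',')
      .map((v) => decode(v.trim()))
    // Every tag maps to an array, even when it has a single value.
    if (!Object.prototype.hasOwnProperty.call(attrs, tag)) attrs[tag] = []
    attrs[tag].push(...values)
  }
  return attrs
}

console.log(sketchParseAttributes('ID=gene00001;Name=EDEN'))
console.log(sketchParseAttributes('Parent=mRNA1,mRNA2;Note=a%3Bb'))
```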
parseFeature
Parse a GFF3 feature line.
line
string GFF3 feature line
Returns GFF3FeatureLine The parsed feature
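A feature line is nine tab-separated columns. The following sketch shows the column-to-field mapping used in the example output earlier in this README, with . turned into null and coordinates turned into numbers. sketchParseFeatureLine is a hypothetical function; for brevity it leaves the 9th column unparsed under the made-up key rawAttributes and does not unescape column values.

```javascript
// Hypothetical, simplified feature-line parser (first eight columns only).
function sketchParseFeatureLine(line) {
  const columns = line.replace(/\r?\n$/, '').split('\t')
  const orNull = (v) => (v === undefined || v === '' || v === '.' ? null : v)
  const toNumber = (v) => (orNull(v) === null ? null : Number(v))
  const [seq_id, source, type, start, end, score, strand, phase, attrs] =
    columns
  return {
    seq_id: orNull(seq_id),
    source: orNull(source),
    type: orNull(type),
    start: toNumber(start),
    end: toNumber(end),
    score: toNumber(score),
    strand: orNull(strand),
    phase: orNull(phase), // kept as a string, as in the examples above
    rawAttributes: orNull(attrs), // left unparsed in this sketch
  }
}

const geneLine = sketchParseFeatureLine(
  'ctg123\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene00001;Name=EDEN\n',
)
console.log(geneLine.start, geneLine.end, geneLine.source) // 1000 9000 null
```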
parseDirective
Parse a GFF3 directive line.
line
string GFF3 directive line
Returns (GFF3Directive | GFF3SequenceRegionDirective | GFF3GenomeBuildDirective | null) The parsed directive
formatAttributes
Format an attributes object into a string suitable for the 9th column of GFF3.
attrs
GFF3Attributes Attributes
Returns string GFF3 9th column string
formatFeature
Format a feature object or array of feature objects into one or more lines of GFF3.
featureOrFeatures
(GFF3FeatureLine | GFF3FeatureLineWithRefs | Array<(GFF3FeatureLine | GFF3FeatureLineWithRefs)>) A feature object or array of feature objects
Returns string A string of one or more GFF3 lines
formatDirective
Format a directive into a line of GFF3.
directive
GFF3Directive A directive object
Returns string A directive line string
formatComment
Format a comment into a GFF3 comment. Yes I know this is just adding a # and a newline.
comment
GFF3Comment A comment object
Returns string A comment line string
formatSequence
Format a sequence object as FASTA.
seq
GFF3Sequence A sequence object
Returns string Formatted single FASTA sequence string
formatItem
Format a directive, comment, sequence, or feature, or array of such items, into one or more lines of GFF3.
itemOrItems
(GFF3FeatureLineWithRefs | GFF3Directive | GFF3Comment | GFF3Sequence | Array<(GFF3FeatureLineWithRefs | GFF3Directive | GFF3Comment | GFF3Sequence)>) A comment, sequence, or feature, or array of such items
Returns (string | Array<string>) A formatted string or array of strings
GFF3Attributes
A record of GFF3 attribute identifiers and the values of those identifiers
Type: Record<string, (Array<string> | undefined)>
GFF3FeatureLine
A representation of a single line of a GFF3 file
seq_id
The ID of the landmark used to establish the coordinate system for the current feature
Type: (string | null)
source
A free text qualifier intended to describe the algorithm or operating procedure that generated this feature
Type: (string | null)
type
The type of the feature
Type: (string | null)
start
The start coordinate of the feature
Type: (number | null)
end
The end coordinate of the feature
Type: (number | null)
score
The score of the feature
Type: (number | null)
strand
The strand of the feature
Type: (string | null)
phase
For features of type "CDS", the phase indicates where the next codon begins relative to the 5' end of the current CDS feature
Type: (string | null)
attributes
Feature attributes
Type: (GFF3Attributes | null)
GFF3FeatureLineWithRefs
Extends GFF3FeatureLine
A GFF3 feature line that includes references to other features defined in their "Parent" or "Derives_from" attributes
child_features
An array of child features
Type: Array<GFF3Feature>
derived_features
An array of features derived from this feature
Type: Array<GFF3Feature>
GFF3Feature
A GFF3 feature, which may include multiple individual feature lines
Type: Array<GFF3FeatureLineWithRefs>
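Since a feature is an array of lines and each line carries an array of child features, a feature hierarchy can be walked recursively. The sketch below is a hypothetical helper (outlineFeature is not part of the library) that lists feature types with indentation by depth; the sample data is made up for illustration and only includes the fields the helper reads.

```javascript
// Hypothetical helper: returns one indented line of text per feature
// line, descending into child_features.
function outlineFeature(feature, depth = 0, out = []) {
  for (const line of feature) {
    out.push('  '.repeat(depth) + line.type)
    for (const child of line.child_features || []) {
      outlineFeature(child, depth + 1, out)
    }
  }
  return out
}

// Made-up sample: a gene with one mRNA that has an exon and a CDS.
const sampleGene = [
  {
    type: 'gene',
    child_features: [
      [
        {
          type: 'mRNA',
          child_features: [
            [{ type: 'exon', child_features: [] }],
            [{ type: 'CDS', child_features: [] }],
          ],
        },
      ],
    ],
  },
]
console.log(outlineFeature(sampleGene).join('\n'))
```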
GFF3Directive
A GFF3 directive
directive
The name of the directive
Type: string
value
The string value of the directive
Type: string
GFF3SequenceRegionDirective
Extends GFF3Directive
A GFF3 sequence-region directive
value
The string value of the directive
Type: string
seq_id
The sequence ID parsed from the directive
Type: string
start
The sequence start parsed from the directive
Type: string
end
The sequence end parsed from the directive
Type: string
GFF3GenomeBuildDirective
Extends GFF3Directive
A GFF3 genome-build directive
value
The string value of the directive
Type: string
The genome build source parsed from the directive
Type: string
The genome build name parsed from the directive
Type: string
GFF3Comment
A GFF3 comment
comment
The text of the comment
Type: string
GFF3Sequence
A GFF3 FASTA single sequence
id
The ID of the sequence
Type: string
description
The description of the sequence
Type: string
sequence
The sequence
Type: string
MIT © Robert Buels