@gmod/cram

Read CRAM files with pure JavaScript

  • 1.0.0

Read CRAM files (indexed or unindexed) with pure JavaScript. Works in Node.js or in the browser.

  • Reads CRAM 3.x and 2.x
  • Does not read CRAM 1.x
  • Can use .crai indexes out of the box for efficient sequence fetching, and also has an index API that allows use with other index types

Install

$ npm install --save @gmod/cram
# or
$ yarn add @gmod/cram

Usage

const { IndexedCramFile, CramFile, CraiIndex } = require('@gmod/cram')

// open local files
const indexedFile = new IndexedCramFile({
  cramPath: require.resolve(`./data/ce#5.tmp.cram`),
  index: new CraiIndex({
    path: require.resolve(`./data/ce#5.tmp.cram.crai`),
  }),
  seqFetch: async (seqId, start, end) => {
    // seqFetch should return a promise for a string.
    // this one just returns a fake sequence of all A's of the proper length
    let fakeSeq = ''
    for (let i = start; i <= end; i += 1) {
      fakeSeq += 'A'
    }
    return fakeSeq
  },
  checkSequenceMD5: false,
})

// example of fetching records from an indexed CRAM file.
// NOTE: only numeric IDs for the reference sequence are accepted.
// `await` is only valid inside an async function here, so wrap the calls.
async function printSubstitutions() {
  const records = await indexedFile.getRecordsForRange(0, 10000, 20000)
  records.forEach(record => {
    console.log(`got a record named ${record.readName}`)
    record.readFeatures.forEach(({ code, refPos, ref, sub }) => {
      if (code === 'X') {
        console.log(
          `${record.readName} shows a base substitution of ${ref}->${sub} at ${refPos}`,
        )
      }
    })
  })
}

printSubstitutions().catch(console.error)

// can also pass `cramUrl` and `url` params to open remote URLs
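
The seqFetch callback in the usage example returns placeholder bases. A minimal sketch of a callback backed by in-memory reference strings might look like the following. The names `sliceReference`, `makeSeqFetch`, and `references` are hypothetical and not part of @gmod/cram, and the sketch assumes seqFetch receives 1-based closed coordinates, consistent with the inclusive loop in the usage example.

```javascript
// Hypothetical helpers, not part of @gmod/cram.
// Slice a reference string using 1-based closed coordinates (an assumption).
function sliceReference(sequence, start, end) {
  // String#slice is 0-based and end-exclusive, so only the start shifts by one.
  return sequence.slice(start - 1, end)
}

// Build a seqFetch-style callback from an array of reference strings,
// indexed by numeric reference sequence ID.
function makeSeqFetch(references) {
  return async (seqId, start, end) => {
    const sequence = references[seqId]
    if (sequence === undefined) {
      throw new Error(`no reference sequence with ID ${seqId}`)
    }
    return sliceReference(sequence, start, end)
  }
}

const seqFetch = makeSeqFetch(['ACGTACGTAC'])
seqFetch(0, 2, 4).then(bases => console.log(bases)) // logs "CGT"
```

A callback built this way could be passed as the `seqFetch` argument in place of the fake all-A version, provided the whole reference fits in memory.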

API (auto-generated)

CramRecord

These are the record objects returned by this API. Much of the data is stored in them as simple object entries, but there are some accessor methods used for conveniently getting the values of each of the flags in the flags and cramFlags fields.

Static fields
  • flags (number): the SAM bit-flags field, see the SAM spec for interpretation. Some of the is* methods below interpret this field.
  • cramFlags (number): the CRAM-specific bit-flags field, see the CRAM spec for interpretation. Some of the is* methods below interpret this field.
  • sequenceId (number): the ID number of the record's reference sequence
  • readLength (number): length of the read in bases
  • alignmentStart (number): start coordinate of the alignment on the reference in 1-based closed coordinates
  • readGroupId (number): ID number of the read group, or -1 if none
  • readName (string): name of the read
  • templateSize (number): for paired sequencing, the total size of the template
  • readFeatures (array[ReadFeature]): array of read features showing insertions, deletions, mismatches, etc. See ReadFeatures for their format.
  • lengthOnRef (number): span of the alignment along the reference sequence
  • mappingQuality (number): SAM mapping quality
  • qualityScores (array[number]): array of numeric quality scores
  • uniqueId (number): unique ID number of the record within the file
  • mate (object)
    • flags (number): CRAM mapping flags for the mate. See CRAM spec for interpretation. Some of the is* methods below interpret this field.
    • sequenceId (number): reference sequence ID for the mate mapping
    • alignmentStart (number): start coordinate of the mate mapping. 1-based coordinates.
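
As an illustration of how the coordinate fields above combine, the last reference position covered by a record can be derived from alignmentStart and lengthOnRef. This is a sketch on plain objects shaped like the documented fields; `alignmentEnd` and `overlapsRange` are hypothetical helper names, and the sketch assumes both fields are defined (which may not hold for unmapped reads).

```javascript
// Hypothetical helpers operating on objects with the documented
// alignmentStart and lengthOnRef fields. Coordinates are 1-based and closed.
function alignmentEnd(record) {
  // A span of N bases starting at S covers positions S .. S + N - 1.
  return record.alignmentStart + record.lengthOnRef - 1
}

// True if the record's alignment shares at least one position with [start, end].
function overlapsRange(record, start, end) {
  return record.alignmentStart <= end && alignmentEnd(record) >= start
}

// Mock record for illustration only.
const example = { alignmentStart: 100, lengthOnRef: 50 }
console.log(alignmentEnd(example)) // 149
console.log(overlapsRange(example, 149, 200)) // true
console.log(overlapsRange(example, 150, 200)) // false
```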
Methods
isPaired

Returns boolean true if the read is paired, regardless of whether both segments are mapped

isProperlyPaired

Returns boolean true if the read is paired, and both segments are mapped

isSegmentUnmapped

Returns boolean true if the read itself is unmapped; incompatible with isProperlyPaired

isMateUnmapped

Returns boolean true if the read's mate is unmapped; incompatible with isProperlyPaired

isReverseComplemented

Returns boolean true if the read is mapped to the reverse strand

isMateReverseComplemented

Returns boolean true if the mate is mapped to the reverse strand

isRead1

Returns boolean true if this is read number 1 in a pair

isRead2

Returns boolean true if this is read number 2 in a pair

isSecondary

Returns boolean true if this is a secondary alignment

isFailedQc

Returns boolean true if this read has failed QC checks

isDuplicate

Returns boolean true if the read is an optical or PCR duplicate

isSupplementary

Returns boolean true if this is a supplementary alignment

isDetached

Returns boolean true if the read is detached

hasMateDownStream

Returns boolean true if the read has a mate in this same CRAM segment

isPreservingQualityScores

Returns boolean true if the read contains quality scores

isUnknownBases

Returns boolean true if the read has no sequence bases

addReferenceSequence

Annotates this feature with the given reference region. Right now, this only uses the reference sequence to decode which bases are being substituted in base substitution features.

Parameters

  • refRegion object
  • compressionScheme CramContainerCompressionScheme

Returns undefined
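
The is* accessors on CramRecord that interpret the flags field can be understood as bit tests against the SAM FLAG values. The sketch below decodes a raw flags number directly; it uses the bit values from the SAM specification, but `SAM_FLAGS` and `describeSamFlags` are hypothetical names and this is not necessarily how the library implements its accessors.

```javascript
// Bit values of the SAM FLAG field, as defined in the SAM specification.
// The property names are chosen here to echo the is* accessor names.
const SAM_FLAGS = {
  paired: 0x1,
  properlyPaired: 0x2,
  segmentUnmapped: 0x4,
  mateUnmapped: 0x8,
  reverseComplemented: 0x10,
  mateReverseComplemented: 0x20,
  read1: 0x40,
  read2: 0x80,
  secondary: 0x100,
  failedQc: 0x200,
  duplicate: 0x400,
  supplementary: 0x800,
}

// Return the names of all flags set in the given number.
function describeSamFlags(flags) {
  return Object.keys(SAM_FLAGS).filter(name => (flags & SAM_FLAGS[name]) !== 0)
}

console.log(describeSamFlags(99))
// [ 'paired', 'properlyPaired', 'mateReverseComplemented', 'read1' ]
```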

ReadFeatures

The feature objects appearing in the readFeatures member of CramRecord objects that show insertions, deletions, substitutions, etc.

Static fields
  • code (character): One of "bqBXIDiQNSPH". See page 15 of the CRAM v3 spec for their meanings.
  • data (any): the data associated with the feature. The format of this varies depending on the feature code.
  • pos (number): location relative to the read (1-based)
  • refPos (number): location relative to the reference (1-based)
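
One way to get a quick overview of a record's read features is to tally them by code. The following sketch works on a mock object shaped like the fields listed above; `countFeatureCodes` is a hypothetical helper and the mock values are made up for illustration.

```javascript
// Hypothetical helper: count how many read features of each code a record has.
function countFeatureCodes(record) {
  const counts = {}
  // Tolerate records that carry no readFeatures array.
  for (const feature of record.readFeatures || []) {
    counts[feature.code] = (counts[feature.code] || 0) + 1
  }
  return counts
}

// Mock record for illustration only: two substitutions (X) and one insertion (I).
const mockRecord = {
  readFeatures: [
    { code: 'X', pos: 5, refPos: 104 },
    { code: 'I', pos: 9, refPos: 108, data: 'GG' },
    { code: 'X', pos: 20, refPos: 117 },
  ],
}

console.log(countFeatureCodes(mockRecord)) // { X: 2, I: 1 }
```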

IndexedCramFile

The pairing of an index and a CramFile. Supports efficient fetching of records for sections of reference sequences.

constructor

Parameters

  • args object
    • args.cram CramFile
    • args.index Index-like object that supports getEntriesForRange(seqId,start,end) -> Promise[Array[index entries]]
    • args.cacheSize number? optional maximum number of CRAM records to cache. default 20,000
    • args.fetchSizeLimit number? optional maximum number of bytes to fetch in a single getRecordsForRange call. Default 3 MiB.
    • args.checkSequenceMD5 boolean? default true. if false, disables verifying the MD5 checksum of the reference sequence underlying a slice. In some applications, this check can cause an inconvenient amount (many megabases) of sequences to be fetched.
getRecordsForRange

Parameters

  • seq number numeric ID of the reference sequence
  • start number start of the range of interest. 1-based closed coordinates.
  • end number end of the range of interest. 1-based closed coordinates.
hasDataForReferenceSequence

Parameters

Returns Promise true if the CRAM file contains data for the given reference sequence numerical ID

CramFile

constructor

Parameters

  • args object
    • args.filehandle object? a filehandle that implements the stat() and read() methods of the Node filehandle API https://nodejs.org/api/fs.html#fs_class_filehandle
    • args.path object? path to the cram file
    • args.url object? url for the cram file. also supports file:// urls for local files
    • args.seqFetch function? a function with signature (seqId, startCoordinate, endCoordinate) that returns a promise for a string of sequence bases
    • args.cacheSize number? optional maximum number of CRAM records to cache. default 20,000
    • args.checkSequenceMD5 boolean? default true. if false, disables verifying the MD5 checksum of the reference sequence underlying a slice. In some applications, this check can cause an inconvenient amount (many megabases) of sequences to be fetched.
containerCount

Returns number the number of containers in the file

CraiIndex

Represents a .crai index.

constructor

Parameters

hasDataForReferenceSequence

Parameters

Returns Promise true if the index contains entries for the given reference sequence ID, false otherwise

getEntriesForRange

fetch index entries for the given range

Parameters

Returns Promise promise for an array of objects of the form {start, span, containerStart, sliceStart, sliceBytes }
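
Since IndexedCramFile accepts any index-like object that supports getEntriesForRange, a custom index could in principle be supplied in place of CraiIndex. Below is a minimal sketch of such an object holding its entries in memory. `InMemoryIndex` and its `overlapping` method are hypothetical, the entry values would have to come from a real index, and the overlap test assumes that `start` is a 1-based position and `span` a length in bases.

```javascript
// Hypothetical index-like class, not part of @gmod/cram.
class InMemoryIndex {
  // entriesBySeqId maps a numeric reference sequence ID to an array of
  // { start, span, containerStart, sliceStart, sliceBytes } objects.
  constructor(entriesBySeqId) {
    this.entriesBySeqId = entriesBySeqId
  }

  // Synchronous core: entries whose [start, start + span - 1] interval
  // intersects the closed query interval [queryStart, queryEnd].
  overlapping(seqId, queryStart, queryEnd) {
    const entries = this.entriesBySeqId[seqId] || []
    return entries.filter(
      e => e.start <= queryEnd && e.start + e.span - 1 >= queryStart,
    )
  }

  async hasDataForReferenceSequence(seqId) {
    return (this.entriesBySeqId[seqId] || []).length > 0
  }

  async getEntriesForRange(seqId, queryStart, queryEnd) {
    return this.overlapping(seqId, queryStart, queryEnd)
  }
}

// Made-up entries for illustration only.
const index = new InMemoryIndex({
  0: [
    { start: 1, span: 1000, containerStart: 100, sliceStart: 20, sliceBytes: 500 },
    { start: 1001, span: 1000, containerStart: 700, sliceStart: 20, sliceBytes: 480 },
  ],
})
index.getEntriesForRange(0, 900, 1100).then(found => console.log(found.length)) // logs 2
```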

Error Classes

@gmod/cram/errors contains some special error classes thrown by cram-js. A list of the error classes is below.

CramUnimplementedError

Extends Error

Error caused by encountering a part of the CRAM spec that has not yet been implemented

CramMalformedError

Extends CramError

An error caused by malformed data.

CramBufferOverrunError

Extends CramMalformedError

An error caused by attempting to read beyond the end of the defined data.

CramSizeLimitError

Extends CramError

An error caused by data being too big, exceeding a size limit.

CramArgumentError

Extends CramError

An invalid argument was supplied to a cram-js method or object.
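
Because these classes form an inheritance hierarchy, instanceof checks should test subclasses before their parents. The sketch below defines stand-in classes that mirror the hierarchy listed above so that it runs without the package installed; in real code the classes would presumably be imported from @gmod/cram/errors instead. `describeCramError` is a hypothetical helper.

```javascript
// Stand-in classes mirroring the documented hierarchy (illustration only).
class CramError extends Error {}
class CramMalformedError extends CramError {}
class CramBufferOverrunError extends CramMalformedError {}
class CramSizeLimitError extends CramError {}
class CramArgumentError extends CramError {}
class CramUnimplementedError extends Error {}

// Hypothetical helper: map an error to a short, user-facing description.
function describeCramError(err) {
  // Most specific class first, since a CramBufferOverrunError is also
  // an instance of CramMalformedError and CramError.
  if (err instanceof CramBufferOverrunError) return 'data ended unexpectedly'
  if (err instanceof CramMalformedError) return 'malformed data'
  if (err instanceof CramSizeLimitError) return 'size limit exceeded'
  if (err instanceof CramArgumentError) return 'invalid argument'
  if (err instanceof CramUnimplementedError) return 'unsupported CRAM feature'
  return 'unexpected error'
}

console.log(describeCramError(new CramBufferOverrunError())) // data ended unexpectedly
console.log(describeCramError(new CramSizeLimitError())) // size limit exceeded
```

A CramSizeLimitError is a plausible signal to retry with a smaller range or a larger fetchSizeLimit, whereas malformed-data errors usually indicate a damaged or mismatched file.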

License

MIT © Robert Buels

Package last updated on 14 Jul 2018