CodonTableParser
Parses the NCBI genetic code table with a multiline regular expression and generates hashes of each species' name, start codons, stop codons and codon table.
The output can be easily customized and used to update the respective constants of BioRuby's CodonTable class whenever the original data changes.
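As a rough illustration of the parsing idea, the following sketch extracts one record with a multiline regular expression and derives the codon table, start codons and stop codons from it. The sample record is built inside the script and only assumes the general layout of the NCBI file (name, id, ncbieaa, sncbieaa and Base1 to Base3 lines). ENTRY_PATTERN and all variable names are hypothetical and not taken from the gem's source.

```ruby
# Build a sample record shaped like one entry of the NCBI genetic code file.
# The layout is an assumption for illustration; the real file holds many records.
bases = %w[T C A G]
base1 = bases.map { |b| b * 16 }.join        # 16 x T, 16 x C, 16 x A, 16 x G
base2 = bases.map { |b| b * 4 }.join * 4     # TTTTCCCCAAAAGGGG, four times
base3 = bases.join * 16                      # TCAG, sixteen times
amino = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
flags = '-' * 64
[3, 19, 35].each { |i| flags[i] = 'M' }      # mark ttg, ctg and atg as start codons

sample = <<~ENTRY
  {
    name "Standard" ,
    id 1 ,
    ncbieaa  "#{amino}",
    sncbieaa "#{flags}"
    -- Base1  #{base1}
    -- Base2  #{base2}
    -- Base3  #{base3}
  }
ENTRY

# Hypothetical pattern: /m lets "." span line breaks, /x permits the layout below.
ENTRY_PATTERN = /
  name\s+"(?<name>[^"]+)" .*?
  id\s+(?<id>\d+) .*?
  ncbieaa\s+"(?<amino>[^"]+)" .*?
  sncbieaa\s+"(?<flags>[^"]+)" .*?
  Base1\s+(?<b1>[TCAG]+) .*?
  Base2\s+(?<b2>[TCAG]+) .*?
  Base3\s+(?<b3>[TCAG]+)
/mx

m = ENTRY_PATTERN.match(sample)

# Combine the three base lines column by column into 64 lowercase codons.
codons = m[:b1].chars.zip(m[:b2].chars, m[:b3].chars).map { |triplet| triplet.join.downcase }

table  = codons.zip(m[:amino].chars).to_h    # codon => amino acid
starts = codons.zip(m[:flags].chars).select { |_, flag| flag == 'M' }.map(&:first)
stops  = table.select { |_, aa| aa == '*' }.keys

puts "#{m[:id]} #{m[:name]}"
p starts
p stops
```

Named captures keep the pattern readable, and zipping the three base lines avoids hard-coding the 64 codons.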
Installation
$ gem install codon_table_parser
Usage
Without any parameters, the genetic code file is downloaded directly from the NCBI web site:
parser = CodonTableParser.new
Alternatively, the genetic code file can be loaded from a path:
file = 'path/to/genetic_code.txt'
parser = CodonTableParser.new(file)
The first line of the file is read to determine whether the content is correct. If it is not, an exception is raised:
wrong_content = 'path/to/wrong_content.txt'
parser = CodonTableParser.new(wrong_content)
Instance Methods
The following instance methods are available:
- CodonTableParser#definitions
- CodonTableParser#starts
- CodonTableParser#stops
- CodonTableParser#tables
- CodonTableParser#bundle
Every instance method accepts a :range option that specifies the ids of the species to include in the output.
A range is specified as an array of integers, Ranges or both.
Example:
:range => [(1..3), 5, 9]
Ids not present in the original data are ignored.
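One way such a range specification could be resolved is sketched below. The helper name expand_ids and the list of available ids are hypothetical and only serve the illustration; the gem's internal handling may differ.

```ruby
# Hypothetical helper: flattens a mix of Integers and Ranges into a list of ids
# and keeps only those that exist in the parsed data.
def expand_ids(range_spec, available_ids)
  requested = range_spec.flat_map { |item| item.is_a?(Range) ? item.to_a : [item] }
  requested.uniq & available_ids
end

available = [1, 2, 3, 4, 5, 6, 9, 10]    # illustrative ids, not the full NCBI list

p expand_ids([(1..3), 5, 9], available)  # ids 1, 2, 3, 5 and 9 are kept
p expand_ids([(5..8), 42], available)    # 7, 8 and 42 are absent and therefore dropped
```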
Besides the :range option, several methods also take other options as demonstrated below.
CodonTableParser#definitions
parser = CodonTableParser.new
definitions = parser.definitions
definitions
definitions = parser.definitions :range => [(1..3), 5, 9]
definitions = parser.definitions :names => {1 => "Standard (Eukaryote)",
3 => "Yeast Mitochondorial"}
definitions[1]
definitions[3]
parser.definitions :range => [(1..3), 5, 9],
:names => {1 => "Standard (Eukaryote)",
3 => "Yeast Mitochondorial"}
CodonTableParser#starts
parser = CodonTableParser.new
start_codons = parser.starts
start_codons
start_codons = parser.starts :range => [(1..3), 5, 9]
start_codons = parser.starts 1 => {:add => ['gtg']},
13 => {:remove => ['ttg', 'ata', 'gtg']}
start_codons[1]
start_codons[13]
start_codons = parser.starts :starts => {1 => {:add => ['gtg']},
13 => {:remove => ['ttg', 'ata', 'gtg']}}
start_codons = parser.starts :range => [(1..3), 13],
1 => {:add => ['gtg']},
13 => {:remove => ['ttg', 'ata', 'gtg']}
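The effect of :add and :remove can be pictured with the following sketch. It works on a small, hand-written hash of start codons. apply_codon_edits is a hypothetical helper, and the order of operations (additions first, then removals) is an assumption rather than documented behaviour of the gem.

```ruby
# Hypothetical helper: applies per-id :add and :remove lists to a hash of codons
# and returns a new hash, leaving the input untouched.
def apply_codon_edits(codons_by_id, edits)
  codons_by_id.each_with_object({}) do |(id, codons), result|
    edit  = edits.fetch(id, {})
    added = (codons + edit.fetch(:add, [])).uniq
    result[id] = added - edit.fetch(:remove, [])
  end
end

# Sample input for illustration only.
starts = {
  1  => %w[ttg ctg atg],
  13 => %w[ttg ata atg gtg]
}

edits = {
  1  => { :add => ['gtg'] },
  13 => { :remove => %w[ttg ata gtg] }
}

edited = apply_codon_edits(starts, edits)
p edited[1]
p edited[13]
```

The same shape of options is used for stop codons, so a helper like this could serve both methods.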
CodonTableParser#stops
parser = CodonTableParser.new
stop_codons = parser.stops
stop_codons
stop_codons = parser.stops :range => [(1..3), 5, 9]
stop_codons = parser.stops 1 => {:add => ['gtg'], :remove => ['taa']},
13 => {:add => ['gcc'], :remove => ['taa', 'tag']}
stop_codons[1]
stop_codons[13]
stop_codons = parser.stops :stops => {1 => {:add => ['gtg'], :remove => ['taa']},
13 => {:add => ['gcc'], :remove => ['taa', 'tag']}}
stop_codons = parser.stops :range => [(1..3), 5, 13],
1 => {:add => ['gtg'], :remove => ['taa']},
13 => {:add => ['gcc'], :remove => ['taa', 'tag']}
CodonTableParser#tables
parser = CodonTableParser.new
codon_tables = parser.tables
codon_tables
codon_tables = parser.tables :range => [(1..3), 5, 9, 23]
CodonTableParser#bundle
parser = CodonTableParser.new
bundle = parser.bundle
bundle
The bundle method accepts all options from the methods described above, that is:
- :range (applied to all methods)
- :names (applied to the definitions method)
- :starts (applied to the starts method)
- :stops (applied to the stops method)
To return the same values that are assigned to the constants DEFINITIONS, STARTS, STOPS, and TABLES of BioRuby's CodonTable class, call bundle with the following options:
bundle = parser.bundle :names => {1 => "Standard (Eukaryote)",
4 => "Mold, Protozoan, Coelenterate Mitochondrial and Mycoplasma/Spiroplasma",
3 => "Yeast Mitochondorial",
6 => "Ciliate Macronuclear and Dasycladacean",
9 => "Echinoderm Mitochondrial",
11 => "Bacteria",
14 => "Flatworm Mitochondrial",
22 => "Scenedesmus obliquus mitochondrial"},
:starts => {1 => {:add => ['gtg']},
13 => {:remove => ['ttg', 'ata', 'gtg']}}