dna-parser is a Python library written in Rust to encode (or perform feature extraction on) DNA/RNA sequences for machine learning.
To install dna-parser simply run:
pip install dna-parser
If no Python wheel is available for your OS, you can install Rust and then re-install dna-parser, which should now compile on your machine. Run the following command on a Unix-like OS to install Rust:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
or see more options at https://www.rust-lang.org/tools/install.
import dna_parser
# load both metadata and sequences in tuples (metadata, sequences)
metadata_and_sequences = dna_parser.load_fasta("path/to/fasta/file")
# load sequences only
sequences = dna_parser.seq_from_fasta("path/to/fasta/file")
# load metadata only
metadata = dna_parser.metadata_from_fasta("path/to/fasta/file")
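To make the FASTA layout concrete, here is a minimal pure-Python sketch of reading header and sequence pairs. It is not dna-parser's implementation: `read_fasta` is a hypothetical helper, and returning a list of (metadata, sequence) pairs is an assumption about the layout, which the text above does not fully specify.

```python
from io import StringIO


def read_fasta(handle):
    """Return a list of (metadata, sequence) tuples from a FASTA text stream.

    Hypothetical helper for illustration only; not part of dna-parser.
    """
    records = []
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may span several lines; collect the pieces.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = StringIO(">seq1 sample\nACGT\nTTGA\n>seq2\nGGCC\n")
records = read_fasta(example)
```

Here the header text after `>` plays the role of the metadata, and the wrapped sequence lines are joined into a single string per record.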
Currently, only ordinal encoding, one-hot encoding, cross encoding and Term Frequency Inverse Document Frequency (TF-IDF) are supported.
Nucleotides are currently encoded as follows:
# returns a list of 1D numpy arrays representing the encoding
encoding = dna_parser.ordinal_encoding(sequences, pad_type, pad_length, n_jobs)
Function Arguments:
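As an illustration of the general idea behind ordinal encoding, the sketch below maps each nucleotide to a single number. The mapping values, the `ordinal_encode` helper and the handling of unknown symbols are all assumptions made for this example; they are not taken from dna-parser, and padding is not modelled. Plain lists stand in for the numpy arrays the library returns.

```python
# Hypothetical mapping for illustration only; dna-parser's actual values
# are not reproduced here.
ORDINAL_MAP = {"A": 0.25, "C": 0.5, "G": 0.75, "T": 1.0}


def ordinal_encode(seq, unknown=0.0):
    """Encode one sequence as a flat list with one number per nucleotide."""
    return [ORDINAL_MAP.get(base, unknown) for base in seq.upper()]


# One 1D encoding per input sequence.
encoded = [ordinal_encode(s) for s in ["ACGT", "ttn"]]
```

Each sequence yields a one-dimensional result whose length equals the sequence length, which is why sequences of different lengths usually need padding before being stacked into a single array.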
Nucleotides are currently encoded as follows:
# returns a list of 2D numpy arrays representing the encoding
encoding = dna_parser.onehot_encoding(sequences, pad_type, pad_length, n_jobs)
Function Arguments:
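The general shape of a one-hot encoding can be sketched as follows. The column order `ACGT`, the all-zero row for unknown symbols and the `onehot_encode` helper are assumptions for this example rather than dna-parser's documented behaviour, and nested lists stand in for 2D numpy arrays.

```python
ALPHABET = "ACGT"  # assumed column order, for illustration only


def onehot_encode(seq):
    """Return one row per nucleotide, with a single 1 in the matching column."""
    rows = []
    for base in seq.upper():
        # Symbols outside ALPHABET (for example N) produce an all-zero row.
        rows.append([1 if base == letter else 0 for letter in ALPHABET])
    return rows


matrix = onehot_encode("ACGN")
```

The result has shape (sequence length, alphabet size), which matches the 2D output described above.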
Nucleotides are currently encoded as follows:
# returns a list of 2D numpy arrays representing the encoding
encoding = dna_parser.cross_encoding(sequences, pad_type, pad_length, n_jobs)
Function Arguments:
Note that for this function, your sequences need to be split into words (or k-mers), with each word separated by whitespace. To do so, you can use the make_kmers function (see the Other Functions section).
encoding = dna_parser.tfidf_encoding(corpus)
Function Arguments:
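For readers unfamiliar with TF-IDF, the sketch below computes one common textbook variant (term frequency times `log(N / df)`) over whitespace-separated k-mers. The `tfidf` helper is hypothetical, and dna-parser may use a different weighting, smoothing or normalisation.

```python
import math
from collections import Counter


def tfidf(corpus):
    """Return (vocabulary, matrix) for a list of whitespace-separated documents.

    Uses tf = count / document length and idf = log(N / document frequency).
    Illustrative only; not dna-parser's implementation.
    """
    docs = [doc.split() for doc in corpus]
    n_docs = len(docs)
    doc_freq = Counter()
    for words in docs:
        doc_freq.update(set(words))  # count each word once per document
    vocab = sorted(doc_freq)
    matrix = []
    for words in docs:
        counts = Counter(words)
        total = len(words)
        row = [
            (counts[w] / total) * math.log(n_docs / doc_freq[w]) if total else 0.0
            for w in vocab
        ]
        matrix.append(row)
    return vocab, matrix


vocab, matrix = tfidf(["ACG TTA ACG", "ACG GGC"])
```

In this variant, a k-mer that occurs in every document (here `ACG`) gets a weight of zero, while k-mers unique to one document are weighted up.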
This function generates random DNA, RNA or amino acid sequences and returns them in a list.
sequences = dna_parser.random_seq(length, nb_of_seq, seq_type, n_jobs)
Function Arguments:
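A rough pure-Python equivalent of random sequence generation is sketched below. The `random_sequences` helper, the `seed` parameter and the `"dna"`, `"rna"` and `"aa"` keys are assumptions made for this example; the values dna-parser accepts for seq_type are not listed here.

```python
import random

# Hypothetical type names and alphabets, for illustration only.
ALPHABETS = {
    "dna": "ACGT",
    "rna": "ACGU",
    "aa": "ACDEFGHIKLMNPQRSTVWY",
}


def random_sequences(length, nb_of_seq, seq_type="dna", seed=None):
    """Return nb_of_seq random strings of the given length."""
    rng = random.Random(seed)  # local generator so the seed is reproducible
    alphabet = ALPHABETS[seq_type]
    return ["".join(rng.choices(alphabet, k=length)) for _ in range(nb_of_seq)]


seqs = random_sequences(10, 3, "rna", seed=0)
```

Passing a seed keeps the generated sequences reproducible between runs, which is useful when the sequences feed into tests or benchmarks.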
This function takes a string and returns a new one with whitespace inserted to form words of length k.
seq_k_mers = dna_parser.make_kmers(seq, k)
Function Arguments:
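One plausible reading of the description above is that the sequence is cut into consecutive, non-overlapping words of length k. The sketch below implements that reading. The `split_into_words` helper is hypothetical, and both the non-overlapping behaviour and the shorter trailing word are assumptions; make_kmers may behave differently.

```python
def split_into_words(seq, k):
    """Insert a space after every k characters of seq.

    Assumes non-overlapping words; a final shorter word is kept as-is.
    Illustrative only; not dna-parser's implementation.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return " ".join(seq[i:i + k] for i in range(0, len(seq), k))


words = split_into_words("ACGTACGTAC", 3)
```

A string in this form, with words separated by whitespace, is the input shape a TF-IDF encoder over k-mers expects.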
FAQs
Unknown package
We found that dna-parser demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.