pytrimal

Cython bindings and Python interface to trimAl, a tool for automated alignment trimming.

  • 0.8.0
  • PyPI

🐍✂️ PytrimAl

Cython bindings and Python interface to trimAl, a tool for automated alignment trimming. Now with SIMD!

⚠️ This package is based on the release candidate of trimAl 2.0, and results may not be consistent across versions or with the trimAl 1.4 results.

🗺️ Overview

PytrimAl is a Python module that provides bindings to trimAl using Cython. It implements a user-friendly, Pythonic interface to use one of the different trimming methods from trimAl and access results directly. It interacts directly with the trimAl internals.

📋 Roadmap

The following features are available or considered for implementation:

  • automatic trimming: Support for trimming alignments using one of the automatic heuristics implemented in trimAl.
  • manual trimming: Support for trimming alignments using manually defined conservation and gap thresholds for each residue position.
  • overlap trimming: Trimming sequences using residue and sequence overlaps to exclude regions with minimal conservation.
  • representative trimming: Select only representative sequences from the alignment, either using a fixed number, or a maximum identity threshold.
  • alignment loading from disk: Load an alignment from disk given a filename.
  • alignment loading from a file-like object: Load an alignment from a Python file object instead of a file on the local filesystem.
  • alignment creation from Python: Create an alignment from a collection of sequences stored in Python strings.
  • alignment formatting to disk: Write an alignment to a file given a filename in one of the supported file formats.
  • alignment formatting to a file-like object: Write an alignment to a file-like object in one of the supported file formats.
  • reverse-translation: Back-translate a protein alignment to align the sequences in genomic space.
  • alternative similarity matrix: Specify an alternative similarity matrix for the alignment (instead of BLOSUM62).
  • similarity matrix creation: Create a similarity matrix from scratch from Python code.
  • windows for manual methods: Use a sliding window for computing statistics in manual methods.
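
To make the idea behind gap-threshold trimming concrete, here is a minimal pure-Python sketch. It is only an illustration of the concept, not trimAl's or PytrimAl's implementation: the function names `gap_free_fraction` and `trim_columns` are hypothetical, and the threshold is assumed to mean "minimum fraction of non-gap residues required to keep a column", which may differ from how trimAl defines its own thresholds.

```python
from typing import List, Sequence

GAP = "-"


def gap_free_fraction(column: Sequence[str]) -> float:
    """Return the fraction of residues in a column that are not gaps."""
    return sum(1 for residue in column if residue != GAP) / len(column)


def trim_columns(sequences: List[str], gap_threshold: float) -> List[str]:
    """Keep only columns whose gap-free fraction is at least `gap_threshold`.

    Hypothetical helper for illustration; the threshold semantics are an
    assumption and are not taken from trimAl.
    """
    if not sequences:
        return []
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # Transpose the alignment so that each tuple is one column.
    columns = list(zip(*sequences))
    kept = [
        index
        for index, column in enumerate(columns)
        if gap_free_fraction(column) >= gap_threshold
    ]
    return ["".join(seq[index] for index in kept) for seq in sequences]


aligned = ["AC-GT", "A--GT", "ACTG-"]
trimmed = trim_columns(aligned, gap_threshold=0.6)
print(trimmed)  # ['ACGT', 'A-GT', 'ACG-']
```

The third column is dropped because only one of its three residues is not a gap. The real trimming methods also take conservation and similarity statistics into account, which this sketch ignores.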

🔧 Installing

PytrimAl is available for all modern Python versions (3.6+), with no external dependencies.

It can be installed directly from PyPI, which hosts some pre-built wheels for the x86-64 architecture (Linux/OSX) and the Aarch64 architecture (Linux only), as well as the code required to compile from source with Cython:

$ pip install pytrimal

Otherwise, pytrimal is also available as a Bioconda package:

$ conda install -c bioconda pytrimal

💡 Example

Let's load an Alignment from a file on the disk, and use the strictplus method to trim it, before printing the TrimmedAlignment as a Clustal block:

from pytrimal import Alignment, AutomaticTrimmer

ali = Alignment.load("pytrimal/tests/data/example.001.AA.clw")
trimmer = AutomaticTrimmer(method="strictplus")

trimmed = trimmer.trim(ali)
for name, seq in zip(trimmed.names, trimmed.sequences):
    print(name.decode().rjust(6), seq)

This should output the following:

Sp8    GIVLVWLFPWNGLQIHMMGII
Sp10   VIMLEWFFAWLGLEINMMVII
Sp26   GLFLAAANAWLGLEINMMAQI
Sp6    GIYLSWYLAWLGLEINMMAII
Sp17   GFLLTWFQLWQGLDLNKMPVF
Sp33   GLHMAWFQAWGGLEINKQAIL

You can then use the dump method to write the trimmed alignment to a file or file-like object. For instance, save the results in PIR format to a file named example.trimmed.pir:

trimmed.dump("example.trimmed.pir", format="pir")

🧶 Thread-safety

Trimmer objects are thread-safe, and the trim method is re-entrant. This means you can batch-process alignments in parallel using a ThreadPool with a single trimmer object:

import glob
import multiprocessing.pool
from pytrimal import Alignment, AutomaticTrimmer

trimmer = AutomaticTrimmer()
alignments = map(Alignment.load, glob.iglob("pytrimal/tests/data/*.fasta"))

with multiprocessing.pool.ThreadPool() as pool:
    trimmed_alignments = pool.map(trimmer.trim, alignments)
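
When processing many files, it can help to keep each result associated with the file it came from. The sketch below shows one way to do that with a thread pool. `trim_all` is a hypothetical helper, not part of PytrimAl; it takes the loading and trimming callables as arguments, so the demonstration at the end can run with simple stand-in functions.

```python
from multiprocessing.pool import ThreadPool
from typing import Callable, Dict, Iterable, Optional, TypeVar

A = TypeVar("A")  # type of a loaded alignment
T = TypeVar("T")  # type of a trimmed alignment


def trim_all(
    paths: Iterable[str],
    load: Callable[[str], A],
    trim: Callable[[A], T],
    threads: Optional[int] = None,
) -> Dict[str, T]:
    """Load and trim every path in a thread pool, keyed by path.

    Hypothetical helper for illustration. `trim` is assumed to be safe to
    call from several threads at once.
    """
    path_list = list(paths)

    def process(path: str) -> T:
        # Loading happens inside the worker, so I/O is parallelized too.
        return trim(load(path))

    with ThreadPool(threads) as pool:
        # ThreadPool.map returns results in the same order as the inputs.
        results = pool.map(process, path_list)
    return dict(zip(path_list, results))


# Demonstration with stand-in callables instead of real alignments.
demo = trim_all(["a.fasta", "b.fasta"], load=str.upper, trim=lambda s: s[:1])
print(demo)  # {'a.fasta': 'A', 'b.fasta': 'B'}
```

With PytrimAl, the same helper could be called as `trim_all(glob.glob("pytrimal/tests/data/*.fasta"), Alignment.load, AutomaticTrimmer().trim)`, reusing a single trimmer across all threads.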

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU @ 1.10GHz, using a single core to time the computation of several statistics, on a variable number of sequences from example.014.AA.EggNOG.COG0591.fasta, an alignment of 3583 sequences and 7287 columns.

[Figure: Benchmarks]

Each graph measures the computation time of a single trimAl statistic (see the Statistics page of the online documentation for more information).

The None curve shows the time using the internal trimAl 2.0 code, the Generic curve shows a generic C implementation with some more optimizations, and the SSE curve shows the time spent using a dedicated class with SIMD implementations of the statistic computation.

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

📋 Changelog

This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.

⚖️ License

This library is provided under the GNU General Public License v3.0. trimAl is developed by the trimAl team and is distributed under the terms of the GPLv3 as well. See vendor/trimal/LICENSE for more information.

This project is in no way affiliated with, sponsored, or otherwise endorsed by the trimAl authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
