This repository attempts to implement a neural net that leverages the transformer architecture to predict peptide properties (retention time and fragmentation).
The documentation currently lives at https://jspaezp.github.io/elfragmentador/ (see the Quickstart guide for usage instructions).
Because we can... Just kidding
The transformer architecture provides several benefits over the standard approach to fragment prediction (LSTM/RNN). On the training side, it allows whole sequences to be computed in parallel, whereas an LSTM must process one element at a time. It also gives the model a better chance to learn the direct interactions between the elements of the sequence.
It also makes the model much more interpretable: the self-attention weights can be visualized over the input to see what the model focuses on while generating a prediction.
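As a rough illustration of both points, here is a minimal single-head self-attention sketch in plain Python. It is not the model used in this project: the two-dimensional embeddings are made-up values, and there are no learned query/key projections. It only shows that every row of the attention matrix is computed independently of the others (so all positions can be processed at once) and that each row can be read as "where this residue is looking".

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_weights(embeddings):
    """Return the (seq_len x seq_len) attention matrix for one head.

    Simplification: queries and keys are the raw embeddings, with no
    learned projections. Each row depends only on its own query and the
    full set of keys, never on the result of a previous position.
    """
    scale = math.sqrt(len(embeddings[0]))
    weights = []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in embeddings
        ]
        weights.append(softmax(scores))
    return weights


# Hypothetical toy embeddings, chosen by hand for illustration only.
TOY_EMBEDDINGS = {"P": [1.0, 0.0], "E": [0.0, 1.0], "K": [1.0, 1.0]}

peptide = "PEPK"
embedded = [TOY_EMBEDDINGS[aa] for aa in peptide]
attention = self_attention_weights(embedded)

# One row per residue: how much it attends to every residue in the peptide.
for aa, row in zip(peptide, attention):
    print(aa, " ".join(f"{w:.2f}" for w in row))
```

Plotting such a matrix as a heatmap over the input sequence is the usual way to inspect what an attention layer focuses on.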
Many elements of this project combine principles shown in the Prosit paper and the Skyline poster, in particular how the peptides and the output fragment ions are encoded.
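To give a feel for what "encoding the peptides and the output fragment ions" can look like, here is a small sketch. It is an assumption for illustration, not the encoding used by this project: the alphabet order, the padding index, the maximum length, and the ordering of the ion slots are all hypothetical choices, and modifications are ignored.

```python
# Hypothetical alphabet order; index 0 is reserved for padding.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode_peptide(sequence, max_length=30):
    """Map a peptide to a fixed-length list of integer indices."""
    if len(sequence) > max_length:
        raise ValueError(f"peptide longer than {max_length} residues")
    indices = [AA_TO_INDEX[aa] for aa in sequence]
    return indices + [0] * (max_length - len(sequence))


def fragment_ion_slots(sequence, max_charge=2):
    """List the (ion_type, position, charge) slots a model would fill.

    A peptide of length n has n - 1 backbone cleavage sites, each giving
    one b ion and one y ion, at every considered charge state.
    """
    slots = []
    for position in range(1, len(sequence)):
        for ion_type in ("y", "b"):
            for charge in range(1, max_charge + 1):
                slots.append((ion_type, position, charge))
    return slots


print(encode_peptide("PEPK", max_length=6))
print(fragment_ion_slots("PEPK"))
```

With a fixed slot order like this, the model output can be a plain vector of intensities, one value per slot.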
On the transformer side, I must admit that many elements of this project are derived from DETR (End-to-End Object Detection with Transformers), in particular the trainable embeddings used as input for the decoder, and from some of the concepts discussed on Yannic Kilcher's YouTube channel (which I highly recommend).
Two main reasons: it translates to 'The fragmenter' in Spanish, and the project intends to predict fragmentation. Also, the name was free on PyPI.
You can check how fast the model is on your specific system. Right now the CLI tests the speed only on CPU (the model can also run on a GPU).
Here I generate predictions for the SARS-CoV-2 FASTA file:
poetry run elfragmentador predict --fasta tests/data/fasta/uniprot-proteome_UP000464024_reviewed_yes.fasta --nce 32 --charges 2 --missed_cleavages 0 --min_length 20 --out foo.dlib
...
99%|█████████▉| 1701/1721 [00:14<00:00, 118.30it/s]
...
That is ~100 predictions per second, including pre- and post-processing and writing the EncyclopeDIA library. On a GPU it is closer to ~1000 predictions per second.
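For planning purposes, those rough rates can be turned into a back-of-the-envelope runtime estimate. The helper below is a hypothetical convenience, not part of the elfragmentador CLI, and the rates are the approximate figures quoted above; real throughput depends on your hardware.

```python
def estimated_seconds(num_peptides, predictions_per_second):
    """Estimate wall-clock time for a prediction run at a given rate."""
    if predictions_per_second <= 0:
        raise ValueError("predictions_per_second must be positive")
    return num_peptides / predictions_per_second


# Approximate rates quoted in the text; treat them as order-of-magnitude.
CPU_RATE = 100
GPU_RATE = 1000

for num_peptides in (1_721, 100_000):
    cpu = estimated_seconds(num_peptides, CPU_RATE)
    gpu = estimated_seconds(num_peptides, GPU_RATE)
    print(f"{num_peptides} peptides: ~{cpu:.0f} s on CPU, ~{gpu:.0f} s on GPU")
```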
I have explored many variations of the model, but the one currently distributed is only ~4 MB. Models of up to 200 MB have been tried, and they do not give a big improvement in performance.
elfragmentador_evaluate {your_checkpoint.ckpt} {your_splib.sptxt}