Python package for the high-throughput nontargeted metabolite fingerprinting of nominal mass direct injection mass spectrometry directly from mzML files.
DIMEpy requires Python 3+ and is unfortunately not compatible with Python 2. If you are still using Python 2, a clever workaround is to install Python 3 and use that instead.
You can install it from PyPI using pip:
pip install dimepy
If you want the 'bleeding edge' version of this package, you can also install it directly from this repository using git - but beware of dragons:
pip install git+https://www.github.com/AberystwythSystemsBiology/DIMEpy
To use the package, type the following into your Python console:
>>> import dimepy
At the moment, this pipeline only supports mzML files. You can easily convert proprietary formats to mzML using ProteoWizard.
If you are only going to load in a single file for fingerprint matrix estimation, then just create a new Spectrum object. If the sample belongs to a class or group, it is recommended that you also pass it in (through the stratification argument) when instantiating a new Spectrum object.
>>> filepath = "/file/to/file.mzML"
>>> spec = dimepy.Spectrum(filepath, identifier="example", stratification="class_one")
/file/to/file.mzML
By default, the Spectrum object does not set a signal-to-noise ratio (SNR) estimator. It is strongly recommended that you set a signal-to-noise estimation method when instantiating the Spectrum object.
If your experimental protocol makes use of mixed-polarity scanning, then please ensure that you limit the scan ranges to best match what polarity you're interested in analysing:
>>> spec.limit_polarity("negative")
If you are using FIE-MS it is strongly recommended that you use just the infusion profile to generate your mass spectrum. For example, if your scan profiles look like this:
  |        _
T |       / \
I |      /   \_
C |_____/      \_________________
  0     0.5     1     1.5     2 [min]
Then it is fair to assume that the infusion occurred during the scans ranging from 30 seconds to 1 minute. The limit_infusion() method does this by estimating the median absolute deviation (MAD) of the total ion counts (TIC) and then limiting the profile to the time range in which the TIC exceeds the chosen multiple of the MAD:
>>> spec.limit_infusion(2) # 2 times the MAD.
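To make the idea concrete, here is a minimal sketch of MAD-based infusion detection using only the standard library. It is not DIMEpy's implementation: the function names `mad` and `infusion_window`, and the choice of threshold (median TIC plus a multiple of the MAD), are assumptions made for illustration.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of a sequence of numbers."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


def infusion_window(times, tics, multiple=2):
    """Return (start, end) of the scans whose TIC exceeds the threshold.

    Assumption: the threshold is median(TIC) + multiple * MAD(TIC).
    Returns None when no scan exceeds it.
    """
    threshold = median(tics) + multiple * mad(tics)
    selected = [t for t, tic in zip(times, tics) if tic > threshold]
    if not selected:
        return None
    return selected[0], selected[-1]


# Hypothetical scan times (minutes) and total ion counts.
times = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]
tics = [10, 11, 90, 100, 60, 12, 11, 10, 9]
window = infusion_window(times, tics, multiple=2)
```

With this made-up profile, the window spans 0.5 to 1.0 minutes, matching the infusion region in the plot above.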
Now, we are free to load in the scans to generate a base mass_spectrum:
>>> spec.load_scans()
You should now be able to access the generated mass spectrum using the masses and intensities attributes:
>>> spec.masses
array([ ... ])
>>> spec.intensities
array([ ... ])
A more realistic pipeline would be to use multiple mass-spectrum files. This is where things really start to get interesting. The SpectrumList object facilitates this through the use of the append method:
>>> speclist = dimepy.SpectrumList()
>>> speclist.append(spec)
You can use a loop to generate Spectrum objects iteratively, or do it manually if you want.
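As a rough sketch of the loop-based approach, the helper below collects every mzML file in a directory and applies the same steps shown above to each one. The helper names (`identifiers_for`, `build_spectrum_list`), the use of the file stem as the identifier, and the one-directory-per-class layout are all assumptions; only the dimepy calls already shown in this README are used.

```python
from pathlib import Path


def identifiers_for(directory):
    """Map an identifier (the file stem) to the path of each .mzML file."""
    return {p.stem: str(p) for p in sorted(Path(directory).glob("*.mzML"))}


def build_spectrum_list(directory, stratification, polarity="negative", mad_multiple=2):
    """Load every .mzML file in `directory` into a single SpectrumList.

    Assumes all files in the directory belong to the same class.
    """
    import dimepy  # imported here so identifiers_for() works without dimepy

    speclist = dimepy.SpectrumList()
    for identifier, filepath in identifiers_for(directory).items():
        spec = dimepy.Spectrum(filepath, identifier=identifier,
                               stratification=stratification)
        spec.limit_polarity(polarity)
        spec.limit_infusion(mad_multiple)
        spec.load_scans()
        speclist.append(spec)
    return speclist
```

Calling `build_spectrum_list("/path/to/class_one", "class_one")` would then give a SpectrumList ready for the steps that follow.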
If you're only using this pipeline to extract mass spectra for MetaboAnalyst, then you can now simply call the to_csv method:
>>> speclist.to_csv("/path/to/output.csv", output_type="metaboanalyst")
That being said, this pipeline contains many of the preprocessing methods found in MetaboAnalyst - so it may be easier for you to just use ours.
As a diagnostic measure, the TIC can provide an estimation of factors that may adversely affect the overall intensity count of a run. As a rule, it is common to remove spectra in which the TIC deviates from the median by more than 2 to 3 times the median absolute deviation. We can do this by calling the detect_outliers method:
>>> speclist.detect_outliers(thresh = 2, verbose=True)
Detected Outliers: outlier_one;outlier_two
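The rule described above can be sketched in a few lines of plain Python. This is an illustration of the general technique, not DIMEpy's code: the function name `detect_tic_outliers`, the dictionary input, and the sample names are assumptions.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of a sequence of numbers."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


def detect_tic_outliers(tic_by_sample, thresh=2):
    """Return sample names whose TIC lies more than thresh * MAD from the median."""
    totals = list(tic_by_sample.values())
    centre = median(totals)
    spread = mad(totals)
    return [name for name, tic in tic_by_sample.items()
            if abs(tic - centre) > thresh * spread]


# Hypothetical per-sample total ion counts.
tics = {"a": 100, "b": 101, "c": 99, "d": 100, "e": 102, "outlier_one": 250}
outliers = detect_tic_outliers(tics, thresh=2)
```

Here only `outlier_one` is flagged, because its TIC sits far outside twice the MAD around the median.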
A common first step in the analysis of mass-spectrometry data is to bin the data to a given mass-to-charge (m/z) width. To do this for all Spectrum objects held within our SpectrumList object, simply apply the bin method:
>>> speclist.bin(0.25) # binning our data to a bin width of 0.25 m/z
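For readers unfamiliar with binning, the following sketch shows one simple way to do it. It assumes that intensities falling into the same bin are summed and that each bin is labelled by its lower edge; DIMEpy may use a different statistic or labelling, and `bin_spectrum` is a hypothetical name.

```python
import math
from collections import defaultdict


def bin_spectrum(masses, intensities, bin_width=0.25):
    """Group m/z values into fixed-width bins, summing intensities per bin."""
    binned = defaultdict(float)
    for mz, intensity in zip(masses, intensities):
        # Lower edge of the bin that contains this m/z value.
        lower_edge = math.floor(mz / bin_width) * bin_width
        binned[round(lower_edge, 6)] += intensity
    return dict(sorted(binned.items()))


# Hypothetical peaks: two fall into the 100.0 bin.
binned = bin_spectrum([100.05, 100.20, 100.30, 100.74], [10, 5, 7, 3], 0.25)
```

A wider bin reduces the number of features at the cost of mass resolution, which suits nominal mass data.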
In FIE-MS, null values should account for no more than 3% of the total number of identified bins. However, imputation is required to streamline the analysis process (as most multivariate techniques are unable to accommodate missing data points). To perform value imputation, just use value_imputate:
>>> speclist.value_imputate()
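The README does not say which imputation strategy is applied, so the sketch below uses one common choice purely as an illustration: replacing each missing bin with half of the smallest observed intensity. Both function names and the use of `None` for missing values are assumptions.

```python
def missing_fraction(values):
    """Fraction of bins with no recorded intensity (None)."""
    return sum(v is None for v in values) / len(values)


def impute_half_minimum(values):
    """Replace missing values with half the smallest observed value."""
    observed = [v for v in values if v is not None]
    fill = min(observed) / 2
    return [fill if v is None else v for v in values]


# Hypothetical binned intensities with one missing bin.
intensities = [4.0, None, 10.0, 8.0]
fraction = missing_fraction(intensities)
imputed = impute_half_minimum(intensities)
```

Checking `missing_fraction` before imputing is a quick way to see whether a sample exceeds the 3% guideline mentioned above.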
Transforming and normalising the Spectrum objects in a sample-independent fashion can now be done using the following:
>>> speclist.transform()
>>> speclist.normalise()
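"Sample-independent" here means each spectrum is processed using only its own values. As a hedged illustration, the sketch below applies a log10 transform followed by total ion count normalisation; these particular methods, the `+ 1` offset, and the function names are assumptions rather than a description of what DIMEpy does by default.

```python
import math


def log10_transform(intensities):
    """Apply log10(x + 1) so that zero intensities stay defined."""
    return [math.log10(v + 1) for v in intensities]


def tic_normalise(intensities):
    """Scale intensities so that they sum to 1 within a single spectrum."""
    total = sum(intensities)
    if total == 0:
        return list(intensities)
    return [v / total for v in intensities]


# Hypothetical intensities for a single spectrum.
transformed = log10_transform([9, 99])
normalised = tic_normalise([1, 3])
```

Because neither function looks at other samples, the result for one spectrum does not change when samples are added or removed.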
Once completed, you are now free to export the data to a data matrix:
>>> speclist.to_csv("/path/to/proc_metabo.csv", output_type="matrix")
This should give you something akin to:
Sample ID | M0 | M1 | M2 | M3 | ... |
---|---|---|---|---|---|
Sample 1 | 213 | 634 | 3213 | 546 | ... |
Sample 2 | 132 | 34 | 713 | 6546 | ... |
Sample 3 | 1337 | 42 | 69 | 420 | ... |
Please report all bugs or feature suggestions to the issues tracker. Please do not email me directly as I'm struggling to keep track of what needs to be fixed.
We welcome all sorts of contribution, so please be as candid as you want(!)
Documentation for the project can be found on its readthedocs page.
DIMEpy is licensed under the GNU General Public License v3.0.