
BioLearns: Computational Biology and Bioinformatics Toolbox in Python

0.0.62
PyPI

Maintainers: 1

biolearns

BioLearns: Computational Biology and Bioinformatics Toolbox in Python http://biolearns.medicine.iu.edu

Installation

From PyPI

pip install biolearns -U

Documentation and Tutorials

Three examples are shown below. For the full list of tutorials, see our GitHub wiki page:

Wiki

Disclaimer

Please note that this is a pre-release version of BioLearns, which is still undergoing final testing before its official release. The website, its software, and all content found on it are provided on an "as is" and "as available" basis. BioLearns does not give any warranties, whether express or implied, as to the suitability or usability of the website, its software, or any of its content. BioLearns will not be liable for any loss, whether such loss is direct, indirect, special, or consequential, suffered by any party as a result of their use of the libraries or content. Any usage of the libraries is done at the user's own risk, and the user will be solely responsible for any damage to any computer system or loss of data that results from such activities. Should you encounter any bugs, glitches, lack of functionality, or other problems on the website, please let us know immediately so we can rectify them. Your help in this regard is greatly appreciated.

1. Read TCGA Data

Example: Read TCGA Breast invasive carcinoma (BRCA) data

Data is downloaded directly from https://gdac.broadinstitute.org/. The results here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

from biolearns.dataset import TCGA

brca = TCGA('BRCA')
mRNAseq = brca.mRNAseq
clinical = brca.clinical

TCGA cancer table shortcut:

	Barcode	Cancer full name	Version
1	ACC	Adrenocortical carcinoma	2016_01_28
2	BLCA	Bladder urothelial carcinoma	2016_01_28
3	BRCA	Breast invasive carcinoma	2016_01_28
4	CESC	Cervical and endocervical cancers	2016_01_28
5	CHOL	Cholangiocarcinoma	2016_01_28
6	COAD	Colon adenocarcinoma	2016_01_28
7	COADREAD	Colorectal adenocarcinoma	2016_01_28
8	DLBC	Lymphoid Neoplasm Diffuse Large B-cell Lymphoma	2016_01_28
9	ESCA	Esophageal carcinoma	2016_01_28
...	...	...	...

2. Gene Co-expression Analysis

We first download and access the mRNAseq data.

from biolearns.dataset import TCGA

brca = TCGA('BRCA')
mRNAseq = brca.mRNAseq

mRNAseq data is noisy. We filter out the 50% of genes with the lowest mean values, and then filter out the 50% of the remaining genes with the lowest variance.

from biolearns.preprocessing import expression_filter
mRNAseq = expression_filter(mRNAseq, meanq = 0.5, varq = 0.5)
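The two-stage filter described above can be illustrated on toy data. The following is a minimal sketch of the idea only, not the biolearns implementation: the helper name `quantile_filter`, the strict `>` comparison against the quantile, and the random toy matrix are all assumptions made for illustration.

```python
import numpy as np

def quantile_filter(expr, meanq=0.5, varq=0.5):
    """Return row indices of genes that pass a mean filter, then a variance filter.

    Hypothetical helper for illustration; biolearns' expression_filter may
    differ in details such as tie handling or the variance estimator.
    """
    expr = np.asarray(expr, dtype=float)
    # Stage 1: drop genes whose mean is at or below the meanq quantile.
    means = expr.mean(axis=1)
    idx = np.nonzero(means > np.quantile(means, meanq))[0]
    # Stage 2: among the survivors, drop genes at or below the varq quantile of variance.
    variances = expr[idx].var(axis=1)
    return idx[variances > np.quantile(variances, varq)]

rng = np.random.default_rng(0)
toy = rng.gamma(shape=2.0, scale=3.0, size=(8, 6))  # 8 genes x 6 samples (made up)
kept = quantile_filter(toy, meanq=0.5, varq=0.5)
print(len(kept), "of", toy.shape[0], "genes kept")
```

With both quantiles at 0.5, roughly a quarter of the genes survive, which is why the filtered matrix is much smaller than the input.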

We then use the lmQCM class to create an lmQCM object, lobj.

The gene co-expression analysis is performed by simply calling the fit() function.

from biolearns.coexpression import lmQCM

lobj = lmQCM(mRNAseq)
clusters, genes, eigengene_mat = lobj.fit()

3. Univariate Survival Analysis

We first download and access the mRNAseq and clinical data, using breast cancer as an example.

import numpy as np
from biolearns.dataset import TCGA

brca = TCGA('BRCA')
mRNAseq = brca.mRNAseq
clinical = brca.clinical  # needed below to match patients

We import logranktest from the survival subpackage and choose the gene "ABLIM3" as the univariate input.

from biolearns.survival import logranktest

r = mRNAseq.loc['ABLIM3',].values

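We find the intersection of the univariate, time, and event data by matching the first 12 characters of each barcode (the patient-level identifier):

bcd_m = [b[:12] for b in mRNAseq.columns]
bcd_p = [b[:12] for b in clinical.index]
# idx_m and idx_p are the positions of the shared barcodes in bcd_m and bcd_p,
# in the same (sorted) order, so r, t, and e stay aligned per patient.
# If a patient has several samples, the first one is used.
bcd, idx_m, idx_p = np.intersect1d(bcd_m, bcd_p, return_indices=True)

r = r[idx_m]
t = np.asarray(brca.overall_survival_time)[idx_p]
e = np.asarray(brca.overall_survival_event)[idx_p]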
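The barcode matching step can be checked on toy data before running it on a full cohort. This is a minimal sketch with made-up barcodes and values; it assumes only that NumPy (1.15 or later, for `return_indices`) is available and does not touch biolearns.

```python
import numpy as np

# Made-up sample-level barcodes (expression columns) and patient-level barcodes (clinical rows).
sample_barcodes = ["TCGA-AA-0003-01A", "TCGA-AA-0001-01A",
                   "TCGA-AA-0001-11A", "TCGA-AA-0009-01A"]
patient_barcodes = ["TCGA-AA-0001", "TCGA-AA-0002", "TCGA-AA-0003"]
expression = np.array([3.0, 1.0, 1.5, 9.0])      # one value per sample
survival_time = np.array([100.0, 200.0, 300.0])  # one value per patient

# Truncate sample barcodes to the 12-character patient identifier.
bcd_m = [b[:12] for b in sample_barcodes]

# return_indices=True yields positions into each input, ordered like `common`.
common, idx_m, idx_p = np.intersect1d(bcd_m, patient_barcodes, return_indices=True)

r = expression[idx_m]     # expression of the first sample of each shared patient
t = survival_time[idx_p]  # survival time of the same patients, same order
print(list(zip(common, r, t)))
```

Each tuple pairs one patient with that patient's expression value and survival time, which is the property the survival test relies on.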

We perform log-rank test:

logrank_results, fig = logranktest(r[~np.isnan(t)], t[~np.isnan(t)], e[~np.isnan(t)])
test_statistic, p_value = logrank_results.test_statistic, logrank_results.p_value
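For readers who want to see what a two-group log-rank test computes, here is a minimal sketch in plain NumPy. It is not the biolearns implementation: the helper name `logrank_two_groups`, the median split used to form the two groups, and the toy data are assumptions, since the source does not state how `logranktest` dichotomizes its input.

```python
import math
import numpy as np

def logrank_two_groups(time, event, group):
    """Two-group log-rank statistic and chi-square (1 df) p-value.

    Hypothetical helper for illustration only.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    obs_minus_exp, variance = 0.0, 0.0
    for tj in np.unique(time[event]):           # each distinct event time
        at_risk = time >= tj
        n, n1 = at_risk.sum(), (at_risk & group).sum()
        dying = (time == tj) & event
        d, d1 = dying.sum(), (dying & group).sum()
        obs_minus_exp += d1 - d * n1 / n         # observed minus expected events in group 1
        if n > 1:                                # hypergeometric variance term
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    stat = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(stat / 2))     # survival function of chi-square, 1 df
    return stat, p_value

# Toy data (made up): high expressers die early, low expressers late.
r_toy = np.array([9.1, 8.2, 7.3, 6.4, 5.5, 1.0, 2.0, 3.0, 4.0, 5.0])
t_toy = np.arange(1.0, 11.0)
e_toy = np.ones(10, dtype=bool)
high = r_toy > np.median(r_toy)                  # assumed median split
stat, p_value = logrank_two_groups(t_toy, e_toy, high)
print(round(stat, 2), p_value < 0.01)
```

A small p-value here indicates that survival differs between the high- and low-expression groups in the toy data.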

The output figure is returned as fig (image not reproduced here).
