
Metagene Profiling Analysis and Visualization
This tool analyzes metagene profiles, that is, the distribution of genomic features relative to gene regions (5'UTR, CDS, 3'UTR), and creates publication-ready metagene profile plots.
Install metagene using pip:
pip install metagene
Minimum Python version: 3.12
Basic metagene analysis using a built-in reference:
# Using built-in human genome reference (GRCh38)
metagene -i sites.tsv.gz -r GRCh38 --with-header -m 1,2,3 -w 5 \
-o output.tsv -s scores.tsv -p plot.png
Using a custom GTF file:
# Using custom GTF annotation
metagene -i sites.bed -g custom.gtf.gz -m 1,2,3 -w 5 \
-o output.tsv -s scores.tsv -p plot.png
Using the Python API:
from metagene import (
    load_sites, load_reference, map_to_transcripts,
    normalize_positions, plot_profile
)

# Load your genomic sites
sites_df = load_sites("sites.tsv.gz", with_header=True, meta_col_index=[0, 1, 2])

# Load reference genome annotation
reference_df = load_reference("GRCh38")  # or load_gtf("custom.gtf.gz")

# Perform metagene analysis
annotated_df = map_to_transcripts(sites_df, reference_df)
gene_bins, gene_stats, gene_splits = normalize_positions(
    annotated_df, split_strategy="median", bin_number=100
)

# Generate plot
plot_profile(gene_bins, gene_splits, "metagene_plot.png")

# Summarize the results
print(f"Analyzed {gene_bins['count'].sum()} sites")
print(f"Gene splits - 5'UTR: {gene_splits[0]:.3f}, CDS: {gene_splits[1]:.3f}, 3'UTR: {gene_splits[2]:.3f}")
print(f"Gene statistics - 5'UTR: {gene_stats['5UTR']}, CDS: {gene_stats['CDS']}, 3'UTR: {gene_stats['3UTR']}")
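The command-line examples pass `-m 1,2,3`, while the Python example passes `meta_col_index=[0, 1, 2]`. Assuming both refer to the same three columns (1-based on the command line, 0-based in Python), the conversion can be sketched as below. The helper `cli_columns_to_indices` is hypothetical and not part of metagene.

```python
def cli_columns_to_indices(spec: str) -> list[int]:
    """Convert a 1-based, comma-separated column spec such as "1,2,3"
    into 0-based indices (hypothetical helper, not part of metagene)."""
    indices = []
    for token in spec.split(","):
        column = int(token.strip())
        if column < 1:
            raise ValueError(f"column numbers are 1-based, got {column}")
        indices.append(column - 1)
    return indices


meta_col_index = cli_columns_to_indices("1,2,3")
print(meta_col_index)  # [0, 1, 2]
```

The resulting list has the same shape as the `meta_col_index` argument in the example above.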
Tab-separated file with a header line:
ref pos strand score pvalue
chr1 1000000 + 0.85 0.001
chr1 2000000 - 0.72 0.005
BED file:
chr1 999999 1000000 score1 0.85 +
chr1 1999999 2000000 score2 0.72 -
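The two examples appear to describe the same sites in different coordinate systems: BED intervals are 0-based and half-open, so the single-base interval `999999-1000000` corresponds to the 1-based position `1000000`. A minimal sketch of that conversion, assuming single-base BED records (`bed_to_site` is a hypothetical helper, not part of metagene):

```python
def bed_to_site(chrom: str, start: int, end: int, strand: str) -> tuple[str, int, str]:
    """Convert a single-base BED interval (0-based, half-open) to a
    1-based site (hypothetical helper, not part of metagene)."""
    if end - start != 1:
        raise ValueError("expected a single-base interval")
    # The exclusive end of a 0-based half-open interval equals the
    # 1-based coordinate of its last (here: only) base.
    return chrom, end, strand


print(bed_to_site("chr1", 999999, 1000000, "+"))  # ('chr1', 1000000, '+')
```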
- -m/--meta-columns to specify coordinate columns (1-based indexing)
- -w/--weight-columns to specify score/weight columns
- -H/--with-header if your file has a header line
Metagene includes pre-processed gene annotations for major model organisms:
Species | Assembly | Reference |
---|---|---|
Human | GRCh38/hg38 | GRCh38, hg38 |
Human | GRCh37/hg19 | GRCh37, hg19 |
Mouse | GRCm39/mm39 | GRCm39, mm39 |
Mouse | GRCm38/mm10 | GRCm38, mm10 |
Mouse | mm9/NCBIM37 | mm9, NCBIM37 |
Arabidopsis | TAIR10 | TAIR10 |
Rice | IRGSP-1.0 | IRGSP-1.0 |
Model Organisms | Various | dm6, ce11, WBcel235, sacCer3, etc. |
List all available references:
metagene --list
This will show all 23+ available references organized by species:
Human:
GRCh37 - Human genome GRCh37 (Ensembl release 75)
GRCh38 - Human genome GRCh38 (Ensembl release 110)
hg19 - Human genome hg19 (UCSC 2021)
hg38 - Human genome hg38 (UCSC 2022)
Mouse:
GRCm38 - Mouse genome GRCm38 (Ensembl release 102)
GRCm39 - Mouse genome GRCm39 (Ensembl release 110)
mm10 - Mouse genome mm10 (UCSC 2021)
mm39 - Mouse genome mm39 (UCSC 2024)
mm9 - Mouse genome mm9 (UCSC 2020)
... and more
Download a specific reference:
metagene --download GRCh38
Download all references (requires ~10GB disk space):
metagene --download all
Usage: metagene [OPTIONS]

  Run metagene analysis on genomic sites.

Options:
  --version                  Show the version and exit.
  -i, --input PATH           Input file path (BED, GTF, TSV or CSV, etc.)
  -o, --output PATH          Output file path (TSV, CSV)
  -s, --output-score PATH    Output file for binned score statistics
  -p, --output-figure PATH   Output file for metagene plot
  -r, --reference TEXT       Built-in reference genome to use (e.g.,
                             GRCh38, GRCm39)
  -g, --gtf PATH             GTF/GFF file path for custom reference
  --region                   Region to analyze (default: all)
  -b, --bins INTEGER         Number of bins for analysis (default: 100)
  -H, --with-header          Input file has header line
  -S, --separator TEXT       Separator for input file (default: tab)
  -m, --meta-columns TEXT    Input column indices (1-based) for genomic
                             coordinates. The columns should contain
                             Chromosome,Start,End,Strand or
                             Chromosome,Site,Strand
  -w, --weight-columns TEXT  Input column indices (1-based) for
                             weight/score values
  -n, --weight-names TEXT    Names for weight columns
  --score-transform
                             Transform to apply to scores (default: none)
  --normalize                Normalize scores by transcript length
  --list                     List all available built-in references and
                             exit
  --download TEXT            Download a specific reference (e.g., GRCh38)
                             or 'all' for all references
  -h, --help                 Show this message and exit.
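When the command line is driven from a script, the options listed above can be assembled programmatically. The sketch below only builds the argument list and uses only flags from the help text; `build_metagene_command` is a hypothetical helper, and the file names are the placeholders from the earlier examples.

```python
import shlex


def build_metagene_command(input_path, output_path, reference,
                           meta_columns, weight_columns=(),
                           with_header=False, bins=100):
    """Assemble a metagene CLI invocation as an argument list
    (hypothetical helper; uses only flags shown in the help text)."""
    cmd = [
        "metagene",
        "-i", input_path,
        "-o", output_path,
        "-r", reference,
        "-m", ",".join(str(c) for c in meta_columns),  # 1-based columns
        "-b", str(bins),
    ]
    if weight_columns:
        cmd += ["-w", ",".join(str(c) for c in weight_columns)]
    if with_header:
        cmd.append("-H")
    return cmd


cmd = build_metagene_command(
    "sites.tsv.gz", "output.tsv", "GRCh38",
    meta_columns=[1, 2, 3], weight_columns=[5], with_header=True,
)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would then run the analysis without going through a shell.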
load_sites(file, with_header=False, meta_col_index=[0,1,2]) - Load genomic sites
load_reference(name) - Load built-in reference genome
load_gtf(file) - Load custom GTF annotation
map_to_transcripts(sites, reference) - Annotate sites with gene information
normalize_positions(annotated_sites, strategy="median") - Normalize to relative positions
plot_profile(data, gene_splits, output_file) - Generate metagene plot
The plot shows the distribution of genomic sites across normalized gene regions.
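To make the idea of normalized gene regions concrete, here is a conceptual sketch of how a site's offset within a transcript could be mapped onto a shared 0-to-1 axis and then assigned to a bin. It illustrates the general metagene technique only and is not metagene's actual implementation. The function names, and the assumption that the three region widths sum to 1, are illustrative.

```python
def relative_position(pos, utr5_len, cds_len, utr3_len,
                      splits=(0.25, 0.5, 0.25)):
    """Map a 0-based transcript offset to a metagene coordinate in [0, 1).

    `splits` gives the assumed axis width of 5'UTR, CDS and 3'UTR and
    should sum to 1 (illustrative values, not taken from metagene).
    """
    if pos < 0:
        raise ValueError("position must be non-negative")
    region_start = 0   # transcript offset where the current region begins
    axis_offset = 0.0  # metagene coordinate where the current region begins
    for length, width in zip((utr5_len, cds_len, utr3_len), splits):
        if pos < region_start + length:
            # Fraction of the way through this region, scaled to its width
            return axis_offset + width * (pos - region_start) / length
        region_start += length
        axis_offset += width
    raise ValueError("position lies outside the transcript")


def bin_index(x, bin_number=100):
    """Assign a metagene coordinate in [0, 1) to one of `bin_number` bins."""
    return min(int(x * bin_number), bin_number - 1)


# Transcript with a 100 nt 5'UTR, 300 nt CDS and 200 nt 3'UTR;
# offset 250 is halfway through the CDS.
x = relative_position(250, 100, 300, 200)
print(x, bin_index(x))  # 0.5 50
```

Scaling each region separately puts transcripts of different lengths on the same axis, so counts per bin can be summed across genes.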