
[!CAUTION] You may want to add `pip` as an alias for `pip3`.
On your terminal:
echo "alias pip='pip3'" >> ~/.bashrc
source ~/.bashrc
Or simply use `pip3` instead of `pip`.
pip install probe_design -U
This adds `prb` (short for probe design) as a shell command.
On your terminal:
pip install git+https://github.com/ggirelli/oligo-melting.git
[!NOTE] nHUSH, HUSH and escafish are private repositories
Install the dev branch of nHUSH
Install HUSH
Install escafish
Install OligoArrayAux
Get the genomic coordinates of the regions of interest
Get the reference genome
Get the transcripts of interest
Get the reference transcriptome
For DNA probes, the reference genome will be used both to extract the sequences of interest and to test probe candidates for homology. If different genomes need to be used, follow the RNA steps and provide the regions of interest directly.
For combined DNA-RNA FISH, the probe sets should be designed with a homology check against both genome and transcriptome.
All the commands below assume you are starting from your project directory
To create a project directory and change into it:
mkdir <project_name>
cd <project_name>
Inside the project directory:
prb makedirs
This will create the `data` directory and its subdirectories `data/rois` and `data/ref`.
The `data/` folder should only contain `data/rois/` and `data/ref/` (and possibly `data/blacklist/`, see 6.). If more folders are included, consider making a back-up or simply removing them.
[!CAUTION]
- Your regions of interest file MUST be named `all_regions.tsv`.
- `all_regions.tsv` MUST follow the EXAMPLE format.
- `all_regions.tsv` MUST be placed within the `data/rois` folder.
data/rois/all_regions.tsv
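The layout rules above can be checked before starting a run. The following is a minimal sketch, not part of `prb`: the function name `check_layout` and the `ALLOWED` set are hypothetical, and the set only reflects the folders named above (add `exclude` when working in exclusion mode). It does not validate the contents of `all_regions.tsv`, only its location.

```python
from pathlib import Path

# Subfolders that may live under data/ according to the text above
# (blacklist is optional). This set is an assumption for illustration.
ALLOWED = {"rois", "ref", "blacklist"}


def check_layout(project_dir):
    """Return a list of human-readable problems with the data/ layout."""
    data = Path(project_dir) / "data"
    if not data.is_dir():
        return [f"missing directory: {data}"]
    problems = []
    for required in ("rois", "ref"):
        if not (data / required).is_dir():
            problems.append(f"missing directory: data/{required}")
    # Flag any extra subfolder so it can be backed up or removed.
    for child in sorted(data.iterdir()):
        if child.is_dir() and child.name not in ALLOWED:
            problems.append(f"unexpected directory: data/{child.name}")
    if not (data / "rois" / "all_regions.tsv").is_file():
        problems.append("missing file: data/rois/all_regions.tsv")
    return problems
```

Calling `check_layout(".")` from the project directory returns an empty list when the layout matches the description.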
For CHM13 T2T
prb get_T2T
Options:
-p: prefix for the chromosome files; default: CHM13.T2T. Files will be named prefix.chromosome.ID.fa, where ID stands for the chromosome ID, i.e., 1-22, X, Y, M.
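To see which files to expect after the download, the naming pattern can be expanded explicitly. This is a sketch under assumptions: it reads the pattern as the literal string `prefix.chromosome.ID.fa` with IDs 1 to 22 plus X, Y and M, and the helper name `t2t_chromosome_files` is hypothetical. The files actually written by `prb get_T2T` may differ.

```python
def t2t_chromosome_files(prefix="CHM13.T2T"):
    """List the expected per-chromosome FASTA names for a given prefix."""
    # Autosomes 1-22 followed by the sex chromosomes and the mitochondrion.
    ids = [str(n) for n in range(1, 23)] + ["X", "Y", "M"]
    return [f"{prefix}.chromosome.{cid}.fa" for cid in ids]
```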
For GRCh38
prb get_GRC -split
usage: prb get_GRC [-h] [-s {homo_sapiens,mus_musculus}] [-b BUILD] [-r RELEASE] [-d DIR] [-f FILENAME] [-k] [-split]
download ensemble genome
options:
-h, --help
------> show this help message and exit
-s {homo_sapiens,mus_musculus}, --species {homo_sapiens,mus_musculus}
-b BUILD, --build BUILD
------> the build number of the genome
-r RELEASE, --release RELEASE
------> release number of the build
-d DIR, --dir DIR destination directory
-f FILENAME, --filename FILENAME
------> give a specific name to the downloaded file
-k, --keep
------> whether to keep gzip files
-split
------> whether to split into chromosomes
prb get_oligos DNA|RNA [optional: applyGCfilter 0|1]
# Example:
prb get_oligos DNA 1
[!NOTE] If indicating `RNA`, the module will assume that the transcript / region sequences are already present in the `data/regions` folder. Default: `DNA`.
Testing full-length oligos (of length `L`) at once can be sped up by testing shorter sublength oligos (of length `l`).
-m: number of mismatches to test for (always use 1 when running sublength); -t: number of threads; -i: comb size.
[!CAUTION] Make sure your length (`-L`) here matches the length in your `all_regions.tsv` file.
prb run_nHUSH -d RNA -L 35 -m 5 -t 40 -i 14
prb run_nHUSH -d DNA -L 40 -l 21 -m 3 -t 40 -i 14
prb run_nHUSH -d {DNA|RNA} -L {length} -l (optional){sublength} -m {number of mismatches} -t {threads} -i {comb size}
prb unfinished_HUSH
prb reform_hush_combined DNA|RNA|-RNA length sublength until
e.g., prb reform_hush_combined DNA 40 21 3
(`until` denotes the same number as specified after `-m` when running nHUSH.)
prb melt_secs_parallel (optional DNA(ref) | RNA(rev. compl))
e.g., prb melt_secs_parallel DNA
[!NOTE] This only needs to be run once per reference genome if not using any exclusion regions! Just save the blacklist folder between runs.
prb generate_blacklist -L 40 -c 100
L: oligo length
c: min abundance to be included in oligo black list
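The role of `-L` and `-c` can be illustrated on a toy sequence. This is only a sketch of the idea of an abundance threshold, assuming that "abundance" means the number of occurrences of an oligo of length `L`. The function name `abundant_oligos` is hypothetical, and the real `prb generate_blacklist` works on the whole reference and may count differently (for example, across both strands).

```python
from collections import Counter


def abundant_oligos(sequence, length, min_count):
    """Return the set of `length`-mers occurring at least `min_count` times."""
    sequence = sequence.upper()
    # Slide a window of size `length` over the sequence and count each oligo.
    counts = Counter(
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    )
    return {oligo for oligo, n in counts.items() if n >= min_count}
```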
prb build-db_BL -f q_bl -m 32 -i 6 -L 40 -c 100 -d 8 -T 72
m: Maximum length of a consecutive match. Default: 24
i: Maximum length of a consecutive homopolymer. Default: 6
All oligos with a longer consecutive match or homopolymer are strictly excluded.
L: oligo length
c: min number of occurrences for an oligo to be counted in black list
(should match settings used in 6.)
d: min Hamming distance to an oligo in the blacklist for exclusion
T: Target melting temperature. Default: 72C
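Two of the filters listed above, the homopolymer limit (`-i`) and the Hamming distance to the blacklist (`-d`), can be sketched in a few lines. The function names are hypothetical and this is not the `prb build-db_BL` implementation. In particular, the sketch assumes that an oligo is excluded when its distance to any blacklisted oligo is less than or equal to `d`; the exact boundary used by the tool is not stated here.

```python
from itertools import groupby


def longest_homopolymer(oligo):
    """Length of the longest run of one repeated base."""
    return max((len(list(run)) for _, run in groupby(oligo)), default=0)


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def passes_filters(oligo, blacklist, max_homopolymer=6, min_distance=8):
    """Keep an oligo only if it clears both filters (assumed boundaries)."""
    if longest_homopolymer(oligo) > max_homopolymer:
        return False
    # Assumption: a distance <= min_distance to any blacklisted oligo excludes.
    return all(hamming(oligo, bad) > min_distance for bad in blacklist)
```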
prb cycling_query -s DNA -L 40 -m 8 -c 100 -t 40 -g 500 -greedy
[optional: -greedy. Speed > quality] [optional: -start 20 -end 100 -step 5]
To sweep different oligo numbers, otherwise uses the oligo counts provided in ./rois/all_regions.tsv
[optional: -stepdown 10]
Number of oligos to decrease probe size with every iteration that does not find enough oligos. Default: 1
The cycling query generates probe candidates, checks the resulting oligos using HUSH, removes unacceptable oligos, and generates probes again.
If enough oligos cannot be found, probes are designed with fewer oligos, decreasing by `stepdown` at each step.
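The effect of `-stepdown` on the probe sizes that get tried can be sketched as a simple countdown. This is an illustration only: `stepdown_schedule` and its `minimum` parameter are hypothetical, and the stopping rule inside `prb cycling_query` may differ.

```python
def stepdown_schedule(n_oligos, stepdown=1, minimum=1):
    """Yield the probe sizes to try, largest first."""
    if stepdown < 1:
        raise ValueError("stepdown must be a positive integer")
    size = n_oligos
    # Keep shrinking the target size until it drops below the floor.
    while size >= minimum:
        yield size
        size -= stepdown
```

A larger `stepdown` reaches a feasible probe size in fewer iterations, at the cost of possibly skipping a larger size that would have worked.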
prb summarize_probes_final
Some visual elements can be obtained using the following notebooks (TODO!):
prb plot_probe_candidates
prb plot_oligos
In this alternative, the region (along with any user-indicated repeats) is masked out from the reference genome used by nHUSH. This way, repeated oligos that are specific for the ROI can be included in the final probe.
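The idea of masking can be shown on a toy sequence. The sketch below assumes intervals are given as 0-based, half-open coordinates (the BED convention) and that masking means replacing bases with `N`. Both the function name `mask_regions` and the choice of `N` are assumptions, not a description of what `prb exclude_region` does internally.

```python
def mask_regions(sequence, intervals, mask_char="N"):
    """Replace each [start, end) interval of `sequence` with `mask_char`."""
    masked = list(sequence)
    for start, end in intervals:
        # Clamp to the sequence bounds so out-of-range intervals are harmless.
        start, end = max(start, 0), min(end, len(masked))
        for i in range(start, end):
            masked[i] = mask_char
    return "".join(masked)
```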
prb makedirs
[!CAUTION]
- Your regions of interest file MUST be named `all_regions.tsv`.
- `all_regions.tsv` MUST follow the EXAMPLE format.
- `all_regions.tsv` MUST be placed within the `data/rois` folder.
In addition to `data/rois/` and `data/ref/`, the pipeline requires a `data/exclude/` folder containing BED files with the coordinates of the sections to mask out when running HUSH for each ROI.
For CHM13 T2T (advised for repetitive regions)
prb get_T2T
Options:
-p: prefix for the chromosome files; default: CHM13.T2T. Files will be named prefix.chromosome.ID.fa, where ID stands for the chromosome ID, i.e., 1-22, X, Y, M.
prb generate_exclude
# (from Pipeline/)
prb get_oligos DNA|RNA [optional: applyGCfilter 0|1]
# Example:
prb get_oligos DNA
If indicating `RNA`, the module will assume that the transcript / region sequences are already present in the `data/regions` folder. Default: `DNA`.
prb exclude_region
prb generate_blacklist -L 40 -c 100
Needs to be re-run every time when using exclusion masks. L: oligo length; c: min abundance to be included in oligo blacklist
Testing full-length oligos (of length `L`) at once can be sped up by testing shorter sublength oligos (of length `l`).
-m: number of mismatches to test for (minimum 1 for sublength; more gives better information but takes longer); -t: number of threads; -i: comb size.
Sublength:
prb run_nHUSH_excl -d DNA -L 40 -l 21 -m 3 -t 40 -i 14
prb run_nHUSH_excl -d {DNA|RNA} -L {length} -l (optional){sublength} -m {number of mismatches} -t {threads} -i {comb size}
Note the `_excl` suffix specific to the exclusion mode.
In case nHUSH is interrupted before completion, run before continuing:
prb unfinished_HUSH
# Format:
prb reform_hush_combined DNA|RNA|-RNA length sublength until
# Example:
prb reform_hush_combined DNA 40 21 3
(`until` denotes the same number as specified after `-m` when running nHUSH.)
prb melt_secs_parallel (optional DNA(ref) / RNA(rev. compl))
e.g., prb melt_secs_parallel DNA
Recommended:
prb build-db_BL -f q_bl -m 32 -i 6 -L 40 -c 100 -d 8 -T 72
f: score function
d: max Hamming distance to blacklist that is excluded
L: oligo length
c: min abundance to be included in oligo blacklist
i: max identical consecutive base pairs
T: target temperature
m: max length of consecutive off-target match
prb cycling_query -s DNA -L 40 -m 8 -c 100 -t 40 -g 500 -stepdown 50 -greedy -excl
[optional: -greedy. Speed > quality] [optional: -start 20 -end 100 -step 5]
To sweep different oligo numbers, otherwise uses the oligo counts provided in ./rois/all_regions.tsv
[optional: -stepdown 10]
Number of oligos to decrease probe size with every iteration that does not find enough oligos. Default: 1
The cycling query generates probe candidates, checks the resulting oligos using HUSH, removes unacceptable oligos, and generates probes again.
If enough oligos cannot be found, probes are designed with fewer oligos, decreasing by `stepdown` at each step.
prb summarize_probes_final
Probe design for FISH by Quentin Verron