See the full documentation here.
This is a package to extract, process and analyze electrophysiology data recorded with Intan or OpenEphys recording systems. This package is customized to store experiment and analysis metadata for the BLECh Lab (Katz lab) @ Brandeis University, but can readily be used and customized for other labs.
Currently, blechpy is only developed and validated to work properly on Linux operating systems. It is possible to use blechpy on Mac, but some GUI features may not work properly. It is not possible to use blechpy on Windows.
Because blechpy depends on a very specific mix of package versions, it must be installed in a virtual environment. We highly recommend using miniconda to handle your virtual environments. You can download miniconda here: https://docs.conda.io/en/latest/miniconda.html
We recommend using a computer with at least 32 GB of RAM and a multi-core processor. The more cores and memory, the better. Memory usage scales with core usage: the more cores you use, the more memory is needed to run without overflow errors. It is possible to run memory-intensive functions with fewer cores to avoid overflow errors, but this will increase processing time. It is also possible to re-run memory-intensive functions after an overflow error, and the function will pick up where it left off.
Right now this pipeline is only compatible with recordings made with Intan's 'one file per channel' or 'one file per signal type' recording settings.
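The trade-off between cores and memory can be estimated before starting a long run. The following is a minimal sketch, not part of blechpy: `safe_worker_count` and the per-worker memory figure are hypothetical, and you would need to measure the real per-process footprint on your own data.

```python
import os


def safe_worker_count(available_gb, per_worker_gb, n_cores=None):
    """Return how many worker processes fit in memory, capped at the core count."""
    if per_worker_gb <= 0:
        raise ValueError("per_worker_gb must be positive")
    if n_cores is None:
        n_cores = os.cpu_count() or 1
    by_memory = int(available_gb // per_worker_gb)
    # Always allow at least one worker, even if the estimate says memory is tight.
    return max(1, min(n_cores, by_memory))


# Hypothetical numbers: 32 GB of RAM, roughly 4 GB per worker, 16 cores.
workers = safe_worker_count(32, 4, n_cores=16)
print(workers)
```

With these example numbers memory, not the core count, is the limiting factor.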
Create a miniconda environment with:
conda create -n blechpy python==3.7.13
conda activate blechpy
Now you can install the package with pip:
pip install blechpy
Once you have installed blechpy, you will need to perform the following steps to "activate" blechpy whenever you want to use it.
conda activate blechpy
ipython
import blechpy
Now, you can use blechpy functions in your ipython console.
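If an import fails or the GUI misbehaves, it can help to confirm that the environment matches the requirements described earlier (Linux, Python 3.7). The helper below is a hypothetical sketch using only the standard library; it is not a blechpy function.

```python
import platform
import sys


def check_environment(system=None, version=None):
    """Return a list of warnings about the operating system and Python version."""
    system = system or platform.system()
    version = tuple(version or sys.version_info[:2])[:2]
    warnings = []
    if system == "Windows":
        warnings.append("blechpy cannot be used on Windows")
    elif system == "Darwin":
        warnings.append("some GUI features may not work properly on Mac")
    elif system != "Linux":
        warnings.append(f"untested operating system: {system}")
    if version != (3, 7):
        warnings.append(f"expected Python 3.7, found {version[0]}.{version[1]}")
    return warnings


# With no arguments the current interpreter and OS are inspected.
for message in check_environment():
    print("warning:", message)
```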
To update blechpy, open up a bash terminal and type:
conda activate blechpy #activate your blechpy virtual environment
pip install blechpy -U #install updated version of blechpy
If your operating system is Ubuntu version 20.XX LTS, "import blechpy" may throw a "segmentation fault" error. This is because numba version 0.48 available via pip-install is corrupted. You can fix this issue by reinstalling numba via conda, by entering the following command in your bash terminal:
conda install numba=0.48.0
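blechpy handles experimental metadata using data_objects, which are tied to a directory encompassing some level of data. Existing types of data_objects include the dataset (a single recording directory) and the experiment (a directory encasing several recordings).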
With a brand new shiny dataset, the most basic recommended data extraction workflow would be:
dat = blechpy.dataset('/path/to/data/dir/') #create dataset object. Path to data dir should be your recording directory
# IMPORTANT: only run blechpy.dataset ONCE on a dataset, unless you want to overwrite the existing dataset and your preprocessing
# to load an existing dataset, use dat = blechpy.load_dataset('/path/to/data/dir/') instead
dat.initParams(data_quality='hp') # follow GUI prompts.
dat.extract_data() # Extracts raw data into HDF5 store
dat.create_trial_list() # Creates table of digital input triggers
dat.mark_dead_channels() # View traces and label electrodes as dead, or just pass list of dead channels
dat.common_average_reference() # Use common average referencing on data. Replaces raw with referenced data in HDF5 store
dat.detect_spikes() # Detect spikes in data. Replaces raw data with spike data in HDF5 store
dat.blech_clust_run(umap=True) # Cluster data using GMM
dat.sort_spikes(electrode_number) # Split, merge and label clusters as units. Follow GUI prompts. Perform this for every electrode
dat.post_sorting() #run this after you finish sorting all electrodes
dat.make_PSTH_plots() #optional: make PSTH plots for all units
dat.make_raster_plots() #optional: make raster plots for all units
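Since `dat.sort_spikes()` has to be run for every electrode before `dat.post_sorting()`, the two steps can be wrapped in a small loop. This is a sketch under assumptions: `sort_all_electrodes` is a hypothetical helper, the electrode numbers are whatever your recording uses, and `_DemoDataset` is a stand-in that only records calls so the example runs without blechpy.

```python
def sort_all_electrodes(dat, electrodes):
    """Call dat.sort_spikes() once per electrode, then dat.post_sorting()."""
    for electrode in electrodes:
        dat.sort_spikes(electrode)  # interactive in blechpy: follow the GUI prompts
    dat.post_sorting()


class _DemoDataset:
    """Stand-in object for illustration only; not part of blechpy."""

    def __init__(self):
        self.calls = []

    def sort_spikes(self, electrode):
        self.calls.append(("sort_spikes", electrode))

    def post_sorting(self):
        self.calls.append(("post_sorting",))


demo = _DemoDataset()
sort_all_electrodes(demo, range(3))
print(demo.calls)
```

With a real dataset you would pass the object returned by `blechpy.load_dataset()` in place of `demo`.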
It is common to get the following error after running the functions dat.detect_spikes() or dat.blech_clust_run():
TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
If you encounter this error, simply re-run the function which caused the error, and it will pick up where it left off. Occasionally, you may have to do this several times before the function completes.
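Re-running by hand can be replaced with a small retry loop. The sketch below is generic Python rather than a blechpy feature: `rerun_until_done` is a hypothetical helper, and the demo uses `MemoryError` and a stand-in function so it runs anywhere.

```python
def rerun_until_done(step, retry_on=(Exception,), max_attempts=5):
    """Call step() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except retry_on as err:
            print(f"attempt {attempt} of {max_attempts} failed: {err!r}")
            if attempt == max_attempts:
                raise  # give up and surface the last error


# Demo with a stand-in step that fails twice before succeeding.
state = {"calls": 0}


def flaky_step():
    state["calls"] += 1
    if state["calls"] < 3:
        raise MemoryError("simulated worker failure")
    return "done"


result = rerun_until_done(flaky_step, retry_on=(MemoryError,))
print(result, state["calls"])
```

With blechpy this might be called as `rerun_until_done(dat.detect_spikes, retry_on=(TerminatedWorkerError,))`, assuming you import `TerminatedWorkerError` from wherever your installed joblib/loky version defines it.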
The reason for this error is that these functions multi-process across channels, but underlying libraries like scipy also parallelize their own operations. This makes it impossible for the program to know how much memory will be used, so it cannot automatically constrain the number of processes.
blechpy.dataset() makes a NEW dataset or OVERWRITES an existing dataset. DO NOT use it on an existing dataset unless you want to overwrite the existing dataset and lose your preprocessing progress with it.
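One common way to rein in the nested parallelism of numerical libraries is to cap their thread pools through environment variables before the libraries are imported. Whether this reduces the error in blechpy is an untested assumption; `limit_library_threads` is a hypothetical helper, while the variable names are the standard ones read by OpenMP, OpenBLAS and MKL.

```python
import os

THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_library_threads(n_threads=1):
    """Cap the thread pools of common numerical back ends via environment variables."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)


# Must run before numpy/scipy (and therefore blechpy) are imported,
# because the libraries read these variables at load time.
limit_library_threads(1)
print({name: os.environ[name] for name in THREAD_ENV_VARS})
```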
dat = blechpy.dataset('path/to/recording/directory') # replace quoted text with the filepath to the folder where your recording files are
# or
dat = blechpy.dataset() # for user interface to select directory
This will create a new dataset object and set up basic file paths. You should only do this when starting data processing for the first time. If you use it on a processed dataset, it will get overwritten.
If you already have a dataset and want to pick up where you left off, use blechpy.load_dataset() instead of blechpy.dataset().
dat = blechpy.load_dataset('/path/to/recording/directory') # create dataset object
# or
dat = blechpy.load_dataset() # for user interface to select directory
# or
dat = blechpy.load_dataset('path/to/dataset/save/file.p')
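To avoid overwriting a processed dataset by accident, you can check for an existing save file before deciding which function to call. This sketch assumes the save file has a `.p` extension and sits directly in the recording directory, which may not match your setup; `has_saved_dataset` is a hypothetical helper, not a blechpy function.

```python
import tempfile
from pathlib import Path


def has_saved_dataset(rec_dir):
    """Return True if rec_dir contains a '.p' save file (assumed naming convention)."""
    return any(Path(rec_dir).glob("*.p"))


# Demo in a temporary directory so that nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    before = has_saved_dataset(tmp)
    (Path(tmp) / "demo_dataset.p").touch()
    after = has_saved_dataset(tmp)

print(before, after)

# Possible use with blechpy (not executed here):
# if has_saved_dataset(rec_dir):
#     dat = blechpy.load_dataset(rec_dir)
# else:
#     dat = blechpy.dataset(rec_dir)
```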
dat.initParams()
Initializes all analysis parameters with a series of prompts. See prompts for optional keyword params. Primarily sets up parameters for:
Initial parameters are pulled from default json files in the dio subpackage. Parameters for a dataset are written to json files in a parameters folder in the recording directory.
dat.initParams(data_quality='noisy') # alternative: less strict clustering parameters
dat.initParams(car_keyword='2site_OE64') # automatically map channels to hirose-connector 64ch OEPS EIB in 2-site implantation
dat.initParams(car_keyword='bilateral64') # automatically map channels to omnetics-connector 64ch EIB in 2-site implantation
dat.initParams(shell=True) # alternative: bypass GUI interface in favor of shell interface, useful if working over SSH or GUI is broken
# remember that you can chain any combination of valid keyword arguments together, e.g.:
dat.initParams(data_quality='hp', car_keyword='bilateral64', shell=True)
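Because the parameters for a dataset are written to JSON files, they can be inspected with the standard library alone. The sketch below is an illustration under assumptions: `load_param_files` is a hypothetical helper, and the file name and keys in the demo are made up rather than taken from blechpy.

```python
import json
import tempfile
from pathlib import Path


def load_param_files(param_dir):
    """Read every .json file in param_dir into a dict keyed by file name (no extension)."""
    params = {}
    for path in sorted(Path(param_dir).glob("*.json")):
        with path.open() as fh:
            params[path.stem] = json.load(fh)
    return params


# Demo with a made-up parameter file; real file names and keys will differ.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "example_params.json").write_text(json.dumps({"threshold": 5}))
    loaded = load_param_files(tmp)

print(loaded)
```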
dat.mark_dead_channels() # opens GUI to view traces and label dead channels
Marking dead channels is critical for good common average referencing, since dead channels typically have a signal that differs a lot from the "true" average voltage at the electrode tips.
If you already know your dead channels a priori, you can pass them to mark_dead_channels() as a list of integers:
dat.mark_dead_channels([1, 2, 3]) # replace [1, 2, 3] with your own dead channel indices
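The effect of a dead channel on the common average can be seen with a few made-up numbers. This is a generic illustration of common average referencing in plain Python, not blechpy's implementation; `common_average` and the sample traces are hypothetical.

```python
from statistics import mean


def common_average(samples_by_channel, dead_channels=()):
    """Average the live channels at each time point."""
    live = [
        trace
        for channel, trace in sorted(samples_by_channel.items())
        if channel not in dead_channels
    ]
    return [mean(values) for values in zip(*live)]


traces = {
    0: [10.0, 12.0, 11.0],
    1: [11.0, 13.0, 12.0],
    2: [500.0, -500.0, 500.0],  # dead channel with a large, unrelated signal
}

with_dead = common_average(traces)
without_dead = common_average(traces, dead_channels=[2])
print(with_dead)
print(without_dead)
```

Including channel 2 drags the average far away from the two live channels, which is why dead channels should be excluded before referencing.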
blech_clust_run's keywords can change the clustering algorithm and/or parameters:
dat.blech_clust_run(data_quality='noisy') # alternative: re-run clustering with less strict parameters
dat.blech_clust_run(umap=True) # alternative: use UMAP instead of PCA before clustering, improves clustering
dat.blech_clust_run() # default uses PCA instead of UMAP, which is faster, but lower quality clustering
If you want to move a dataset folder, it is critical you perform the following steps:
new_directory = 'path/to/new/dataset/folder' # You can paste the directory by right clicking and selecting 'paste filename'
dat = blechpy.load_dataset(new_directory) # load the dataset
dat._change_root(new_directory) # change the root directory of the dataset to the new directory
dat.save() # save the new directory to the dataset file
dat.processing_status
This provides an overview of the basic data extraction and processing steps that need to be taken.
Datasets can be easily viewed with: print(dat)
A summary can also be exported to a text file with: dat.export_to_text()
dat = blechpy.port_in_dataset()
# or
dat = blechpy.port_in_dataset('/path/to/recording/directory')
exp = blechpy.experiment('/path/to/dir/encasing/recordings')
# or
exp = blechpy.experiment()
This will initialize an experiment with all recording folders within the chosen directory.
exp.add_recording('/path/to/new/recording/dir/') # Add recording
exp.remove_recording('rec_label') # remove a recording dir
Recordings are assigned labels when added to the experiment, and these labels can be used to easily reference them.
exp.detect_held_units()
Uses raw waveforms from sorted units to determine if units can be confidently classified as "held". Results are stored in exp.held_units as a pandas DataFrame. This also creates plots and exports data to a created directory: /path/to/experiment/experiment-name_analysis
The blechpy.analysis module has a lot of useful tools for analyzing your data.
Most notable is the blechpy.analysis.poissonHMM module, which will allow fitting of the HMM models to your data. See tutorials.