
A package to download, load, and process multiple benchmark multi-omic drug response datasets
There has been a recent explosion of deep learning algorithms that tackle the computational problem of predicting drug treatment outcome from baseline molecular measurements. To support this, we have built a benchmark dataset that harmonizes diverse datasets to better assess algorithm performance.
This package collects diverse sets of paired molecular datasets with corresponding drug sensitivity data. All data here are reprocessed and standardized so they can be easily used as a benchmark dataset. This repository leverages existing datasets to collect the data required for deep learning model development. Since each deep learning model requires distinct data capabilities, the goal of this repository is to collect and format all data into a schema that can be leveraged for existing models.
The goal of this repository is two-fold: First, it aims to collate and standardize the data for the broader community. This requires running a series of scripts to build and append to a standardized data model. Second, it has a series of scripts that pull from the data model to create model-specific data files that can be run by the data infrastructure.
To access the latest version of CoderData, please visit our documentation site, which provides access to Figshare and instructions for using the Python package to download the data.
All CoderData files are in text format, either comma-delimited or tab-delimited (depending on data type). Each dataset can be evaluated individually according to the CoderData schema, which is maintained in LinkML and can be updated via a commit to the repository. For more details, please see the schema description.
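Because the files are plain delimited text, they can be read with the Python standard library alone. The following is a minimal sketch, not the package's own loader: the function name `read_table` is hypothetical, and it assumes that a `.tsv` extension marks a tab-delimited file and anything else is comma-delimited, which the text above does not specify.

```python
import csv
from pathlib import Path


def read_table(path):
    """Read a comma- or tab-delimited text file into a list of dict rows.

    Assumption: the delimiter is inferred from the file extension
    (.tsv -> tab, anything else -> comma). This convention is illustrative.
    """
    path = Path(path)
    delimiter = "\t" if path.suffix.lower() == ".tsv" else ","
    with path.open(newline="", encoding="utf-8") as handle:
        # DictReader uses the first row as column names.
        return list(csv.DictReader(handle, delimiter=delimiter))
```

All values come back as strings; any numeric conversion would need to follow the column types defined in the schema.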
The build process can be found in our build directory. Here you can follow the instructions to build your own local copy of the data on your machine.
We have standardized the build process so an additional dataset can be built locally or as part of the next version of coder. Here are the steps to follow:
First visit the build directory and ensure you can build a local copy of CoderData.
Check out this repository and create a subdirectory of the build directory with your own build files.
Develop your scripts to build the data files according to our LinkML Schema. This will require collecting the following metadata:
genes.csv file
You can validate each file by using the LinkML validator together with our schema file.
You can use the following scripts as part of your build process:
Drug table.

shell script | arguments | description |
---|---|---|
build_samples.sh | [latest_samples] | Latest version of samples generated by coderdata build |
build_omics.sh | [gene file] [samplefile] | The genes.csv that was generated in the original build, as well as the sample file generated above |
build_drugs.sh | [drugfile1,drugfile2,...] | A comma-delimited list of all drug files generated from the previous build |
build_exp.sh | [samplefile] [drugfile] | Sample file and drug file generated by the previous scripts |
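The argument conventions in the table can be sketched as a small Python helper that assembles the four invocations in order. This is an illustration only and not part of the repository: the function names and every file name passed to them are hypothetical, and the scripts are assumed to be runnable with `bash` from the current directory.

```python
import subprocess


def build_commands(genes_file, samples_file, prev_drug_files, drug_file,
                   prev_samples=None):
    """Return the four wrapper-script calls in the order given in the table.

    samples_file and drug_file are the outputs of build_samples.sh and
    build_drugs.sh; prev_* are files from an earlier build.
    """
    samples_cmd = ["bash", "build_samples.sh"]
    if prev_samples:
        # Optional argument: latest samples file from a previous build.
        samples_cmd.append(prev_samples)
    return [
        samples_cmd,
        ["bash", "build_omics.sh", genes_file, samples_file],
        # Previous drug files are passed as one comma-delimited argument.
        ["bash", "build_drugs.sh", ",".join(prev_drug_files)],
        ["bash", "build_exp.sh", samples_file, drug_file],
    ]


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution makes it possible to inspect or log the calls before anything is run.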
Put the Docker container file inside the Docker directory with the name Dockerfile.[datasetname].
Run build_all.py from the root directory, which should now add your Dockerfile to the mix and call the scripts in your Docker container to build the files.
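To check that a new Dockerfile follows the `Dockerfile.[datasetname]` naming pattern before running the build, a short Python sketch like the one below can list the dataset names found in a directory. This is not how build_all.py works internally, which the text above does not describe; the function name `dataset_names` and the directory path passed to it are hypothetical.

```python
from pathlib import Path


def dataset_names(docker_dir):
    """List dataset names from files named Dockerfile.<datasetname>."""
    names = []
    for path in Path(docker_dir).glob("Dockerfile.*"):
        # Everything after the first dot is treated as the dataset name.
        suffix = path.name.split(".", 1)[1]
        if path.is_file() and suffix:
            names.append(suffix)
    return sorted(names)
```

A file named plain `Dockerfile` with no suffix is not matched by the pattern and is therefore ignored.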