[!WARNING]
This package is in its early development stages. Its functionality and API will change.
Stay tuned for updates and documentation, and please share your feedback by opening issues in this repository or by starting a discussion in the IDC User forum.
idc-index is a Python package that enables basic operations for working with NCI Imaging Data Commons (IDC).
Install the latest version of the package.
$ pip install --upgrade idc-index
Instantiate IDCClient, which provides the interface for the main operations.
from idc_index import IDCClient
client = IDCClient.client()
You can use IDC Portal to browse collections, cases, studies and series, copy their identifiers and download the corresponding files using idc-index helper functions.
You can try this out with the rider_pilot collection, which is just 10.5 GB in size:
client.download_from_selection(collection_id="rider_pilot", downloadDir=".")
... or run queries against the "mini" index of Imaging Data Commons data, and download images that match your selection criteria! The following will select all Magnetic Resonance (MR) series, and will download the first 10.
from idc_index import index
client = index.IDCClient()
query = """
SELECT
    SeriesInstanceUID
FROM
    index
WHERE
    Modality = 'MR'
"""
selection_df = client.sql_query(query)
client.download_from_selection(
    seriesInstanceUID=list(selection_df["SeriesInstanceUID"].values[:10]),
    downloadDir=".",
)
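The query above fetches every MR series and then slices the first 10 in pandas. As a minimal sketch, the same limit can be put into the SQL itself. The helper `build_series_query` below is hypothetical (it is not part of idc-index), and it assumes that `sql_query` accepts a standard `LIMIT` clause, which this README does not confirm.

```python
import re


def build_series_query(modality: str, limit: int = 10) -> str:
    """Return a SQL string selecting up to `limit` series of one modality.

    Hypothetical helper for illustration; not part of idc-index.
    """
    # DICOM modality codes are short uppercase alphanumeric strings
    # (e.g. MR, CT). Validating the value avoids pasting arbitrary
    # text into the SQL statement.
    if not re.fullmatch(r"[A-Z0-9]+", modality):
        raise ValueError(f"unexpected modality code: {modality!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Assumption: the query engine behind sql_query supports LIMIT.
    return (
        "SELECT SeriesInstanceUID "
        "FROM index "
        f"WHERE Modality = '{modality}' "
        f"LIMIT {int(limit)}"
    )


query = build_series_query("MR", limit=10)
print(query)
```

If the assumption holds, the resulting string can be passed to `client.sql_query(query)` as in the example above, and the whole `SeriesInstanceUID` column can then go to `download_from_selection` without slicing.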
indices of idc-index
idc-index is named this way because it wraps indices of IDC data: tables containing the most important metadata attributes describing the files available in IDC. The main metadata index is available in the index variable (which is a pandas DataFrame) of IDCClient. Additional index tables are also available: clinical_index contains non-DICOM clinical data, and the slide microscopy specific tables (indicated by the prefix sm) include metadata attributes specific to slide microscopy images. A description of available attributes for all indices can be found here.
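Because the main index is a pandas DataFrame, it can also be filtered with ordinary pandas operations instead of SQL. The sketch below uses a small stand-in DataFrame with made-up placeholder rows, not real IDC data. The `SeriesInstanceUID` and `Modality` columns come from the query shown earlier; treating `collection_id` as a column of the index is an assumption. With the package installed, `client.index` would take the place of `index_df`.

```python
import pandas as pd

# Stand-in for client.index: made-up rows with placeholder identifiers.
# "collection_id" as a column name is an assumption for illustration.
index_df = pd.DataFrame(
    {
        "collection_id": ["rider_pilot", "rider_pilot", "example_collection"],
        "SeriesInstanceUID": ["1.2.3.1", "1.2.3.2", "1.2.3.3"],
        "Modality": ["MR", "CT", "MR"],
    }
)

# Same selection as the SQL example: all MR series, at most the first 10.
mr_series = index_df.loc[index_df["Modality"] == "MR", "SeriesInstanceUID"]
selected_uids = mr_series.head(10).tolist()
print(selected_uids)

# Count distinct series per collection.
series_per_collection = (
    index_df.groupby("collection_id")["SeriesInstanceUID"].nunique()
)
print(series_per_collection.to_dict())
```

A list such as `selected_uids` has the same shape as the `seriesInstanceUID` argument passed to `download_from_selection` in the example above.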
Please check out this tutorial notebook for an introduction to using idc-index.
idc-index for search and download of IDC data
This software is maintained by the IDC team, which has been funded in whole or in part with Federal funds from the NCI, NIH, under task order no. HHSN26110071 under contract no. HHSN261201500003l.
If this package helped your research, we would appreciate it if you could cite the IDC paper below.
Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics (2023). https://doi.org/10.1148/rg.230180