|PyPI Version| |Build Status| |Coverage Status|
This plugin manages how data are stored in DCOR. There are two types of
files in DCOR:
- Resources uploaded by users, imported from figshare, or
  imported from a data archive
- Ancillary files that are generated upon resource creation, such as
  condensed DC data or preview images (see
  ckanext-dc_view <>)
This plugin implements:
Data storage management. All resources uploaded by a user are moved
and symlinks are created in /data/ckan-HOSTNAME/resources/RES/OUR/CEID
via a background job.
CKAN itself will not notice this. The idea is to have a filesystem overview
about the datasets of each user.
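The RES/OUR/CEID part of the symlink path can be illustrated with a short Python sketch. It assumes that the resource ID is split into its first three characters, the next three characters, and the remainder; the function name and its arguments are hypothetical and not part of this plugin.

```python
import pathlib
import socket


def resource_symlink_path(resource_id, hostname=None, root="/data"):
    """Return the RES/OUR/CEID path for a resource ID (illustrative only).

    Assumption: the ID is split into its first three characters, the
    next three characters, and the remainder.
    """
    if len(resource_id) < 7:
        raise ValueError("resource ID too short for a RES/OUR/CEID split")
    # Hypothetical default: use the local hostname for ckan-HOSTNAME
    hostname = hostname or socket.gethostname()
    return (pathlib.PurePosixPath(root) / f"ckan-{hostname}" / "resources"
            / resource_id[:3] / resource_id[3:6] / resource_id[6:])


path = resource_symlink_path("a1b2c3d4-e5f6-7890-abcd-ef1234567890",
                             hostname="dcor")
print(path)
# /data/ckan-dcor/resources/a1b/2c3/d4-e5f6-7890-abcd-ef1234567890
```

A path built this way only describes where a symlink would be expected; the sketch does not touch the filesystem.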
A background job that uploads resources to S3 in after_resource_create
if the resources were uploaded via the legacy upload route.
A background job that backs up resources from S3 to local block storage
if the resources were uploaded via the S3 upload route.
Import datasets from figshare. Existing datasets from figshare are
downloaded to the /data/depots/figshare directory and, upon resource
creation, symlinked there from /data/ckan-HOSTNAME/resources/RES/OUR/CEID
(note that this is an exception to the data storage management described
above). When running the following command, the "figshare-import" organization
is created and the datasets listed in figshare_dois.txt
are added to CKAN:
ckan import-figshare
CLI for symlinking datasets that have failed to symlink before:
ckan run-jobs-dcor-depot
CLI for appending a resource to a dataset:
ckan append-resource /path/to/file dataset_id --delete-source
Please make sure that the necessary file permissions are set in /data.
In 2023, it was decided that the huge block storage of DCOR
should be replaced with an S3-compatible object store, because block storage
does not scale well. This partially deprecates some of the commands above,
which might be removed or modified to support object storage directly.
CLI for migrating data from block storage to an S3-compatible object storage
service. For this, the following configuration keys must be specified in
ckan.ini:
dcor_object_store.access_key_id = ACCESS_KEY_ID
dcor_object_store.secret_access_key = SECRET_ACCESS_KEY
dcor_object_store.endpoint_url = S3_ENDPOINT_URL
dcor_object_store.ssl_verify = true
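Before running the migration, it can help to check that all four keys are present. The following is a minimal sketch using the standard library configparser; it assumes that the keys live in the [app:main] section of ckan.ini, and the helper name missing_object_store_keys is hypothetical.

```python
import configparser

REQUIRED_KEYS = [
    "dcor_object_store.access_key_id",
    "dcor_object_store.secret_access_key",
    "dcor_object_store.endpoint_url",
    "dcor_object_store.ssl_verify",
]


def missing_object_store_keys(ini_text, section="app:main"):
    """Return the required keys that are absent from the given section."""
    # interpolation=None so that "%" characters in values are not expanded
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS
            if not parser.has_option(section, key)]


# Example with two of the four keys missing
example_ini = """
[app:main]
dcor_object_store.access_key_id = ACCESS_KEY_ID
dcor_object_store.endpoint_url = S3_ENDPOINT_URL
"""
print(missing_object_store_keys(example_ini))
```

The check only looks for the presence of each key; it does not verify the credentials or contact the endpoint.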
By default, the bucket name is derived from the circle ID. Resources
are stored following the "RES/OUR/CEID" scheme in that bucket.
dcor_object_store.bucket_name = circle-{organization_id}
ckan dcor-migrate-resources-to-object-store
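How a bucket name and an object name could be derived from the IDs is sketched below. This is an illustration under assumptions, not the plugin's code: it assumes that the bucket_name template is expanded with str.format-style substitution, that the resource ID is split as 3/3/rest, and that no additional prefix is prepended to the object name. Both function names are hypothetical.

```python
def bucket_name_for(organization_id, template="circle-{organization_id}"):
    """Expand the bucket name template for one circle (organization)."""
    return template.format(organization_id=organization_id)


def object_key_for(resource_id):
    """Build a RES/OUR/CEID-style object key (assumed 3/3/rest split)."""
    if len(resource_id) < 7:
        raise ValueError("resource ID too short for a RES/OUR/CEID split")
    return f"{resource_id[:3]}/{resource_id[3:6]}/{resource_id[6:]}"


# Made-up IDs for illustration
org_id = "0f1e2d3c-4b5a-6978-8796-a5b4c3d2e1f0"
res_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
print(bucket_name_for(org_id))
# circle-0f1e2d3c-4b5a-6978-8796-a5b4c3d2e1f0
print(object_key_for(res_id))
# a1b/2c3/d4-e5f6-7890-abcd-ef1234567890
```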
pip install ckanext-dcor_depot
Add this extension to the plugins and default_views in ckan.ini:
ckan.plugins = [...] dcor_depot
This plugin stores resources in /data:
mkdir -p /data/depots/users-$(hostname)
chown -R www-data /data/depots/users-$(hostname)
If CKAN/DCOR is installed and set up for testing, this extension can
be tested with pytest:
pytest ckanext
Testing can also be done via vagrant in a virtual machine using the
dcor-test <>
Make sure that vagrant and virtualbox are installed and run the
following commands in the root of this repository:
# Setup virtual machine using `Vagrantfile`
vagrant up
# Run the tests
vagrant ssh -- sudo bash /testing/
.. |PyPI Version| image::
.. |Build Status| image::
.. |Coverage Status| image::