KUDAF Datasource CLI Tools
This is a set of Command Line Interface (CLI) tools to facilitate the technical tasks required of Data Providers that want to make their data available on the KUDAF data-sharing platform.
It was developed by Sikt - Kunnskapssektorens tjenesteleverandør under the KUDAF initiative to enable a Data Producer to make small-file data available on the KUDAF data-sharing platform.
The CLI can create the following Kudaf Data Source components:
- Variables Metadata, and/or
- REST API Datasource back-end (including variables metadata and possibly even the data files themselves)
About KUDAF
KUDAF (Kunnskapssektorens datafelleskap, the knowledge sector's data commons) is meant to ensure safer, simpler and better sharing of data. Read more about KUDAF.
High-level workflow for Data Source administrators (Beta version)
From data producer to data provider
Feide Customer Portal - Data Sharing (Norwegian)
Data Sharing and the Feide Customer Portal
Feide - Data Provider How-to
Local installation instructions (Linux/Mac)
Make sure Python3 is installed on your computer (versions from 3.8 up to 3.11 should work fine)
$ python3 --version
Navigate to the folder chosen to contain this project
$ cd path/to/desired/folder
Create a Python virtual environment and activate it
$ python3 -m venv .venv
This created the virtualenv under the hidden folder .venv
Activate it with:
$ source .venv/bin/activate
Install Kudaf Datasource Tools and other required Python packages
$ pip install kudaf_datasource_tools
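The installation steps above can be sanity-checked from Python itself. The following is a minimal sketch and not part of the package: the helper names (`python_supported`, `in_virtualenv`, `installed_version`) are hypothetical, and the 3.8 to 3.11 range is simply the one quoted in the instructions above.

```python
import sys
from importlib import metadata


def python_supported(version_info=sys.version_info):
    """Return True if the interpreter falls in the 3.8-3.11 range noted above."""
    return (3, 8) <= tuple(version_info[:2]) <= (3, 11)


def in_virtualenv():
    """Return True when running inside a virtual environment created by venv."""
    return sys.prefix != sys.base_prefix


def installed_version(distribution="kudaf_datasource_tools"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("Virtual environment active:", in_virtualenv())
    print("kudaf_datasource_tools version:", installed_version())
```

Run it with the virtual environment activated; a version of `None` suggests the `pip install` step did not target this environment.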
Creating a YAML configuration file
Click here for a basic YAML syntax tutorial
Example YAML configuration file
The following file is included in the package and can be found in the datasource_tools/config folder:
config_example.yaml
---
projectInfo:
  organization: "my-short-organizations-name"
  datasourceName: "my-FeideKundeportal-Datasource-name"
  datasourceId: "my-FeideKundeportal-Datasource-UUID"
dataFiles:
  - dataFile: &mydatafile
      fileNameExt: mydatafile.csv
      csvParseDelimiter: ";"
      fileDirectory: /path/to/my/datafiles/directory
unitTypes:
  - MIN_ENHETSTYPE1: &min_enhetstype1
      shortName: MIN_ENHETSTYPE1
      name: Kort identifikasjonsetikett
      description: Detaljert beskrivelse av denne enhetstypen
      dataType: LONG
  - MIN_ENHETSTYPE2: &min_enhetstype2
      shortName: MIN_ENHETSTYPE2
      name: Kort identifikasjonsetikett
      description: Detaljert beskrivelse av denne enhetstypen
      dataType: LONG
variables:
  - name: VARIABELENS_NAVN
    temporalityType: FIXED
    dataRetrievalUrl: https://my-datasource-api.no/api/v1/variables/VARIABELENS_NAVN
    sensitivityLevel: NONPUBLIC
    populationDescription:
      - Beskrivelse av populasjonen som denne variabelen måler
    spatialCoverageDescription:
      - Norge
      - Annen geografisk beskrivelse som gjelder disse dataene
    subjectFields:
      - Temaer/konsepter/begreper som disse dataene handler om
    identifierVariables:
      - unitType: *min_enhetstype1
    measureVariables:
      - label: Kort etikett på hva denne variabelen måler/viser
        description: Detaljert beskrivelse av hva denne variabelen måler/viser
        dataType: STRING
    dataMapping:
      dataFile: *mydatafile
      identifierColumns:
        - Min_Identificatorkolonne
      measureColumns:
        - Min_Målkolonnen
  - name: VARIABELENS_NAVN_ACCUM
    temporalityType: ACCUMULATED
    dataRetrievalUrl: https://my-datasource-api.no/api/v1/variables/VARIABELENS_NAVN
    sensitivityLevel: NONPUBLIC
    populationDescription:
      - Beskrivelse av populasjonen som denne variabelen måler
    spatialCoverageDescription:
      - Norge
      - Annen geografisk beskrivelse som gjelder disse dataene
    subjectFields:
      - Temaer/konsepter/begreper som disse dataene handler om
    identifierVariables:
      - unitType: *min_enhetstype2
    measureVariables:
      - label: Kort etikett på hva denne variabelen måler/viser
        description: Detaljert beskrivelse av hva denne variabelen måler/viser
        dataType: LONG
    dataMapping:
      dataFile: *mydatafile
      identifierColumns:
        - Min_Identificatorkolonne
      measureColumns:
        - Min_Målkolonnen
      measureColumnsAccumulated: False
      attributeColumns:
        - Start_Time
        - End_Time
  - name: NØKKELVAR_ID-NØKKEL_MÅLE-NOKKEL
    temporalityType: FIXED
    dataRetrievalUrl: https://my-datasource-api.no/api/v1/variables/NØKKELVAR_ID-NØKKEL_MÅLE-NOKKEL
    sensitivityLevel: PUBLIC
    populationDescription:
      - Beskrivelse av populasjonen som denne variabelen måler
    spatialCoverageDescription:
      - Norge
      - Annen geografisk beskrivelse som gjelder disse dataene
    subjectFields:
      - Temaer/konsepter/begreper som disse dataene handler om
    identifierVariables:
      - unitType: *min_enhetstype1
    measureVariables:
      - label: Kort etikett på hva denne variabelen måler/viser
        description: Detaljert beskrivelse av hva denne variabelen måler/viser
        unitType: *min_enhetstype2
        dataType: LONG
    dataMapping:
      dataFile: *mydatafile
      identifierColumns:
        - Min_Identificatorkolonne
      measureColumns:
        - Min_Målkolonnen
...
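A `dataMapping` entry is only useful if the columns it names actually exist in the data file. The sketch below is a hypothetical pre-flight check, not something the CLI provides: it reads the header row of a CSV using the configured `csvParseDelimiter` and reports which expected columns are missing. It assumes the first row of the data file is a header row, which this document does not state explicitly; `missing_columns` and the sample data are illustrative only.

```python
import csv
import io


def missing_columns(csv_text, delimiter, expected_columns):
    """Return the expected column names that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])
    present = {name.strip() for name in header}
    return [name for name in expected_columns if name not in present]


# Hypothetical sample mirroring the example config: ';' as csvParseDelimiter,
# one identifier column and one measure column.
sample = "Min_Identificatorkolonne;Min_Målkolonnen\n1001;42\n1002;17\n"
mapping = {
    "identifierColumns": ["Min_Identificatorkolonne"],
    "measureColumns": ["Min_Målkolonnen"],
}
wanted = mapping["identifierColumns"] + mapping["measureColumns"]
print(missing_columns(sample, ";", wanted))
```

An empty list means every mapped column was found in the header.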
Kudaf Datasource Tools CLI operation
Navigate to the project directory and activate the virtual environment (only if not already activated):
$ source .venv/bin/activate
The kudaf-generate command should now be available. This is the main entry point to the CLI's functionality.
$ kudaf-generate --help
Usage: kudaf-generate [OPTIONS] COMMAND [ARGS]...
Kudaf Datasource Tools
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion Install completion for the current shell. │
│ --show-completion Show completion for the current shell, to copy it or customize the installation. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ api Generate a Kudaf Datasource REST API back-end │
│ metadata Generate Variables/UnitTypes Metadata │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
As we can see, there are two sub-commands available: api and metadata.
We can obtain help on them as well:
$ kudaf-generate api --help
Usage: kudaf-generate api [OPTIONS]
Generate a Kudaf Datasource REST API back-end
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --config-yaml-path PATH Absolute path to the YAML configuration file │
│ [default: /home/me/current/directory/config.yaml] │
│ --input-data-files-dir PATH Absolute path to the data files directory │
│ [default: /home/me/current/directory] │
│ --output-api-dir PATH Absolute path to directory where the Datasource API folder is to be written │
│ to │
│ [default: /current/directory] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
$ kudaf-generate metadata --help
Usage: kudaf-generate metadata [OPTIONS]
Generate Variables/UnitTypes Metadata
JSON metadata files ('variables.json' and maybe 'unit_types.json') will be written to the
(optionally) given output directory.
If any of the optional directories is not specified, the current directory is used as default.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --config-yaml-path PATH Absolute path to the YAML configuration file │
│ [default: /home/me/current/directory/config.yaml] │
│ --output-metadata-dir PATH Absolute path to directory where the Metadata files are to be written to │
│ [default: /home/me/current/directory] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Generating metadata only from a YAML configuration file
$ kudaf-generate metadata --config-yaml-path /home/me/path/to/config.yaml --output-metadata-dir /home/me/path/to/metadata/folder
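Once the command has run, the result can be inspected with a few lines of Python. This sketch relies only on what the help text states, namely that `variables.json` and possibly `unit_types.json` are written to the output directory; it makes no assumption about the structure of the JSON inside them. `load_metadata` is a hypothetical helper name, and the path in the example is the placeholder from the command above.

```python
import json
from pathlib import Path

METADATA_FILES = ("variables.json", "unit_types.json")


def load_metadata(output_dir):
    """Parse whichever of the expected metadata files exist in output_dir."""
    loaded = {}
    for filename in METADATA_FILES:
        path = Path(output_dir) / filename
        if path.is_file():
            with path.open(encoding="utf-8") as handle:
                loaded[filename] = json.load(handle)
    return loaded


if __name__ == "__main__":
    # Placeholder path; replace it with your own output directory.
    found = load_metadata("/home/me/path/to/metadata/folder")
    for filename in METADATA_FILES:
        print(filename, "found" if filename in found else "not found")
```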
Generating an API
$ kudaf-generate api --config-yaml-path /home/me/path/to/config.yaml --output-api-dir /home/me/path/to/api/folder
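Because the help text asks for absolute paths, it can be convenient to assemble the command from Python and resolve relative paths first. This is a minimal sketch that uses only the sub-commands and options listed in the help output; `build_command` is a hypothetical helper, and the relative paths in the example are placeholders.

```python
from pathlib import Path

# Output-directory option for each sub-command, as listed in the help text.
OUTPUT_OPTION = {"metadata": "--output-metadata-dir", "api": "--output-api-dir"}


def build_command(subcommand, config_yaml, output_dir):
    """Build a kudaf-generate argument list with absolute paths."""
    if subcommand not in OUTPUT_OPTION:
        raise ValueError(f"unknown sub-command: {subcommand}")
    return [
        "kudaf-generate",
        subcommand,
        "--config-yaml-path",
        str(Path(config_yaml).resolve()),
        OUTPUT_OPTION[subcommand],
        str(Path(output_dir).resolve()),
    ]


if __name__ == "__main__":
    command = build_command("api", "config.yaml", "generated_api")
    print(" ".join(command))
    # To execute it, pass the list to subprocess.run(command, check=True)
    # from within the activated virtual environment.
```

Keeping the arguments as a list rather than one string avoids quoting problems when a path contains spaces.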
Local API launch
Navigate to the folder containing the generated API:
$ cd /home/me/path/to/api/folder
Create a Python virtual environment and activate it
$ python3 -m venv .venv
This created the virtualenv under the hidden folder .venv
Activate it with:
$ source .venv/bin/activate
Install needed Python packages
$ pip install -r requirements.txt
Launch the Kudaf Datasource API
$ uvicorn app.main:app
Browser: Navigate to the API's interactive documentation at:
http://localhost:8000/docs
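The same check can be scripted instead of done in a browser. The following is a minimal sketch using only the standard library; it assumes the API was started with the `uvicorn` command above and is listening on localhost port 8000, as the URL above indicates. `docs_reachable` is a hypothetical helper.

```python
import urllib.error
import urllib.request


def docs_reachable(url="http://localhost:8000/docs", timeout=3.0):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts and HTTP error responses.
        return False


if __name__ == "__main__":
    print("Interactive docs reachable:", docs_reachable())
```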
Docker API launch (Linux)
Install Docker Desktop
Follow the instructions for your operating system at https://docs.docker.com/desktop/
Navigate to the folder containing the generated API:
$ cd /home/me/path/to/api/folder
Create a Docker image
$ sudo docker build -t my-kudaf-datasource-image .
(Note: The final . in the above command is important! Make sure it is entered exactly like that.)
Create and launch a Docker container for the generated image
$ sudo docker run -d --name my-kudaf-datasource-container -p 9000:8000 my-kudaf-datasource-image
Browser: Navigate to the API's interactive documentation at:
http://localhost:9000/docs
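The port in the URL changes from 8000 to 9000 because of the `-p 9000:8000` option: Docker publishes the container's port 8000 on the host's port 9000. The small sketch below only illustrates how such a mapping string relates to the URL you open; `docs_url_for_mapping` is a hypothetical helper and handles only the simple `HOST_PORT:CONTAINER_PORT` form.

```python
def docs_url_for_mapping(port_mapping, host="localhost"):
    """Return the docs URL on the host for a 'HOST_PORT:CONTAINER_PORT' mapping."""
    # Both sides must be integers; only the host port appears in the URL.
    host_port, _container_port = (int(part) for part in port_mapping.split(":"))
    return f"http://{host}:{host_port}/docs"


print(docs_url_for_mapping("9000:8000"))
```

If port 9000 is already in use on your machine, choosing another host port in the `docker run` command changes the URL accordingly.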
Developers
Download the package to your computer
Option A: Installation from repository:
Open up a Terminal window and clone the repo locally:
$ git clone https://gitlab.sikt.no/kudaf/kudaf-datasource-tools.git
Option B: Installation from source:
- Open up your browser and navigate to the project's GitLab page: https://gitlab.sikt.no/kudaf/kudaf-datasource-tools
- Once there, download a ZIP file with the source code
- Move the zipped file to whichever directory you want to use for this installation
- Open a Terminal window and navigate to the directory where the zipped file is
- Unzip the downloaded file; this creates a folder called kudaf-datasource-tools-main
- Switch to the newly created folder
$ cd path/to/kudaf-datasource-tools-main
Make sure Python3 is installed on your computer (versions from 3.8 up to 3.11 should work fine)
$ python3 --version
Install Poetry (Python package and dependency manager) on your computer
Full Poetry documentation can be found here: https://python-poetry.org/docs/
The official installer should work fine on the command line for Linux, macOS and Windows (WSL):
$ curl -sSL https://install.python-poetry.org | python3 -
If the installation was successful, configure this option:
$ poetry config virtualenvs.in-project true
Mac users: Troubleshooting
In case of errors installing Poetry on your Mac, you may have to try installing it with pipx. But to install that, we need to have Homebrew installed first.
$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
(Homebrew documentation: https://brew.sh/)
Once Homebrew is installed, proceed to install pipx:
$ brew install pipx
$ pipx ensurepath
Finally, install Poetry:
$ pipx install poetry
Create a Python virtual environment and activate it
$ python3 -m venv .venv
This created the virtualenv under the hidden folder .venv
Activate it with:
$ source .venv/bin/activate
Install Kudaf Datasource Tools and other required Python packages
$ poetry install