Python wrapper for GENESIS web service interface (API) of the Federal Statistical Office.
pystatis
pystatis is a Python wrapper for the different GENESIS web service interfaces (API). Currently we support the following databases: GENESIS-Online, Regionaldatenbank and Zensus.
The main features are:
To learn more about GENESIS, please refer to the official documentation here.
The full documentation of the main and dev branches is hosted via GitHub Pages (main) and GitHub Pages (dev).
The new Zensus has finally been published. However, old credentials are no longer valid and the base URL has changed, too. If you have worked with pystatis and the Zensus database before, you need to update your config. You can do so in two ways:
1. Create a new config: run pystatis.config.delete_config(), then import pystatis again, which creates a new default config. Afterwards, run pystatis.setup_credentials().
2. Update your config: open config.ini and change both username and password as well as the base_url for the Zensus database.
You can test your changes by calling pystatis.logincheck("zensus").
The Zensus database now also supports authentication with an API token instead of the classic username and password.
You can get your token from the Webservice interface info page while you are logged in.
There is no extra field or parameter for the token, so you pass it as your username and leave the password blank. This means pystatis already supports token authentication without any further changes.
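As a minimal sketch of the manual update path, the snippet below rewrites the Zensus entries of a config.ini with Python's standard configparser module. The helper name update_zensus_credentials and the section name "zensus" are assumptions for illustration only; check your own config.ini for the actual section name. The keys username, password and base_url are the ones named above, and passing the token as username with an empty password follows the token rule described above.

```python
import configparser
from pathlib import Path


def update_zensus_credentials(config_path, username, password="", base_url=None, section="zensus"):
    """Hypothetical helper: rewrite credentials in a pystatis-style config.ini.

    The section name "zensus" is an assumption, not taken from pystatis itself.
    """
    path = Path(config_path)
    # interpolation=None keeps "%" characters in tokens or URLs untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the original case of option names
    parser.read(path, encoding="utf-8")

    if not parser.has_section(section):
        parser.add_section(section)

    parser.set(section, "username", username)
    parser.set(section, "password", password)  # stays empty when a token is used
    if base_url is not None:
        parser.set(section, "base_url", base_url)

    with path.open("w", encoding="utf-8") as handle:
        parser.write(handle)
    return parser
```

Note that configparser drops comments when it rewrites a file, so editing config.ini by hand may be preferable if yours is annotated. Afterwards, pystatis.logincheck("zensus") can be used to verify the result.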
You can install the package via
pip install pystatis
If everything worked out correctly, you should be able to import pystatis like this:
import pystatis
print("Version: ", pystatis.__version__)
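If you prefer to check the installation without importing the package, a small sketch using the standard library's importlib.metadata could look like this. The helper name installed_version is hypothetical and not part of pystatis.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("pystatis")
if version is None:
    print("pystatis is not installed in this environment")
else:
    print("Version:", version)
```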
To be able to use the web service/API of either GENESIS-Online, Regionaldatenbank or Zensus, you have to be a registered user of the respective database. You can register on the website of each database.
Once you have a registered user, you can use your username and password as credentials for authentication against the web service/API.
You can use pystatis with only one of the supported databases or with all of them; it is simply a matter of providing the right credentials. pystatis will only use databases for which you have provided credentials.
Please follow this guide to set up pystatis correctly.
All APIs provide a helloworld endpoint that can be used to check your credentials:
from pystatis.helloworld import logincheck
logincheck("genesis")
If everything worked out, your setup is complete and you can start downloading data.
For more details, please study the provided example notebooks.
The Genesis data structure consists of multiple elements as summarized in the image below.
This package currently supports retrieving the following data types:
pystatis offers the Find class to search for any piece of information within each database. Behind the scenes, it uses the find endpoint.
Example:
from pystatis import Find
results = Find("Rohöl", "genesis") # Initiates object that contains all variables, statistics, tables and cubes
results.run() # Runs the query
results.tables.df # Results for tables
results.tables.get_code([1,2,3]) # Gets the table codes, e.g. for downloading the table
results.tables.get_metadata([1,2]) # Gets the metadata for the table
A complete overview of all use cases is provided in the example notebook for find.
At the moment, the package only supports the download of Tables.
Example for downloading a Table:
from pystatis import Table
t = Table(name="21311-0001") # data is not yet downloaded
# Only now is the data either fetched from GENESIS or loaded from the cache.
# Downloaded data is cached, so the next call loads it from the cache.
# The default language is German; it can be set to German ("de") or
# English ("en") using the language parameter of get_data().
t.get_data()
t.data # prettified data stored as pandas DataFrame
For more details, please study the provided sample notebook for tables.
When a table is queried, it will be put into cache automatically. The cache can be cleared using the following function:
from pystatis import clear_cache
clear_cache("21311-0001") # only deletes the data for the object with the specified name
clear_cache() # deletes the complete cache
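To make the caching behaviour easier to picture, here is a small, self-contained sketch of a fetch-or-load disk cache with the same two ways of clearing it (one entry by name, or everything). It is an illustration of the general pattern only, not the pystatis implementation; the class SimpleDiskCache, its JSON file layout and the fake_fetch function are all assumptions made up for this example.

```python
import json
import tempfile
from pathlib import Path


class SimpleDiskCache:
    """Illustrative fetch-or-load cache; not how pystatis stores its data."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, name):
        return self.directory / f"{name}.json"

    def get(self, name, fetch):
        """Return cached data for name, calling fetch(name) only on a cache miss."""
        path = self._path(name)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        data = fetch(name)
        path.write_text(json.dumps(data), encoding="utf-8")
        return data

    def clear(self, name=None):
        """Delete one entry if name is given, otherwise every cached entry."""
        if name is not None:
            targets = [self._path(name)]
        else:
            targets = list(self.directory.glob("*.json"))
        for target in targets:
            target.unlink(missing_ok=True)


def fake_fetch(name):
    # Stand-in for a real download; returns placeholder data.
    print(f"downloading {name}")
    return {"name": name, "rows": [1, 2, 3]}


cache = SimpleDiskCache(tempfile.mkdtemp())
cache.get("21311-0001", fake_fetch)  # cache miss: fetches and stores the data
cache.get("21311-0001", fake_fetch)  # cache hit: loaded from disk, no download
cache.clear("21311-0001")            # removes only this entry
cache.clear()                        # removes everything
```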
Distributed under the MIT License. See LICENSE.txt for more information.
A few ideas we should implement in the maybe-near future:
Contributions to this project are highly appreciated! You can either contact the maintainers, create an issue or directly create a pull request for your proposed changes:
1. Create a feature branch (git checkout -b feature/<descriptive-name>)
2. Commit your changes (git commit -m 'Added NewFeature')
3. Push the branch (git push origin feature/<descriptive-name>)
To contribute to this project, please follow these steps:
1. Create a virtual environment with conda: run conda create -n pystatis python=3.11. You can choose another Python version as long as it is supported by this package; see pyproject.toml for the supported Python versions.
2. Install poetry: run conda install poetry.
3. Run poetry install to install all dependencies into the current conda environment (run poetry env info to see the details of the current environment). Run poetry install --with dev to also get the additional developer dependencies. Afterwards, poetry has installed all dependencies as well as the package pystatis itself.
4. Install the pre-commit hooks: run poetry run pre-commit install. This activates the pre-commit hooks that run before every commit to ensure code quality.
5. Check out the dev branch and make sure it is up to date by running git pull.
6. Create a new branch with git checkout -b <new-branch> or git switch -c <new-branch>. If possible, add an issue number to the branch name.
7. Run poetry run pytest tests -s -vv --vcr-record=new_episodes to see if all existing tests still pass. It is important to use poetry run to call pytest so that poetry uses the created virtual environment and not the system's default Python interpreter. Alternatively, you can run poetry shell to let poetry activate the virtual environment for the current session. Afterwards, you can run pytest as usual without any prefix. You can leave the poetry shell with the exit command.
8. After adding your own tests, run poetry run pytest tests -s -vv --vcr-record=new_episodes again to make sure they pass as well.
9. Commit your changes. This triggers the pre-commit hooks defined in .pre-commit-config.yaml. If any of these hooks fails, your commit is declined and you have to fix the issues first.
10. Before opening a pull request, switch to dev with git switch dev, run git pull, switch back to your branch with git switch - and either do a git rebase -i dev or git merge dev to get the latest changes into your current working branch. Resolve all merge conflicts.
11. Open a pull request with dev as target.
To learn more about poetry, see Dependency Management With Python Poetry by realpython.com.
Documentation can also be built locally. Make sure that pandoc is installed, e.g. via conda install pandoc, and then run
cd docs && make clean && make html
from the project root directory. Besides providing parsed docstrings of the individual package modules, the full documentation currently mirrors most of the README, such as installation and usage. The mirroring crucially relies on the names of the section headers in the README, so change them with care!
More information on how to use Sphinx is provided here.