istatapi

Python API for ISTAT (The Italian National Institute of Statistics)



istatapi is a Python interface to discover and retrieve data from the ISTAT API (ISTAT is the Italian National Institute of Statistics). The library is designed to explore ISTAT metadata and to retrieve data in different formats. istatapi is built on top of the ISTAT SDMX RESTful API.

Whether you are an existing organization, a curious individual or an academic researcher, istatapi aims to allow you to easily access ISTAT databases with just a few lines of code. The library implements functions to:

  • Explore all available ISTAT datasets (dataflows in SDMX terminology)
  • Search available datasets by keywords
  • Retrieve information on a specific dataset, such as the ID of the dataflow, the names and available values of its dimensions, and the available filters
  • Get the data of an available dataset as a pandas DataFrame, or in CSV or JSON format

Install

You can easily install the library by using the pip command:

pip install istatapi

Tutorial

First, let’s simply import the modules we need:

from istatapi import discovery, retrieval
import matplotlib.pyplot as plt

With istatapi we can search through all the available datasets by simply using the following function:

discovery.all_available()
    df_id     version  df_description                                              df_structure_id
0   101_1015  1.3      Crops                                                       DCSP_COLTIVAZIONI
1   101_1030  1.0      PDO, PGI and TSG quality products                           DCSP_DOPIGP
2   101_1033  1.0      slaughtering                                                DCSP_MACELLAZIONI
3   101_1039  1.2      Agritourism - municipalities                                DCSP_AGRITURISMO_COM
4   101_1077  1.0      PDO, PGI and TSG products: operators - municipalities data  DCSP_DOPIGP_COM

You can also search for a specific dataset (in this example, a dataset on imports), by doing:

discovery.search_dataset("import")
    df_id    version  df_description                                        df_structure_id
10  101_962  1.0      Livestock import export                               DCSP_LIVESTIMPEXP
47  139_176  1.0      Import and export by country and commodity Nace 2007  DCSP_COEIMPEX1
49  143_222  1.0      Import price index - monthly data                     DCSC_PREIMPIND
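
To make the idea of a keyword search concrete, here is a minimal sketch of case-insensitive matching on dataset descriptions. It is not how istatapi implements search_dataset; the CATALOG list and the search_catalog helper are hypothetical names introduced only for illustration, and the rows are copied from the tables above.

```python
# Illustrative catalog rows copied from the tables above. This is a stand-in
# for the example, not istatapi's internal data structure.
CATALOG = [
    {"df_id": "101_1015", "df_description": "Crops"},
    {"df_id": "101_962", "df_description": "Livestock import export"},
    {"df_id": "139_176", "df_description": "Import and export by country and commodity Nace 2007"},
    {"df_id": "143_222", "df_description": "Import price index - monthly data"},
]


def search_catalog(keyword, catalog):
    """Return the rows whose description contains `keyword`, ignoring case."""
    needle = keyword.lower()
    return [row for row in catalog if needle in row["df_description"].lower()]


matches = search_catalog("import", CATALOG)
for row in matches:
    print(row["df_id"], row["df_description"])
```

Lower-casing both sides is what lets "import" match "Import price index" as well as "Livestock import export".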

To retrieve data from a specific dataset, we first need to create an instance of the DataSet class. We can pass the df_id, df_description or df_structure_id from the DataFrame above to tell the DataSet class which dataset we want. Here, we use the df_id value. This may take a few seconds to load.

# initialize the dataset and get its dimensions
ds = discovery.DataSet(dataflow_identifier="139_176")

We now want to see what variables are included in the dataset that we are analysing. With istatapi we can easily print its variables (“dimensions” in ISTAT terminology) and their description.

ds.dimensions_info()
   dimension         dimension_ID         description
0  FREQ              CL_FREQ              Frequency
1  MERCE_ATECO_2007  CL_ATECO_2007_MERCE  Commodity Nace 2007
2  PAESE_PARTNER     CL_ISO               Geopolitics
3  ITTER107          CL_ITTER107          Territory
4  TIPO_DATO         CL_TIPO_DATO12       Data type 12

Now, each dimension can have a few possible values. istatapi provides a quick method to analyze these values and print their English descriptions.

dimension = "TIPO_DATO" #use "dimension" column from above
ds.get_dimension_values(dimension)
   values_ids  values_description
0  EV          export - value (euro)
1  TBV         trade balance - value (euro)
2  ISAV        import - seasonally adjusted value - world based model (millions of euro)
3  ESAV        export - seasonally adjusted value - world based model (millions of euro)
4  TBSAV       trade balance - seasonally adjusted value -world based model (millions of euro)
5  IV          import - value (euro)
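
Because filters rely on the exact codes listed under values_ids, it can help to check a selection for typos before requesting data. The snippet below is a small stand-alone sketch; the unknown_codes helper is a hypothetical name and is not part of istatapi, and the code set is copied from the table above.

```python
# values_ids copied from the table above for the TIPO_DATO dimension
TIPO_DATO_CODES = {"EV", "TBV", "ISAV", "ESAV", "TBSAV", "IV"}


def unknown_codes(requested, allowed):
    """Return the requested codes that are not in `allowed`, keeping their order."""
    if isinstance(requested, str):
        requested = [requested]  # accept a single code as well as a list
    return [code for code in requested if code not in allowed]


print(unknown_codes(["ISAV", "ESAV"], TIPO_DATO_CODES))  # []
print(unknown_codes(["ISAV", "ESVA"], TIPO_DATO_CODES))  # ['ESVA']
```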

If we do not filter any of our variables, the data will just include all the possible values in the dataset. This could result in too much data that would slow our code and make it difficult to analyze. Thus, we need to filter our dataset. To do so, we can simply use the values_ids that we found using the function get_dimension_values in the cell above.

Note: Make sure to pass the names of the dimensions in lower case as keyword arguments of the set_filters method. If you want to filter for multiple values, simply pass them as a list.

freq = "M" #monthly frequency
tipo_dato = ["ISAV", "ESAV"] #imports and exports seasonally adjusted data
paese_partner = "WORLD" #trade with all countries

ds.set_filters(freq = freq, tipo_dato = tipo_dato, paese_partner = paese_partner)
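
For readers curious about what such filters mean at the SDMX level: in an SDMX REST data query, the series key lists one position per dimension, separated by dots, with several values joined by "+" and an empty position meaning "all values". The sketch below builds such a key from the filters above. Whether istatapi constructs its requests exactly this way is an assumption, and build_sdmx_key is a hypothetical helper written only for this illustration.

```python
def build_sdmx_key(dimension_order, filters):
    """Build an SDMX REST series key from keyword-style filters.

    dimension_order: dimension names in the order reported for the dataset.
    filters: mapping of lower-case dimension name to one value or a list of values.
    """
    known = {name.lower() for name in dimension_order}
    unknown = set(filters) - known
    if unknown:
        raise ValueError(f"Unknown dimension(s): {sorted(unknown)}")

    parts = []
    for name in dimension_order:
        value = filters.get(name.lower())
        if value is None:
            parts.append("")  # empty position: no restriction on this dimension
        elif isinstance(value, (list, tuple)):
            parts.append("+".join(value))  # several values are combined with "+"
        else:
            parts.append(value)
    return ".".join(parts)


# Dimension order as printed by ds.dimensions_info() above
dimensions = ["FREQ", "MERCE_ATECO_2007", "PAESE_PARTNER", "ITTER107", "TIPO_DATO"]
key = build_sdmx_key(
    dimensions,
    {"freq": "M", "tipo_dato": ["ISAV", "ESAV"], "paese_partner": "WORLD"},
)
print(key)  # M..WORLD..ISAV+ESAV
```

The two empty positions correspond to MERCE_ATECO_2007 and ITTER107, which were left unfiltered.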

Having set our filters, we can now retrieve the data by passing our DataSet instance to the get_data function. It returns a pandas DataFrame with all the data that we requested, already sorted by date.

trade_df = retrieval.get_data(ds)
trade_df.head()
     DATAFLOW          FREQ  MERCE_ATECO_2007  PAESE_PARTNER  ITTER107  TIPO_DATO  TIME_PERIOD  OBS_VALUE
0    IT1:139_176(1.0)  M     10                WORLD          ITTOT     ESAV       1993-01-01   10767
368  IT1:139_176(1.0)  M     10                WORLD          ITTOT     ISAV       1993-01-01   9226
372  IT1:139_176(1.0)  M     10                WORLD          ITTOT     ISAV       1993-02-01   10015
4    IT1:139_176(1.0)  M     10                WORLD          ITTOT     ESAV       1993-02-01   10681
373  IT1:139_176(1.0)  M     10                WORLD          ITTOT     ISAV       1993-03-01   9954

The remaining columns (BREAK, CONF_STATUS, OBS_PRE_BREAK, OBS_STATUS, BASE_PER, UNIT_MEAS, UNIT_MULT, METADATA_EN, METADATA_IT) are NaN in all five rows.
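
The result is in long format: each row holds one observation, with TIPO_DATO telling imports (ISAV) and exports (ESAV) apart. As a small illustration of working with that layout, the sketch below pairs up the two series by period and computes a trade balance (exports minus imports, in millions of euro) using only the five rows printed above. It uses plain Python rather than pandas, and the variable names are illustrative.

```python
# (TIPO_DATO, TIME_PERIOD, OBS_VALUE) taken from the rows printed above
rows = [
    ("ESAV", "1993-01-01", 10767),
    ("ISAV", "1993-01-01", 9226),
    ("ISAV", "1993-02-01", 10015),
    ("ESAV", "1993-02-01", 10681),
    ("ISAV", "1993-03-01", 9954),
]

# Group the observations by period: {period: {"ESAV": ..., "ISAV": ...}}
by_period = {}
for tipo_dato, period, value in rows:
    by_period.setdefault(period, {})[tipo_dato] = value

# Exports minus imports, only for periods where both series are present
balance = {
    period: values["ESAV"] - values["ISAV"]
    for period, values in sorted(by_period.items())
    if "ESAV" in values and "ISAV" in values
}
print(balance)  # {'1993-01-01': 1541, '1993-02-01': 666}
```

March 1993 is skipped here only because its export row is not among the five rows shown.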

Now that we have our data, we can do whatever we want with it. For example, we can plot the data after cleaning it up a bit. You are free to run your own analysis!

# set matplotlib themes
plt.style.use('fivethirtyeight')
plt.rcParams['figure.figsize'] = [16, 5]

#fiveThirtyEight palette
colors = ['#30a2da', '#fc4f30', '#e5ae38', '#6d904f', '#8b8b8b']

# calculate moving averages for the plot
trade_df["MA_3"] = trade_df.groupby("TIPO_DATO")["OBS_VALUE"].transform(
    lambda x: x.rolling(window=3).mean()
)

#replace the "TIPO_DATO" column values with more meaningful labels
trade_df["TIPO_DATO"] = trade_df["TIPO_DATO"].replace(
    {"ISAV": "Imports", "ESAV": "Exports"}
)

# Plot the data
after_2013 = trade_df["TIME_PERIOD"] >= "2013"
is_ESAV = trade_df["TIPO_DATO"] == "Exports"
is_ISAV = trade_df["TIPO_DATO"] == "Imports"

exports = trade_df[is_ESAV & after_2013].rename(columns={"OBS_VALUE": "Exports", "MA_3": "Exports - three months moving average"})
imports = trade_df[is_ISAV & after_2013].rename(columns={"OBS_VALUE": "Imports", "MA_3": "Imports - three months moving average"})

plt.plot(
    "TIME_PERIOD",
    "Exports",
    data=exports,
    marker="",
    linestyle="dashed",
    color = colors[0],
    linewidth=1
)
plt.plot(
    "TIME_PERIOD",
    "Imports",
    data=imports,
    marker="",
    linestyle="dashed",
    color = colors[1],
    linewidth=1
)
plt.plot(
    "TIME_PERIOD",
    "Exports - three months moving average",
    data=exports,
    color = colors[0],
    linewidth=2
)
plt.plot(
    "TIME_PERIOD",
    "Imports - three months moving average",
    data=imports,
    marker="",
    color = colors[1],
    linewidth=2
)

# add a title
plt.title("Italy's trade with the world")

# add a label to the x axis
plt.xlabel("Year")

# turn the y scale from millions to billions (divide by 1000), and add a label
plt.ylabel("Value in billions of euros")
plt.gca().yaxis.set_major_formatter(plt.FuncFormatter(lambda x, loc: "{:,}".format(int(x/1000))))
plt.legend()
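
The MA_3 column in the script above is a trailing three-month moving average computed separately for each TIPO_DATO group. To show what that calculation produces, here is a plain-Python sketch of the same idea applied to the first three import values from the retrieved data. The rolling_mean helper is hypothetical; pandas marks the incomplete leading windows as NaN, where this sketch uses None.

```python
def rolling_mean(values, window=3):
    """Trailing moving average; positions without a full window yield None."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)  # not enough observations yet
        else:
            chunk = values[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result


# ISAV values for January to March 1993, from the retrieved data shown earlier
isav_values = [9226, 10015, 9954]
ma3 = rolling_mean(isav_values)
print(ma3)  # first two entries are None, the third is the mean of all three
```

This is why the smoothed lines in the plot start two months later than the raw series within each group.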

With just a few lines of code, we have been able to retrieve data from ISTAT and make a simple plot. This is just a simple example of what you can do with istatapi. You can find more examples in the _examples folder. Enjoy!
