Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More →

rdata

Package Overview

Dependencies

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

rdata

Read R datasets from Python.

0.11.2
PyPI

Sorry we don't scan binary artifacts yet

Maintainers: 1

rdata

Read R datasets from Python.

.. Github does not support include in README for dubious security reasons, so we copy-paste instead. Also Github does not understand Sphinx directives. .. include:: docs/index.rst .. include:: docs/simpleusage.rst

The package rdata offers a lightweight way to import R datasets/objects stored in the ".rda" and ".rds" formats into Python. Its main advantages are:

It is a pure Python implementation, with no dependencies on the R language or related libraries. Thus, it can be used anywhere where Python is supported, including the web using Pyodide <https://pyodide.org/>__.
It attempt to support all R objects that can be meaningfully translated. As opposed to other solutions, you are no limited to import dataframes or data with a particular structure.
It allows users to easily customize the conversion of R classes to Python ones. Does your data use custom R classes? Worry no longer, as it is possible to define custom conversions to the Python classes of your choosing.
It has a permissive license (MIT). As opposed to other packages that depend on R libraries and thus need to adhere to the GPL license, you can use rdata as a dependency on MIT, BSD or even closed source projects.

Installation

rdata is on PyPi and can be installed using :code:pip:

.. code::

pip install rdata

It is also available for :code:conda using the :code:conda-forge channel:

.. code::

conda install -c conda-forge rdata

Installing the develop version

The current version from the develop branch can be installed as

.. code::

pip install git+https://github.com/vnmabus/rdata.git@develop

Documentation

The documentation of rdata is in ReadTheDocs <https://rdata.readthedocs.io/>__.

Examples

Examples of use are available in ReadTheDocs <https://rdata.readthedocs.io/en/stable/auto_examples/>__.

Simple usage

Read a R dataset

The common way of reading an R dataset is the following one:

.. code:: python

import rdata

converted = rdata.read_rda(rdata.TESTDATA_PATH / "test_vector.rda")
converted

which results in

.. code::

{'test_vector': array([1., 2., 3.])}

Under the hood, this is equivalent to the following code:

.. code:: python

import rdata

parsed = rdata.parser.parse_file(rdata.TESTDATA_PATH / "test_vector.rda")
converted = rdata.conversion.convert(parsed)
converted

This consists on two steps:

#. First, the file is parsed using the function rdata.parser.parse_file <https://rdata.readthedocs.io/en/latest/modules/rdata.parser.parse_file.html>. This provides a literal description of the file contents as a hierarchy of Python objects representing the basic R objects. This step is unambiguous and always the same. #. Then, each object must be converted to an appropriate Python object. In this step there are several choices on which Python type is the most appropriate as the conversion for a given R object. Thus, we provide a default rdata.conversion.convert <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.convert.html> routine, which tries to select Python objects that preserve most information of the original R object. For custom R classes, it is also possible to specify conversion routines to Python objects.

Convert custom R classes

The basic convert <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.convert.html>__ routine only constructs a SimpleConverter <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.SimpleConverter.html>__ object and calls its convert <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.SimpleConverter.html#rdata.conversion.SimpleConverter.convert>__ method. All arguments of convert <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.convert.html>__ are directly passed to the SimpleConverter <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.SimpleConverter.html>__ initialization method.

It is possible, although not trivial, to make a custom Converter <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.Converter.html>__ object to change the way in which the basic R objects are transformed to Python objects. However, a more common situation is that one does not want to change how basic R objects are converted, but instead wants to provide conversions for specific R classes. This can be done by passing a dictionary to the SimpleConverter <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.SimpleConverter.html>__ initialization method, containing as keys the names of R classes and as values, callables that convert a R object of that class to a Python object. By default, the dictionary used is DEFAULT_CLASS_MAP <https://rdata.readthedocs.io/en/latest/modules/rdata.conversion.DEFAULT_CLASS_MAP.html>, which can convert commonly used R classes such as data.frame <https://www.rdocumentation.org/packages/base/topics/data.frame> and factor <https://www.rdocumentation.org/packages/base/topics/factor>__.

As an example, here is how we would implement a conversion routine for the factor class to bytes <https://docs.python.org/3/library/stdtypes.html#bytes>__ objects, instead of the default conversion to Pandas Categorical <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Categorical.html#pandas.Categorical>__ objects:

.. code:: python

import rdata

def factor_constructor(obj, attrs):
    values = [bytes(attrs['levels'][i - 1], 'utf8')
              if i >= 0 else None for i in obj]

    return values

new_dict = {
    **rdata.conversion.DEFAULT_CLASS_MAP,
    "factor": factor_constructor
}

converted = rdata.read_rda(
    rdata.TESTDATA_PATH / "test_dataframe.rda",
    constructor_dict=new_dict,
)
converted

which has the following result:

.. code::

{'test_dataframe':   class  value
    1     b'a'      1
    2     b'b'      2
    3     b'b'      3}

Additional examples

Additional examples illustrating the functionalities of this package can be found in the ReadTheDocs documentation <https://rdata.readthedocs.io/en/latest/auto_examples/index.html>__.

.. |build-status| image:: https://github.com/vnmabus/rdata/actions/workflows/main.yml/badge.svg?branch=master :alt: build status :scale: 100% :target: https://github.com/vnmabus/rdata/actions/workflows/main.yml

.. |docs| image:: https://readthedocs.org/projects/rdata/badge/?version=latest :alt: Documentation Status :scale: 100% :target: https://rdata.readthedocs.io/en/latest/?badge=latest

.. |coverage| image:: http://codecov.io/github/vnmabus/rdata/coverage.svg?branch=develop :alt: Coverage Status :scale: 100% :target: https://codecov.io/gh/vnmabus/rdata/branch/develop

.. |repostatus| image:: https://www.repostatus.org/badges/latest/active.svg :alt: Project Status: Active – The project has reached a stable, usable state and is being actively developed. :target: https://www.repostatus.org/#active

.. |versions| image:: https://img.shields.io/pypi/pyversions/rdata :alt: PyPI - Python Version :scale: 100%

.. |pypi| image:: https://badge.fury.io/py/rdata.svg :alt: Pypi version :scale: 100% :target: https://pypi.python.org/pypi/rdata/

.. |conda| image:: https://anaconda.org/conda-forge/rdata/badges/version.svg :alt: Conda version :scale: 100% :target: https://anaconda.org/conda-forge/rdata

.. |zenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.6382237.svg :alt: Zenodo DOI :scale: 100% :target: https://doi.org/10.5281/zenodo.6382237

.. |pyOpenSci| image:: https://tinyurl.com/y22nb8up :alt: pyOpenSci: Peer reviewed :scale: 100% :target: https://github.com/pyOpenSci/software-submission/issues/144

Keywords

FAQs

What is rdata?

Is rdata well maintained?

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install