Abstraction layer for interacting with Microsoft Dataverse Web API using Python.
dataverse-api
The dataverse-api package is an abstraction layer developed for allowing simple interaction with Microsoft Dataverse Web API.
The main goal of this project was to enable some use-cases against Dataverse that I wanted to explore for a work assignment, while getting some experience in programming and testing out different ways of setting up the codebase.
The functionality built into this Dataverse wrapper reflects what I wanted to explore myself.
The most important goal is to enable creating, upserting, updating and deleting rows of data in Dataverse tables using common data structures, with a choice of how these requests are formed. For example, when creating new rows, the user can choose between individual POST requests per row, combining data into batch requests against the $batch endpoint, or using the CreateMultiple Dataverse action.
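To make the difference between the modes concrete, here is a minimal sketch of the request body that the Dataverse CreateMultiple action expects: a Targets list in which every row carries an @odata.type annotation. The helper name build_create_multiple_payload is hypothetical and is not part of dataverse-api; how the package builds this body internally is not shown here.

```python
from __future__ import annotations

from typing import Any


def build_create_multiple_payload(
    logical_name: str, rows: list[dict[str, Any]]
) -> dict[str, Any]:
    """Wrap rows in the body shape used by the CreateMultiple action.

    Hypothetical helper for illustration; not part of dataverse-api.
    """
    odata_type = f"Microsoft.Dynamics.CRM.{logical_name}"
    # Each target row declares its entity type via @odata.type.
    targets = [{"@odata.type": odata_type, **row} for row in rows]
    return {"Targets": targets}


rows = [{"name": "Contoso"}, {"name": "Fabrikam"}]
payload = build_create_multiple_payload("account", rows)
```

With individual POST requests, by contrast, each dict in rows would be sent as its own request body.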
The framework is written in Python 3.11, seeing as this runtime is available in the current release of Azure Functions.
Usage is fairly simple, but authentication must be handled by the user. The DataverseClient simply accepts an already authorized requests.Session with which to handle API requests.
I suggest using msal and msal-requests-auth for authenticating the Session. The examples below include this way of implementing auth:
import os

from msal import ConfidentialClientApplication
from msal_requests_auth.auth import ClientCredentialAuth
from requests import Session

from dataverse_api import DataverseClient

# Prepare Auth
app_reg = ConfidentialClientApplication(
    client_id=os.getenv("app_id"),
    client_credential=os.getenv("client_secret"),
    authority=os.getenv("authority_url"),
)
environment_url = os.getenv("environment_url")
auth = ClientCredentialAuth(
    client=app_reg,
    scopes=[environment_url + "/.default"],
)

# Prepare Session
session = Session()
session.auth = auth

# Instantiate DataverseClient
client = DataverseClient(session=session, environment_url=environment_url)

# Instantiate interface to Entity
entity = client.entity(logical_name="organization")

# Read data!
entity.read(select=["name"])
poetry is used for managing dependencies. To develop dataverse-api, follow the steps below to set up your local environment:
Install poetry if you haven't already.
Clone repository:
$ git clone git@github.com:MarcusRisanger/dataverse-api.git
Move into the newly created local repository:
$ cd dataverse-api
Create virtual environment and install dependencies:
$ poetry install
All code must pass ruff style checks to be merged. It is recommended to install pre-commit hooks to ensure this locally before committing code:
$ poetry run pre-commit install
Each public method, class and module should have docstrings. Docstrings are written in the NumPy style.
To produce Coverage reports, run the following commands:
$ poetry run coverage run -m pytest
$ poetry run coverage xml
CreateMultiple request if creating more than 100 elements?
For now, I've coded the framework around the requests library, for better or worse! In the future, I will consider generalizing further, letting the user pass an authenticated request handler of their choice to the framework by specifying a Protocol to follow instead.
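As a rough sketch of what such a Protocol could look like, the block below defines a structural type with a single requests-style request method and shows that any object with a matching method satisfies it. The names RequestHandler, FakeHandler and send are hypothetical; nothing here reflects a design the project has committed to.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class RequestHandler(Protocol):
    """Hypothetical protocol: anything with a requests-style request method."""

    def request(self, method: str, url: str, **kwargs: Any) -> Any: ...


class FakeHandler:
    """Stand-in handler that records calls instead of sending them."""

    def __init__(self) -> None:
        self.calls = []

    def request(self, method: str, url: str, **kwargs: Any) -> str:
        self.calls.append((method, url))
        return "ok"


def send(handler: RequestHandler, method: str, url: str) -> Any:
    # The caller depends only on the protocol, not on requests.Session.
    return handler.request(method, url)


handler = FakeHandler()
result = send(handler, "GET", "https://example.invalid/api/data/v9.2/accounts")
```

Because requests.Session exposes a request(method, url, ...) method, it would satisfy the same protocol without any changes.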
To instantiate, pass a requests.Session together with a Dataverse environment URL to the DataverseClient constructor:
session = Session()
session.auth = auth
environment_url = os.getenv("dataverse_url")
client = DataverseClient(session=session, environment_url=environment_url)
It is possible to create a new Entity using the DataverseClient. This requires a full EntityMetadata definition according to Dataverse standards. You can make this yourself and follow the MetadataDumper protocol, or use the provided define_entity function.
The define_label function makes it simple to generate Label metadata with correct LocalizedLabels in its payload.
In the example below, the optional return_representation argument has been set to True to receive the full Entity metadata definition, as created by Dataverse, as part of the server response. The response can be parsed with the EntityMetadata.model_validate_dataverse classmethod to get a full-fledged object for editing.
from dataverse_api.metadata.attributes import StringAttributeMetadata
from dataverse_api.metadata.entity import EntityMetadata, define_entity
from dataverse_api.utils.labels import define_label

new_entity = define_entity(
    schema_name="new_name",
    attributes=[
        StringAttributeMetadata(
            schema_name="new_primary_col",
            is_primary_name=True,
            description=define_label("Primary column for Entity."),
            display_name=define_label("Autonumber Column"),
            auto_number_format="{SEQNUM:6}-#-{RANDSTRING:3}",
        )
    ],
    description=define_label("Entity Created by Client"),
    display_name=define_label("Programmatically Created Table"),
)

resp = client.create_entity(new_entity, return_representation=True)
entity_meta = EntityMetadata.model_validate_dataverse(resp.json())
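For reference, the Label payloads mentioned above have roughly the following JSON shape in the Dataverse Web API. This is a minimal sketch, not the implementation of define_label: the helper name make_label is hypothetical, and the default language code 1033 (English, United States) is an assumption.

```python
from typing import Any, Dict


def make_label(text: str, language_code: int = 1033) -> Dict[str, Any]:
    """Build a Label payload with a single LocalizedLabel.

    Hypothetical helper; 1033 (English, United States) is an assumed default.
    """
    return {
        "@odata.type": "Microsoft.Dynamics.CRM.Label",
        "LocalizedLabels": [
            {
                "@odata.type": "Microsoft.Dynamics.CRM.LocalizedLabel",
                "Label": text,
                "LanguageCode": language_code,
            }
        ],
    }


label = make_label("Programmatically Created Table")
```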
You can update an existing Entity definition easily by retrieving the Entity metadata definition and re-uploading an adjusted version.
Below is a simple example. Note that this method also supports return_representation in the same manner as the DataverseClient.create_entity() method, if you want to return the edited Entity metadata as persisted in Dataverse.
metadata = client.get_entity_definition("new_name")
metadata.display_name.localized_labels[0].label = "Overridden Display Name"
client.update_entity(metadata)
To initialize an interface with a specific Dataverse Entity, use the DataverseClient.entity() method. It returns a DataverseEntity object that allows interaction with this specific entity.
foo = client.entity(logical_name="foo")
As of now, only LogicalName is supported for instantiating a new DataverseEntity object.
The DataverseEntity.read() method has been furnished with the necessary arguments to do querying as specified in the Microsoft Dataverse documentation.
A simple example:
data = foo.read(select=["name","address"], filter="salary gt 10000", top=5, order_by="salary desc")
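The arguments in this call correspond to the standard OData query options $select, $filter, $top and $orderby. The sketch below shows one plausible mapping from keyword arguments to query parameters; build_query_params is a hypothetical helper, and the way DataverseEntity.read() assembles its request internally may differ.

```python
from typing import Dict, List, Optional


def build_query_params(
    select: Optional[List[str]] = None,
    filter: Optional[str] = None,
    top: Optional[int] = None,
    order_by: Optional[str] = None,
) -> Dict[str, str]:
    """Translate keyword arguments into OData query options.

    Hypothetical helper for illustration; not part of dataverse-api.
    """
    params: Dict[str, str] = {}
    if select:
        # Column names are sent as a comma-separated list.
        params["$select"] = ",".join(select)
    if filter:
        params["$filter"] = filter
    if top is not None:
        params["$top"] = str(top)
    if order_by:
        params["$orderby"] = order_by
    return params


params = build_query_params(
    select=["name", "address"],
    filter="salary gt 10000",
    top=5,
    order_by="salary desc",
)
```

A dict like this can be passed as the params argument of a requests call, which takes care of URL encoding.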
To create rows, you can use a pandas.DataFrame or a simple construct like a list of dicts, where each dict contains the data for a single row.
Below is an example of creating rows in the Entity by passing a dataframe and specifying that the creation method should be the CreateMultiple web API Action. The return_created argument can be set to True if you need the IDs as reference.
foo.create(data=df, mode="multiple", return_created=True)
Note that the different modes provide different content when return_created is set to True, because the script simply sets a Prefer header to include created data in the server response.
For now the user may choose how to handle this based on the list of requests.Response objects that will be returned by the method.
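One way to handle such responses is to read the row ID from the OData-EntityId header, which the Dataverse Web API includes on successful single-row creates and which ends with the new row's GUID in parentheses. The sketch below parses that header value; extract_row_id is a hypothetical helper, and whether every mode of the create method yields responses carrying this header is an assumption to verify against your own responses.

```python
import re
from typing import Optional

# A GUID wrapped in parentheses at the end of the header value.
GUID_IN_PARENS = re.compile(
    r"\(([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\)\s*$"
)


def extract_row_id(entity_id_header: str) -> Optional[str]:
    """Return the GUID from an OData-EntityId header value, if present.

    Hypothetical helper for illustration; not part of dataverse-api.
    """
    match = GUID_IN_PARENS.search(entity_id_header)
    return match.group(1) if match else None


# Example header value; the host name is a placeholder.
header = (
    "https://example.invalid/api/data/v9.2/"
    "accounts(7eb682f1-ca75-e511-80d4-00155d2a68d1)"
)
row_id = extract_row_id(header)
```

For a requests.Response object named resp, the header value would be read with resp.headers.get("OData-EntityId").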
Upserting data into Dataverse is simple. If you are only updating existing data, your data may already contain the row URI (Primary Attribute ID). In that case you can omit the alternate_key argument.
foo.upsert(data=df, alternate_key="my_key", mode="batch")
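In the Dataverse Web API, an upsert by alternate key is a PATCH request whose URL addresses the row by key value instead of by GUID, for example accounts(my_key='A-1'). The sketch below builds that resource path; alternate_key_path and the entity set name accounts are hypothetical, and it only handles string and integer key values.

```python
from typing import Dict, Union


def alternate_key_path(entity_set: str, keys: Dict[str, Union[str, int]]) -> str:
    """Build the resource path that addresses a row by alternate key.

    Hypothetical helper for illustration; not part of dataverse-api.
    """
    parts = []
    for name, value in keys.items():
        if isinstance(value, str):
            # OData string literals are single-quoted; embedded quotes are doubled.
            escaped = value.replace("'", "''")
            parts.append(f"{name}='{escaped}'")
        else:
            parts.append(f"{name}={value}")
    return f"{entity_set}({','.join(parts)})"


single = alternate_key_path("accounts", {"my_key": "A-1"})
composite = alternate_key_path("accounts", {"code": "x", "number": 7})
```

The resulting path still needs URL encoding before it is sent if key values contain reserved characters.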
TBD
TBD
TBD