Python Synapse Client
Branch | Build Status |
---|
develop | |
master | |
A Python client for Sage Bionetworks' Synapse, a collaborative, open-source research platform that allows teams to share data, track analyses, and collaborate. The Python client can be used as a library for development of software that communicates with Synapse or as a command-line utility.
There is also a Synapse client for R.
Documentation
For more information about the Python client, see:
For more information about interacting with Synapse, see:
For release information, see:
Installation
The Python Synapse client has been tested on versions 3.9, 3.10, 3.11 and 3.12 on Mac OS X, Ubuntu Linux and Windows.
Starting from Synapse Python client version 3.0, Synapse Python client requires Python >= 3.9
Install using pip
The Python Synapse Client is on PyPI and can be installed with pip:
# Here are a few ways to install the client. Choose the one that fits your use-case
# sudo may optionally be needed depending on your setup
pip install --upgrade synapseclient
pip install --upgrade "synapseclient[pandas]"
pip install --upgrade "synapseclient[pandas, pysftp, boto3]"
...or to upgrade an existing installation of the Synapse client:
# sudo may optionally be needed depending on your setup
pip install --upgrade synapseclient
The dependencies on pandas
, pysftp
, and boto3
are optional. Synapse
Tables integrate
with Pandas. The library pysftp
is required for users of
SFTP file storage. All
libraries require native code to be compiled or installed separately from prebuilt
binaries.
Install from source
Clone the source code repository.
git clone git://github.com/Sage-Bionetworks/synapsePythonClient.git
cd synapsePythonClient
pip install .
Alternatively, you can use pip to install a particular branch, commit, or other git reference:
pip install git+https://github.com/Sage-Bionetworks/synapsePythonClient@master
or
pip install git+https://github.com/Sage-Bionetworks/synapsePythonClient@my-commit-hash
Command line usage
The Synapse client can be used from the shell command prompt. Valid commands
include: query, get, cat, add, update, delete, and onweb. A few examples are
shown.
downloading test data from Synapse
synapse -p auth_token get syn1528299
getting help
synapse -h
Note that a Synapse account is required.
Usage as a library
The Synapse client can be used to write software that interacts with the Sage Bionetworks Synapse repository. More examples can be found in the Tutorial section found here
Examples
Log-in and create a Synapse object
import synapseclient
syn = synapseclient.Synapse()
## You may optionally specify the debug flag to True to print out debug level messages.
## A debug level may help point to issues in your own code, or uncover a bug within ours.
# syn = synapseclient.Synapse(debug=True)
## log in using auth token
syn.login(authToken='auth_token')
Sync a local directory to synapse
This is the recommended way of synchronizing more than one file or directory to a synapse project through the use of synapseutils
. Using this library allows us to handle scheduling everything required to sync an entire directory tree. Read more about the manifest file format in synapseutils.syncToSynapse
import synapseclient
import synapseutils
import os
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
path = os.path.expanduser("~/synapse_project")
manifest_path = f"{path}/my_project_manifest.tsv"
project_id = "syn1234"
# Create the manifest file on disk
with open(manifest_path, "w", encoding="utf-8") as f:
pass
# Walk the specified directory tree and create a TSV manifest file
synapseutils.generate_sync_manifest(
syn,
directory_path=path,
parent_id=project_id,
manifest_path=manifest_path,
)
# Using the generated manifest file, sync the files to Synapse
synapseutils.syncToSynapse(
syn,
manifestFile=manifest_path,
sendMessages=False,
)
Store a Project to Synapse
import synapseclient
from synapseclient.entity import Project
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
project = Project('My uniquely named project')
project = syn.store(project)
print(project.id)
print(project)
Store a Folder to Synapse (Does not upload files within the folder)
import synapseclient
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
folder = Folder(name='my_folder', parent="syn123")
folder = syn.store(folder)
print(folder.id)
print(folder)
Store a File to Synapse
import synapseclient
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
file = File(
path=filepath,
parent="syn123",
)
file = syn.store(file)
print(file.id)
print(file)
Get a data matrix
import synapseclient
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
## retrieve a 100 by 4 matrix
matrix = syn.get('syn1901033')
## inspect its properties
print(matrix.name)
print(matrix.description)
print(matrix.path)
## load the data matrix into a dictionary with an entry for each column
with open(matrix.path, 'r') as f:
labels = f.readline().strip().split('\t')
data = {label: [] for label in labels}
for line in f:
values = [float(x) for x in line.strip().split('\t')]
for i in range(len(labels)):
data[labels[i]].append(values[i])
## load the data matrix into a numpy array
import numpy as np
np.loadtxt(fname=matrix.path, skiprows=1)
Authentication
Authentication toward Synapse can be accomplished with the clients using personal access tokens. Learn more about Synapse personal access tokens
Learn about the multiple ways one can login to Synapse.
Synapse Utilities (synapseutils)
The purpose of synapseutils is to create a space filled with convenience functions that includes traversing through large projects, copying entities, recursively downloading files and many more.
Example
import synapseutils
import synapseclient
syn = synapseclient.login()
# copies all Synapse entities to a destination location
synapseutils.copy(syn, "syn1234", destinationId = "syn2345")
# copies the wiki from the entity to a destination entity. Only a project can have sub wiki pages.
synapseutils.copyWiki(syn, "syn1234", destinationId = "syn2345")
# Traverses through Synapse directories, behaves exactly like os.walk()
walkedPath = synapseutils.walk(syn, "syn1234")
for dirpath, dirname, filename in walkedPath:
print(dirpath)
print(dirname)
print(filename)
OpenTelemetry (OTEL)
OpenTelemetry helps support the analysis of traces and spans which can provide insights into latency, errors, and other performance metrics. The synapseclient is ready to provide traces should you want them. The Synapse Python client supports OTLP Exports and can be configured via environment variables as defined here.
Read more about OpenTelemetry in Python here
Quick-start
The following shows an example of setting up jaegertracing via docker and executing a simple python script that implements the Synapse Python client.
Running the jaeger docker container
Start a docker container with the following options:
docker run --name jaeger \
-e COLLECTOR_OTLP_ENABLED=true \
-p 16686:16686 \
-p 4318:4318 \
jaegertracing/all-in-one:latest
Explanation of ports:
Once the docker container is running you can access the Jaeger UI via: http://localhost:16686
Example
By default the OTEL exporter sends trace data to http://localhost:4318/v1/traces
, however you may override this by setting the OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
environment variable.
import synapseclient
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
trace.set_tracer_provider(
TracerProvider(
resource=Resource(attributes={SERVICE_NAME: "my_own_code_above_synapse_client"})
)
)
trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
tracer = trace.get_tracer("my_tracer")
synapseclient.Synapse.enable_open_telemetry(True)
@tracer.start_as_current_span("my_span_name")
def main():
syn = synapseclient.Synapse()
syn.login()
my_entity = syn.get("syn52569429")
print(my_entity)
main()
License and Copyright
© Copyright 2013-23 Sage Bionetworks
This software is licensed under the Apache License, Version 2.0.