OpenSAMPL
(Open Synchronization Analytics and Monitoring PLatform)
Python tools for adding clock data to a TimescaleDB database.
CLI TOOL
Installation
- Ensure you have Python 3.9 or higher installed
- Pip install the latest version of opensampl:
pip install opensampl
Development Setup
uv venv
uv sync --extra all
source .venv/bin/activate
This will create a virtual environment and install the development dependencies.
Environment Setup
The tool requires several environment variables. Create a .env file in your project root.
When routing through a backend:
ROUTE_TO_BACKEND=true
BACKEND_URL=http://localhost:8000
ARCHIVE_PATH=/path/to/archive
When accessing the database directly:
DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<database>
ARCHIVE_PATH=/path/to/archive
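The two configurations above can be checked before running the tool. The following is a minimal sketch, not part of opensampl: `parse_env` and `access_mode` are hypothetical helpers, and treating every variable listed above as required is an assumption.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def access_mode(env: Dict[str, str]) -> str:
    """Return 'backend' or 'direct'; raise if a variable for that mode is missing.

    Assumption: all variables shown for a mode are mandatory.
    """
    if env.get("ROUTE_TO_BACKEND", "").lower() == "true":
        mode, required = "backend", ["BACKEND_URL", "ARCHIVE_PATH"]
    else:
        mode, required = "direct", ["DATABASE_URL", "ARCHIVE_PATH"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"missing variables for {mode} mode: {', '.join(missing)}")
    return mode
```

Calling `access_mode(parse_env(open(".env").read()))` would report which of the two modes a given .env file selects.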
Basic Usage
The CLI tool provides several commands. Run opensampl --help (or opensampl [command] --help for any subcommand) to get details.
Load Probe Data
Load data from ADVA probes:
opensampl load probe adva path/to/file.txt.gz
opensampl load probe adva path/to/directory/
Each ADVA probe file contains both its metadata and its time data, so the -m and -t options are not required. They are useful when you want to load only one of the two.
Options:
- --metadata (-m): Only load probe metadata
- --time-data (-t): Only load time series data
- --no-archive (-n): Don't archive processed files
- --archive-path (-a): Override default archive directory
- --max-workers (-w): Maximum number of worker threads (default: 4)
- --chunk-size (-c): Number of time data entries per batch (default: 10000)
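The --chunk-size and --max-workers options describe batched loading with a capped thread pool. The sketch below illustrates that general pattern only; how opensampl actually distributes work across threads is not stated here, and `chunked`, `load_batch`, and `load_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, Sequence


def chunked(rows: Sequence, chunk_size: int = 10000) -> Iterator[Sequence]:
    """Yield consecutive slices of at most chunk_size entries."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def load_batch(batch: Sequence) -> int:
    """Hypothetical stand-in for one database insert; returns rows handled."""
    return len(batch)


def load_all(rows: Sequence, chunk_size: int = 10000, max_workers: int = 4) -> int:
    """Send each batch to a thread pool limited to max_workers threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(load_batch, chunked(rows, chunk_size)))
```

With the defaults, 25,000 entries would be split into three batches (10000, 10000, 5000) and handled by up to four threads.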
Load Direct Table Data
Load data directly into a database table. The input file can be YAML or JSON and can contain either a single dictionary or a list of dictionaries.
You do not have to specify the schema; it is assumed to be castdb.
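Because the input may be a single dictionary or a list of dictionaries, a loader typically normalizes both shapes to a list first. A minimal sketch using JSON input follows; `normalize_entries` is a hypothetical helper, not an opensampl function.

```python
import json
from typing import Any, Dict, List


def normalize_entries(payload: Any) -> List[Dict[str, Any]]:
    """Return a list of dictionaries for either accepted input shape."""
    if isinstance(payload, dict):
        return [payload]
    if isinstance(payload, list) and all(isinstance(item, dict) for item in payload):
        return payload
    raise TypeError("expected a dictionary or a list of dictionaries")


# A single entry and a list of entries both end up as lists.
single = normalize_entries(json.loads('{"name": "Lab A", "lat": 35.93, "lon": -84.31}'))
many = normalize_entries(json.loads('[{"name": "Lab A"}, {"name": "Lab B"}]'))
```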
The --if-exists option controls how to handle conflicts:
- update: Only update fields that are provided and non-default (default)
- error: Raise an error if entry exists
- replace: Replace all non-primary-key fields with new values
- ignore: Skip if entry exists
opensampl load table table_name path/to/data.yaml
For example, you can run:
opensampl load table locations --if-exists replace updated_location.yaml
where updated_location.yaml contains:
name: EPB Chattanooga
lat: 35.9311256
lon: -84.3292469
This overwrites the existing entry for EPB Chattanooga, or creates a new one if it doesn't exist yet.
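The four --if-exists modes can be modeled on plain dictionaries. This is a sketch of the semantics as described above, not the tool's code: `resolve_conflict` is hypothetical, the primary key is assumed to be a field named `uuid`, `None` is assumed to stand for "default", and `replace` is assumed to reset fields that the new entry omits.

```python
from typing import Any, Dict, Optional, Sequence


def resolve_conflict(
    existing: Optional[Dict[str, Any]],
    new: Dict[str, Any],
    if_exists: str = "update",
    primary_keys: Sequence[str] = ("uuid",),
) -> Dict[str, Any]:
    """Return the row that would be stored for the given conflict mode."""
    if existing is None:
        return dict(new)  # no conflict: create the entry
    if if_exists == "error":
        raise ValueError("entry already exists")
    if if_exists == "ignore":
        return dict(existing)  # keep the stored row untouched
    provided = {k: v for k, v in new.items() if k not in primary_keys}
    if if_exists == "update":
        merged = dict(existing)
        # Only overwrite fields that were provided with a non-default value.
        merged.update({k: v for k, v in provided.items() if v is not None})
        return merged
    if if_exists == "replace":
        # Keep primary keys; every other field takes the new value (or None).
        replaced = {k: existing[k] if k in primary_keys else new.get(k) for k in existing}
        replaced.update(provided)
        return replaced
    raise ValueError(f"unknown if_exists mode: {if_exists}")
```

Under these assumptions, `update` leaves an existing `public` flag alone when the new entry omits it, while `replace` clears it.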
View Configuration
Display current environment configuration:
poetry run opensampl config show
poetry run opensampl config show --explain
poetry run opensampl config show --var DATABASE_URL
Set Configuration
Update environment variables:
poetry run opensampl config set VARIABLE_NAME value
File Format Support
The tool currently supports ADVA probe data files with the following naming convention:
<ip_address>CLOCK_PROBE-<probe_id>-YYYY-MM-DD-HH-MM-SS.txt.gz
Example: 10.0.0.121CLOCK_PROBE-1-1-2024-01-02-18-24-56.txt.gz
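The naming convention can be parsed with a regular expression. The following sketch is an illustration rather than the parser opensampl uses; `parse_probe_filename` is a hypothetical helper, and it assumes an IPv4 address and treats everything between `CLOCK_PROBE-` and the timestamp as the probe ID.

```python
import re
from datetime import datetime
from typing import Any, Dict

FILENAME_PATTERN = re.compile(
    r"^(?P<ip_address>\d{1,3}(?:\.\d{1,3}){3})"   # e.g. 10.0.0.121
    r"CLOCK_PROBE-(?P<probe_id>.+)-"               # e.g. 1-1
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})\.txt\.gz$"
)


def parse_probe_filename(name: str) -> Dict[str, Any]:
    """Split an ADVA probe file name into IP address, probe ID, and timestamp."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized probe file name: {name}")
    return {
        "ip_address": match["ip_address"],
        "probe_id": match["probe_id"],
        "timestamp": datetime.strptime(match["timestamp"], "%Y-%m-%d-%H-%M-%S"),
    }


info = parse_probe_filename("10.0.0.121CLOCK_PROBE-1-1-2024-01-02-18-24-56.txt.gz")
```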
CAST Database Schema Documentation
castdb.locations
Stores geographic locations with their coordinates and metadata. Supports both 2D and 3D point geometries.
name: "Lab A"
lat: 35.93
lon: -84.31
z: 100
projection: 4326
public: true
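To show how such an entry maps to a 2D or 3D point, the sketch below builds a PostGIS-style EWKT string from it. How the schema actually stores the geometry is an assumption here, and `location_to_ewkt` is a hypothetical helper. Note that WKT orders coordinates as x y, so longitude comes first.

```python
from typing import Any, Dict


def location_to_ewkt(entry: Dict[str, Any]) -> str:
    """Build an EWKT point; include z only when the entry provides it."""
    srid = entry.get("projection", 4326)  # WGS84 by default
    lon, lat = entry["lon"], entry["lat"]
    if entry.get("z") is not None:
        return f"SRID={srid};POINT({lon} {lat} {entry['z']})"
    return f"SRID={srid};POINT({lon} {lat})"


point_3d = location_to_ewkt({"name": "Lab A", "lat": 35.93, "lon": -84.31, "z": 100})
point_2d = location_to_ewkt({"name": "Lab A", "lat": 35.93, "lon": -84.31})
```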
castdb.test_metadata
Tracks testing periods and experiments with start and end timestamps.
name: "Holdover Test 1"
start_date: "2024-01-01T00:00:00"
end_date: "2024-01-07T00:00:00"
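Before loading an entry like this, it can be useful to confirm that the timestamps parse and that the period ends after it starts. A minimal sketch, assuming ISO 8601 strings as in the example; `period_length` is a hypothetical helper.

```python
from datetime import datetime, timedelta
from typing import Dict


def period_length(entry: Dict[str, str]) -> timedelta:
    """Return end_date - start_date, raising if the period is not positive."""
    start = datetime.fromisoformat(entry["start_date"])
    end = datetime.fromisoformat(entry["end_date"])
    if end <= start:
        raise ValueError("end_date must be later than start_date")
    return end - start


length = period_length({
    "name": "Holdover Test 1",
    "start_date": "2024-01-01T00:00:00",
    "end_date": "2024-01-07T00:00:00",
})
```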
castdb.probe_metadata
Contains information about timing probes, including their network location and associated metadata. Insertion handled by opensampl load probe.
probe_id: "1-1"
ip_address: "10.0.0.121"
vendor: "ADVA"
model: "OSA 5422"
name: "GMC1"
public: true
location_uuid: "123e4567-e89b-12d3-a456-426614174000"
test_uuid: "123e4567-e89b-12d3-a456-426614174001"
castdb.probe_data
Time series data from probes, storing timestamps and measured values. Insertion handled by opensampl load probe.
time: "2024-01-01T00:00:00"
probe_uuid: "123e4567-e89b-12d3-a456-426614174000"
value: 1.234e-09
castdb.adva_metadata
ADVA-specific configuration and status information for probes. Insertion handled by opensampl load probe.
probe_uuid: "123e4567-e89b-12d3-a456-426614174000"
type: "Phase"
start: "2024-01-01T00:00:00"
frequency: 1
timemultiplier: 1
multiplier: 1
title: "ClockProbe1"
adva_probe: "ClockProbe"
adva_reference: "GPS"
adva_reference_expected_ql: "QL-NONE"
adva_source: "TimeClock"
adva_direction: "NA"
adva_version: 1.0
adva_status: "RUNNING"
adva_mtie_mask: "G823-PDH"
adva_mask_margin: 0
Notes
- All tables use UUIDs as primary keys which are automatically generated.
- Table relationships are maintained through UUID references
- Geographic coordinates use WGS84 projection (SRID 4326) by default
- Boolean fields (public) are optional and can be null