
data-access-platform-cli
CLI for the Data Access Platform service
npm install data-access-platform-cli
You will need to either access or create a developer key, and then create a config file.

Go to https://<account>.instructure.com/accounts/self/developer_keys and check whether a developer key has already been provisioned for Canvas Data 2.

If there is not a key already created for Canvas Data 2, create a new API key by following these steps:

1. Under the Account tab of the Developer Keys page, press the + Developer Key button and choose the + API Key option. This shows a view for configuring the new key.
2. Configure and save the key. This returns you to the Account tab of the Developer Keys page.
3. Enable the key by selecting ON under State.

Under the Details of the developer key, you'll need the key ID (the roughly 11-digit number that is already visible) and the key secret, which you can see by pressing the Show Key button. You'll put these two pieces of information into the config file after you install the DAP CLI.

Beware: anyone who has these credentials can use this CLI to download information from your Canvas instance.
Fill in your details in config.json, following the example provided by config.example.json. While you can specify a different path to the config with the --config or -c flags, by default it reads from config.json.

You will need to change at least three values in your config.json file:

1. url: set the value to https://<account>.instructure.com/login/oauth2/token
2. client_id
3. client_secret

You may also need to change your baseURL.
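As a sketch, a config.json filled in per the steps above might look like the following (every value is a placeholder; the baseURL key comes from the note above, but its value here is an assumption, so check config.example.json for the real shape):

```json
{
  "url": "https://<account>.instructure.com/login/oauth2/token",
  "client_id": "<your ~11-digit developer key ID>",
  "client_secret": "<your developer key secret>",
  "baseURL": "<the Data Access Platform API base URL>"
}
```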
To use:
dap <command> [options]
dap --version
dap --config myconfig.json
dap --help (or dap -h)

To get help for a given command (this lists the command-specific options):
dap --help <command>
There are different levels of logging:
dap <command> [options] # 'info' logging level by default
dap <command> [options] -v # 'verbose' logging level
dap <command> [options] -vv # 'debug' logging level
The snapshot command will get the most recent version of all the requested tables. For example, to grab users, accounts, and courses:
dap snapshot users accounts courses
The default output location is the snapshots/ directory, but that can be overridden with the --output option.

The number of tables fetched concurrently can be controlled with the --concurrency flag. Example:
dap snapshot users accounts courses submissions --concurrency 2
The downloaded file has the following format:
<table name>_<current date>.<file format>
The current date is the local time when the user runs the snapshot command.
# The time now is 11:49 PM. Today is the 24th of March, 2020
dap snapshot users
# The downloaded file will be:
# snapshots/users_2020-03-24-23-49-03.csv
Specific options:
Name | Description | Default value |
---|---|---|
concurrency | The number of tables to fetch concurrently | 10 |
format | The output format of data. Supported formats: csv, json | csv |
output | The directory to download to | snapshots |
filter | SQL query utilized by S3 Select | |
The updates command will get changes to the specified tables since the provided time. This time can be provided in two different ways: 1) an amount of time relative to now, or 2) an absolute time.
# This grabs changes from the last 4h
dap updates users --last 4h
# This grabs changes since a specific time
dap updates users --since '2020-02-21T09:03:00Z'
The relative time accepts many formats, including a number followed by:

- m for minutes
- h for hours
- d for days

By default, it will grab updates from the last 24 hours.
The default output location is the updates/ directory, but that can be overridden with the --output option.

The number of diffs fetched concurrently can be controlled with the --concurrency flag.
The downloaded file has the following format:
<table name>_<since date>_<current date>.<file format>
Both since and current dates are local times.
# The time now is 12:34 AM. Today is the 14th of May, 2020
dap updates accounts --last 20d
# The downloaded file will be:
# updates/accounts_2020-04-24-00-34-28_2020-05-14-00-34-39.csv
Specific options:
Name | Description | Default value |
---|---|---|
concurrency | The number of tables to fetch concurrently | 10 |
last | The relative age for the oldest change in the query | 24h |
format | The output format of data. Supported formats: csv, json | csv |
output | The directory to download to | updates |
since | The ISO8601 date for the oldest change in the query (overrides --last) | |
until | The ISO8601 date for the newest change in the query | |
filter | SQL query utilized by S3 select | |
new_only | Fetch updates not yet received | boolean: not set |
collapse | Collapse multiple changes into one | boolean: not set |
Boolean options should be provided without a value, or simply omitted (not set).
The schema command will get information about the schema.
# List the available tables
dap schema --list
# Get the schema for some tables
dap schema users courses accounts
# Get the schema for all the tables
dap schema
Specific options:
Name | Description | Default value |
---|---|---|
list | List all the tables rather than dump schema | boolean: not set |
The latency command will get the latency state of all the requested tables. For example, to grab users, accounts, and courses:
dap latency users accounts courses
Specific options:
Name | Description | Default value |
---|---|---|
concurrency | The number of tables to fetch concurrently | 10 |
In order to perform some operations, you might need to filter the output. To filter, provide an SQL statement to the snapshot or updates CLI command:
./bin/dap updates wikis -f '<your SQL statement>'
All rules of S3 Select SQL are applicable and should be considered. Important precautions specific to CD2:

- Keep cdcMetadata in the output, and make subselections only from row (but also select row as a row identifier). This is necessary because the output is consumed by the CLI. Ideally, use SELECT *.
- Note that ./bin/dap updates users -f "SELECT * FROM S3Object" runs much more slowly than the same command without a filter (./bin/dap updates users).
In order to utilize S3 Select, you should know the file schema. At the top level there are two separate fields: cdcMetadata and row. The structure in output files is slightly different, but for queries you need to consider this format.
cdcMetadata is a service field that has an identical structure across all tables and stores CDC data. In output files this field is named metadata and has a slightly different shape. The structure is:
table:string - name of the table
orderid:string - incremental counter; unique per event, monotonically increasing each time a new update is submitted
key: {
  id:bigint - primary key value of the row associated with the event
  __dbz__physicaltableidentifier:string - internal id for Debezium
}
deletion:boolean - indicator of row deletion
ts_ms:bigint - timestamp when the change occurred
root_account_uuid:string - account uuid; will be the same across the whole file
shard_id:int - database shard where the event occurred
The row structure is the same as in the output. You can get a row's schema using ./bin/dap schema <tableName>, where <tableName> is the name of the table you want a schema for.
Some valid examples of queries:
SELECT * FROM S3Object
SELECT * FROM S3Object[*] s WHERE s.row.id = '3'
Examples of CLI usage:
./bin/dap updates wikis -f 'SELECT * FROM S3Object[*]'
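Following the precaution above, a filter that trims row down to a few columns while still keeping cdcMetadata whole might look like this (the column names under row are hypothetical; check the real ones with ./bin/dap schema wikis):

```sql
-- Keep cdcMetadata intact; subselect only from row
-- (s.row.id and s.row.title are assumed column names).
SELECT s.cdcMetadata, s.row.id, s.row.title FROM S3Object[*] s
```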
The CLI uses yargs commandDir to make it easy to add new commands. Add a new command by creating a new module in lib/commands/. This module should contain:
- exports.command: string or array of strings that contains the command
- exports.describe: string with the description of the command
- exports.builder: object containing the command options, or a function accepting and returning a yargs instance
- exports.handler: function using the parsed argv
This structure assumes all modules in the commands directory are command modules. Any supporting files need to be in a different directory. See snapshot.ts for an example command and snapshot.test.ts for example tests.
See the Providing a Command Module docs for more details on these exports and the .commandDir(directory, [opts]) docs for more details about using a command module directory and more advanced options.
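As a sketch, a hypothetical lib/commands/hello.ts using the four exports listed above might look like this (the command name, option, and behavior are invented for illustration; only the export names come from the docs):

```typescript
// Hypothetical lib/commands/hello.ts: a minimal yargs command module.
// These four exports are what yargs commandDir looks for.

// The command signature, with one positional <table> argument.
export const command = 'hello <table>';

// Shown in `dap --help`.
export const describe = 'Print a greeting for a table (illustrative only)';

// Command-specific options, in yargs builder-object form.
export const builder = {
  shout: {
    type: 'boolean',
    describe: 'Uppercase the output',
    default: false,
  },
};

// Receives the parsed argv and does the command's work.
// (Returning the string is only to make this sketch easy to test;
// yargs ignores the handler's return value.)
export const handler = (argv: { table: string; shout: boolean }): string => {
  const message = `hello, ${argv.table}`;
  const output = argv.shout ? message.toUpperCase() : message;
  console.log(output);
  return output;
};
```

Dropped into lib/commands/, a module like this would be picked up automatically by commandDir, with no registration step needed.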
Using a Docker container, you can run:
./build.sh

Or, without Docker:
npm run test
npm run lint
npm run lint:md