
tap-spreadsheets
tap-spreadsheets is a Singer tap for spreadsheets.
Built with the Meltano Tap SDK for Singer Taps.
Capabilities:
- catalog
- state
- discover
- activate-version
- about
- stream-maps
- schema-flattening
- batch
Setting | Required | Default | Description |
---|---|---|---|
files | True | None | List of file configurations. |
stream_maps | False | None | Config object for stream maps capability. For more information check out Stream Maps. |
stream_maps.else | False | None | Currently, only setting this to __NULL__ is supported. This will remove all other streams. |
stream_map_config | False | None | User-defined config values to be used within map expressions. |
faker_config | False | None | Config for the Faker instance variable fake used within map expressions. Only applicable if the plugin specifies faker as an additional dependency (through the singer-sdk faker extra or directly). |
faker_config.seed | False | None | Value to seed the Faker generator for deterministic output: https://faker.readthedocs.io/en/master/#seeding-the-generator |
faker_config.locale | False | None | One or more LCID locale strings to produce localized output for: https://faker.readthedocs.io/en/master/#localization |
flattening_enabled | False | None | 'True' to enable schema flattening and automatically expand nested properties. |
flattening_max_depth | False | None | The max depth to flatten schemas. |
batch_config | False | None | Configuration for BATCH message capabilities. |
batch_config.encoding | False | None | Specifies the format and compression of the batch files. |
batch_config.encoding.format | False | None | Format to use for batch files. |
batch_config.encoding.compression | False | None | Compression format to use for batch files. |
batch_config.storage | False | None | Defines the storage layer to use when writing batch files. |
batch_config.storage.root | False | None | Root path to use when writing batch files. |
batch_config.storage.prefix | False | None | Prefix to use when writing batch files. |
A full list of supported settings and capabilities is available by running: tap-spreadsheets --about
files (array): List of file configurations. Each entry is an object with the following keys:
- path (string, required): Glob expression (local or S3).
- format (string): 'excel' or 'csv'.
- worksheet (string, required when format is 'excel'): Worksheet index, name, or regular expression (Excel only). When a regular expression is used, every matching worksheet is processed.
- table_name (string): Optional stream name (defaults to the file name).
- primary_keys (array): List of primary key column names.
- drop_empty (boolean): Drop rows with empty/null primary keys.
- skip_columns (integer): Number of leading columns to skip.
- skip_rows (integer): Number of rows to skip before the header row.
- sample_rows (integer): Number of rows to sample for schema inference.
- column_headers (array): Explicit column headers.
- delimiter (string): CSV delimiter. Inferred when not provided, with "," as the default.
- quotechar (string): CSV quote character. Inferred when not provided, with '"' as the default.
- schema_overrides (dict): Overrides the JSON schema definition per field, e.g. schema_overrides: { my_column_name: { type: [string, "null"] } }
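The worksheet key accepts an index, a name, or a regular expression. The sketch below shows one way such a selector could be resolved against a workbook's sheet names. The function name select_worksheets, the zero-based index, and the use of a full-name regex match are all assumptions made for illustration; the tap's actual matching rules may differ.

```python
import re


def select_worksheets(sheet_names, worksheet):
    """Return the sheet names selected by an index, an exact name, or a regex.

    Hypothetical helper: not part of tap-spreadsheets.
    """
    # Assumption: an integer selector is a zero-based position in the workbook.
    if isinstance(worksheet, int):
        return [sheet_names[worksheet]]
    # Assumption: an exact name takes precedence over regex interpretation.
    if worksheet in sheet_names:
        return [worksheet]
    # Otherwise treat the selector as a regex that must match the whole name.
    pattern = re.compile(worksheet)
    return [name for name in sheet_names if pattern.fullmatch(name)]


sheets = ["Summary", "Report 2023", "Report 2024", "Report draft"]
print(select_worksheets(sheets, 0))
print(select_worksheets(sheets, "Summary"))
print(select_worksheets(sheets, "Report 20[0-9]{2}"))
```

With a full-name match, "Report 20[0-9]{2}" selects the two yearly report sheets and skips "Report draft". A prefix or substring match would behave differently for patterns that are not anchored.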
config:
  files:
    - path: data/*.xlsx
      format: excel
      # table_name: test_sheet1
      primary_keys: [date]
      drop_empty: true
      worksheet: Sheet1
    - path: data/*.xlsx
      format: excel
      worksheet: "Report 20[0-9]{2}"
      table_name: my_xlsx_sheet2
      primary_keys: [date, total]
      drop_empty: true
      skip_columns: 1
      skip_rows: 4
    - path: s3://my-bucket/reports/*.csv
      format: csv
      table_name: csv_reports
      primary_keys: [id]
      delimiter: ";"
      quotechar: "'"
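The CSV entry above sets delimiter and quotechar explicitly; when they are omitted, the tap infers them. The following is a minimal sketch of what such inference could look like using the standard library's csv.Sniffer. The function name infer_dialect and the order of precedence (explicit value, then sniffed value, then default) are assumptions, not the tap's confirmed implementation.

```python
import csv


def infer_dialect(sample, delimiter=None, quotechar=None):
    """Return (delimiter, quotechar) for a CSV text sample.

    Hypothetical helper: explicit values win, then sniffed values,
    then the documented defaults of "," and '"'.
    """
    sniffed_delimiter, sniffed_quotechar = ",", '"'
    try:
        dialect = csv.Sniffer().sniff(sample)
        sniffed_delimiter = dialect.delimiter
        sniffed_quotechar = dialect.quotechar or '"'
    except csv.Error:
        # Sniffing failed (for example on an empty sample): keep the defaults.
        pass
    return delimiter or sniffed_delimiter, quotechar or sniffed_quotechar


sample_text = "id;name\n1;'alpha'\n2;'beta'\n"
print(infer_dialect(sample_text))
print(infer_dialect(sample_text, delimiter=";", quotechar="'"))
```

Sniffing is heuristic and can guess wrong on small or irregular samples, which is a good reason to set delimiter and quotechar explicitly for files with unusual formatting.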
To use S3-based storage, provide the following environment variables:
- S3_ACCESS_KEY_ID, S3_SECRET_ACCESS_KEY: access key/secret pair.
- S3_ENDPOINT_URL: custom S3 endpoint, such as MinIO or another S3-compatible interface.
Example:
S3_ACCESS_KEY_ID=minioadmin S3_SECRET_ACCESS_KEY=minioadmin S3_ENDPOINT_URL=http://localhost:19000 meltano run tap-spreadsheets target-jsonl
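Before launching a pipeline against S3, it can help to check that the variables above are present. The sketch below is a hypothetical pre-flight check, not part of the tap: the function name s3_settings_from_env and the returned dictionary keys are invented for illustration, and treating S3_ENDPOINT_URL as optional is an assumption.

```python
import os


def s3_settings_from_env(env=None):
    """Collect the S3 variables described above, failing early if credentials are missing.

    Hypothetical helper: the returned key names are illustrative only.
    """
    env = os.environ if env is None else env
    required = ("S3_ACCESS_KEY_ID", "S3_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing S3 environment variables: " + ", ".join(missing))
    return {
        "access_key_id": env["S3_ACCESS_KEY_ID"],
        "secret_access_key": env["S3_SECRET_ACCESS_KEY"],
        # Assumption: the endpoint is optional and only needed for S3-compatible services.
        "endpoint_url": env.get("S3_ENDPOINT_URL"),
    }


example_env = {
    "S3_ACCESS_KEY_ID": "minioadmin",
    "S3_SECRET_ACCESS_KEY": "minioadmin",
    "S3_ENDPOINT_URL": "http://localhost:19000",
}
print(s3_settings_from_env(example_env)["endpoint_url"])
```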
A full list of supported settings and capabilities for this tap is available by running:
tap-spreadsheets --about
This Singer tap will automatically import any environment variables within the working directory's .env if --config=ENV is provided, such that config values will be considered if a matching environment variable is set either in the terminal context or in the .env file.
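For reference, the sketch below shows how a setting name could map to its environment variable. It assumes the usual Singer SDK convention of prefixing the upper-cased plugin name (dashes replaced by underscores) to the upper-cased setting name; the helper env_var_name is hypothetical, and the tap's --about output remains the reliable source for actual variable names.

```python
def env_var_name(plugin_name, setting):
    """Build the environment variable name for a setting.

    Assumed convention: PLUGIN_NAME_SETTING_NAME, upper-cased, dashes to underscores.
    """
    prefix = plugin_name.upper().replace("-", "_")
    return prefix + "_" + setting.upper().replace("-", "_")


print(env_var_name("tap-spreadsheets", "flattening_max_depth"))
# TAP_SPREADSHEETS_FLATTENING_MAX_DEPTH
```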
Install from PyPI:
Install from GitHub:
uv tool install git+https://github.com/ORG_NAME/tap-spreadsheets.git@main
You can easily run tap-spreadsheets by itself or in a pipeline using Meltano.
tap-spreadsheets --version
tap-spreadsheets --help
tap-spreadsheets --config CONFIG --discover > ./catalog.json
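The catalog written by --discover is a JSON document with a top-level streams list, following the Singer catalog format. The sketch below lists each stream and its column names. The function list_streams and the sample catalog content are hypothetical and only illustrate the shape of the file; real output depends on your configured files.

```python
import json


def list_streams(catalog):
    """Return (stream id, sorted column names) pairs from a Singer catalog dict."""
    return [
        (
            stream.get("tap_stream_id"),
            sorted(stream.get("schema", {}).get("properties", {})),
        )
        for stream in catalog.get("streams", [])
    ]


# Hypothetical catalog content, for illustration only.
sample_catalog = json.loads(
    """
    {
      "streams": [
        {
          "tap_stream_id": "csv_reports",
          "schema": {
            "type": "object",
            "properties": {
              "id": {"type": ["integer", "null"]},
              "total": {"type": ["number", "null"]}
            }
          }
        }
      ]
    }
    """
)

for stream_id, columns in list_streams(sample_catalog):
    print(stream_id, columns)
```

To inspect a real catalog, load ./catalog.json with json.load and pass the result to the same function.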
Follow these instructions to contribute to this project.
Prerequisites:
uv sync
Create tests within the tests subfolder and then run:
uv run pytest
You can also test the tap-spreadsheets CLI interface directly using uv run:
uv run tap-spreadsheets --help
Note: This tap will work in any Singer environment and does not require Meltano. Examples here are for convenience and to streamline end-to-end orchestration scenarios.
Next, install Meltano (if you haven't already) and any needed plugins:
# Install meltano
uv tool install meltano
# Initialize meltano within this directory
cd tap-spreadsheets
meltano install
Now you can test and orchestrate using Meltano:
# Test invocation:
meltano invoke tap-spreadsheets --version
# OR run a test ELT pipeline:
meltano run tap-spreadsheets target-jsonl
See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.
We found that tap-spreadsheets demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.