
Library to help with ETL using PySpark.
Sparta is a simple library to help you work on ETL builds using PySpark.
Install the latest version with pip install pysparta
The sparta.extract module provides functions for extracting and reading data.
Example
from sparta.extract import read_with_schema
schema = 'epidemiological_week LONG, date DATE, order_for_place INT, state STRING, city STRING, city_ibge_code LONG, place_type STRING, last_available_confirmed INT'
path = '/content/sample_data/covid19-e0534be4ad17411e81305aba2d9194d9.csv'
df = read_with_schema(path, schema, {'header': 'true'}, 'csv')
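For orientation, a call like the one above corresponds roughly to a plain PySpark reader chain. The sketch below shows that chain; it is not Sparta's actual implementation. The function name `read_with_schema_sketch`, its `spark` parameter, and the default format are assumptions introduced here, and the argument order simply mirrors the example.

```python
def read_with_schema_sketch(spark, path, schema, options=None, fmt="parquet"):
    """Read `path` with an explicit schema via the standard DataFrameReader API.

    Hypothetical helper for illustration only; not part of Sparta.
    `spark` is an existing SparkSession and `schema` is a DDL string
    such as 'state STRING, city STRING'.
    """
    reader = spark.read.format(fmt).schema(schema)
    if options:
        # Options such as {'header': 'true'} are passed through unchanged.
        reader = reader.options(**options)
    return reader.load(path)


# Usage (requires a running SparkSession named `spark`):
# df = read_with_schema_sketch(spark, path, schema, {'header': 'true'}, 'csv')
```

Passing the schema explicitly avoids schema inference, which for CSV files would otherwise need an extra pass over the data.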
The sparta.transformation module provides data transformation functions.
Example
from sparta.transformation import drop_duplicates
cols = ['longitude','latitude']
df = drop_duplicates(df, 'population', cols)
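The example does not spell out what each argument does. One plausible reading, and it is only an assumption, is that rows sharing the same `longitude` and `latitude` count as duplicates and the row ranked first by `population` is kept. The plain-Python sketch below illustrates that semantics on a list of dicts; the helper name `keep_top_per_key`, the sample rows, and the choice to keep the largest value are all hypothetical. In PySpark itself this pattern is commonly written with a window partitioned by the key columns and `row_number()`.

```python
def keep_top_per_key(rows, order_col, key_cols):
    """Keep one row per key, preferring the largest `order_col` value.

    Hypothetical illustration only; not Sparta's implementation.
    """
    best = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        # Replace the stored row only when this one ranks strictly higher.
        if key not in best or row[order_col] > best[key][order_col]:
            best[key] = row
    return list(best.values())


# Made-up sample rows: the first two share the same coordinates.
rows = [
    {"longitude": -46.6, "latitude": -23.5, "population": 100},
    {"longitude": -46.6, "latitude": -23.5, "population": 250},
    {"longitude": -43.2, "latitude": -22.9, "population": 80},
]
deduped = keep_top_per_key(rows, "population", ["longitude", "latitude"])
```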
The sparta.load module provides load and write functions.
Example
from sparta.load import create_hive_table
create_hive_table(df, "table_name", 5, "col1", "col2", "col3")
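The meaning of the arguments `5`, `"col1"`, `"col2"`, and `"col3"` is not stated above. If they are a bucket count and bucketing columns, which is an assumption rather than something the source confirms, a comparable write in plain PySpark might look like the sketch below. The function name `create_bucketed_table_sketch`, the Parquet format, and the overwrite mode are all hypothetical choices made for illustration.

```python
def create_bucketed_table_sketch(df, table_name, num_buckets, *bucket_cols):
    """Save `df` as a bucketed table using the standard DataFrameWriter API.

    Hypothetical helper; not Sparta's implementation. Assumes the session
    was created with Hive support if a Hive metastore table is wanted.
    """
    if not bucket_cols:
        raise ValueError("at least one bucketing column is required")
    first, rest = bucket_cols[0], bucket_cols[1:]
    (
        df.write
        .format("parquet")                    # assumed storage format
        .bucketBy(num_buckets, first, *rest)  # hash rows into num_buckets buckets
        .mode("overwrite")                    # assumed write mode
        .saveAsTable(table_name)
    )


# Usage (requires a Spark DataFrame named `df`):
# create_bucketed_table_sketch(df, "table_name", 5, "col1", "col2", "col3")
```

Bucketed output has to go through `saveAsTable`, since the bucketing metadata is stored in the table catalog.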
This is a module with several functions that can help in ETL work.
Example
from sparta.secret import get_secret_aws
get_secret_aws('Nome_Secret', 'sa-east-1')
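Judging by its name and arguments, this call appears to look up a secret in AWS Secrets Manager for a given region. A minimal sketch of such a lookup is shown below; it is not Sparta's code. The helper name `get_secret_sketch` and the choice to pass in a ready-made client are assumptions made so the example stays self-contained. The client is expected to behave like a boto3 Secrets Manager client.

```python
def get_secret_sketch(client, secret_name):
    """Return the string value of a secret from a Secrets Manager client.

    Hypothetical helper for illustration only; not part of Sparta.
    Binary secrets (returned under 'SecretBinary') are not handled here.
    """
    response = client.get_secret_value(SecretId=secret_name)
    return response["SecretString"]


# Usage (requires boto3 and valid AWS credentials):
# import boto3
# client = boto3.client("secretsmanager", region_name="sa-east-1")
# secret = get_secret_sketch(client, "Nome_Secret")
```

Secrets are often stored as JSON documents, in which case the caller can pass the returned string to `json.loads`.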
Sparta currently supports PySpark 3.0+ and Python 3.7+.