synthetic_sample is a data generation application for producing synthetic sales transactions over a time series, including associated shipment and product data.
Sample data is generated by running synthetic_sample_generator.py, using
python3 synthetic_sample_generator.py --json_filepath JSON_FILEPATH --output_directory OUTPUT_DIRECTORY --create_records
where
json_filepath: the filepath to the input JSON (see Request Requirements below)
output_directory: the directory to save output data to, in CSV format
create_records: a flag that indicates that raw record data should also be saved to the output directory. Running without this flag results in only aggregate output data.
Request Requirements
The required input format is a JSON with the following fields:
start_date: date in the first period to include, e.g. if 2020/02/15 is provided, the full week of that date will be included
end_date: date in the last period to include, e.g. if 2020/02/15 is provided, the full week of that date will be included
annual_growth_factor: year-over-year growth factor; 10% growth corresponds to a value of 1.1
period_type: indicates what type of curve to generate; supports "month" or "week"
total_sales: total number of sales for the period
total_packages: total number of packages shipped for the period
total_quantity: total number of items sold for the period
annual_sales: annualized number of sales for the period
annual_packages: annualized number of packages shipped for the period
annual_quantity: annualized number of items sold for the period
curve_definition: definition of the curve to create, either as a list of dictionaries with each feature or as a string indicating the name of the default curve to use
Each feature dictionary in curve_definition uses the following fields:
  anchor_type: type of annual anchor used to define the feature
  anchor_point: annual point to define the feature
  anchor_value: cumulative percent of total sales (0.0-1.0) completed by the end of the period of the anchor_point
  relative_start: number of periods before the anchor_point to define a relative cumulative percent value
  start_value: cumulative percent of total sales (0.0-1.0) completed by the end of the period indicated by relative_start
  relative_end: number of periods before the anchor_point to define a relative cumulative percent value
  end_value: cumulative percent of total sales (0.0-1.0) completed by the end of the period indicated by relative_end
Default curves are stored as synthetic_sample/defaults/curves/{period_type}/{curve_definition}.json:
  modern_brand
  modern_distributor
  traditional_brand
  traditional_distributor
default_type: string indicating the type of defaults to use; these can be found as JSON in synthetic_sample/defaults/lib/
product_distribution: dictionary of product labels (i.e. SKUs) and their relative weights
week_distribution: dictionary of weeks of the month (where 1 is the first week and -1 is the last) and their relative weights
weekday_distribution: dictionary of weekdays (where 0 is Monday and 6 is Sunday) and their relative weights
seasonal_distribution: dictionary of seasons ("Q1"..."Q4") and their relative weights
modifiers: list of any modifiers to apply
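Two of the field conventions above can be illustrated with a little arithmetic. The sketch below is not taken from the synthetic_sample source; it assumes that relative weights are turned into shares by dividing by their sum (the usual convention) and uses the stated definition of a cumulative percent to recover each period's own share. The function names and the sample values are hypothetical.

```python
def normalize_weights(weights):
    """Scale relative weights so that they sum to 1.0 (assumed convention)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {key: value / total for key, value in weights.items()}


def period_shares(cumulative):
    """Convert cumulative fractions, ordered by period, into per-period fractions."""
    shares = []
    previous = 0.0
    for value in cumulative:
        if value < previous:
            raise ValueError("cumulative values must be non-decreasing")
        shares.append(value - previous)
        previous = value
    return shares


# Hypothetical weekday weights: Saturday ("5") weighted twice as much as Sunday ("6").
weekday_shares = normalize_weights({"5": 2.0, "6": 1.0})

# Hypothetical cumulative curve over four periods, ending at 1.0.
quarter_shares = period_shares([0.2, 0.45, 0.7, 1.0])

print(weekday_shares)
print([round(share, 4) for share in quarter_shares])
```

Under these assumptions, a weight of 2.0 against 1.0 means the first label receives two thirds of the total, and a period's share is the difference between its cumulative value and the previous one.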
The request below generates data for each month from 2018-06 through 2020-12.
{
  "start_date": "2018-06-01",
  "end_date": "2020-12-31",
  "total_sales": 1000000,
  "total_packages": 1500000,
  "total_quantity": 6000000,
  "annual_growth_factor": 1.15,
  "product_distribution": {
    "AAA-01": 1,
    "AAA-02": 2.5,
    "AAA-11": 5.6,
    "BBB-10": 0.5,
    "BBB-20": 1
  },
  "week_distribution": {
    "1": 0.1,
    "-1": 0.5
  },
  "weekday_distribution": {
    "0": 0.0,
    "1": 0.0,
    "2": 0.0,
    "3": 0.0,
    "4": 0.0,
    "5": 2.0,
    "6": 1.0
  },
  "seasonal_distribution": {
    "Q1": 1,
    "Q2": 1,
    "Q3": 1,
    "Q4": 1
  },
  "period_type": "month",
  "curve_definition": [
    {"anchor_type": "month_of_year", "anchor_point": 1, "anchor_value": 0.0424},
    {"anchor_type": "month_of_year", "anchor_point": 2, "anchor_value": 0.103},
    {"anchor_type": "month_of_year", "anchor_point": 3, "anchor_value": 0.203},
    {"anchor_type": "month_of_year", "anchor_point": 4, "anchor_value": 0.3152},
    {"anchor_type": "month_of_year", "anchor_point": 5, "anchor_value": 0.4139},
    {"anchor_type": "month_of_year", "anchor_point": 6, "anchor_value": 0.4776},
    {"anchor_type": "month_of_year", "anchor_point": 7, "anchor_value": 0.5321},
    {"anchor_type": "month_of_year", "anchor_point": 8, "anchor_value": 0.5897},
    {"anchor_type": "month_of_year", "anchor_point": 9, "anchor_value": 0.6715},
    {"anchor_type": "month_of_year", "anchor_point": 10, "anchor_value": 0.7836},
    {"anchor_type": "month_of_year", "anchor_point": 11, "anchor_value": 0.9018},
    {"anchor_type": "month_of_year", "anchor_point": 12, "anchor_value": 1.0}
  ]
}
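A request like this can be saved to a file and passed to the generator from a script. The following is a minimal sketch, not part of the package: it only assembles the command line from the three options documented above. The helper name build_command, the temporary paths, and the truncated request dictionary are hypothetical, and whether the output directory must already exist is an assumption.

```python
import json
import shlex
import tempfile
from pathlib import Path


def build_command(json_filepath, output_directory, create_records=False):
    """Assemble the documented command line as an argument list."""
    command = [
        "python3",
        "synthetic_sample_generator.py",
        "--json_filepath", str(json_filepath),
        "--output_directory", str(output_directory),
    ]
    if create_records:
        # Optional flag: also save raw record data, not only aggregates.
        command.append("--create_records")
    return command


# Truncated for brevity; substitute the full request in practice.
request = {
    "start_date": "2018-06-01",
    "end_date": "2020-12-31",
    "period_type": "month",
}

workdir = Path(tempfile.mkdtemp())
request_path = workdir / "request.json"
request_path.write_text(json.dumps(request, indent=2))

output_dir = workdir / "output"
output_dir.mkdir()  # assumption: created up front in case the script expects it

command = build_command(request_path, output_dir, create_records=True)
print(shlex.join(command))
```

To actually run the generator, the list can be handed to subprocess.run(command, check=True) from the directory that contains synthetic_sample_generator.py. Passing the arguments as a list avoids shell quoting problems when paths contain spaces.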
FAQs
A generator for synthetic sales data
We found that synthetic-sample demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.