
S3 Concat concatenates many small files in an S3 bucket into fewer, larger files.
pip install s3-concat
$ s3-concat -h
from s3_concat import S3Concat
bucket = "YOUR_BUCKET_NAME"
path_to_concat = "PATH_TO_FILES_TO_CONCAT"
concatenated_file = "FILE_TO_SAVE_TO.json"
# Setting this to a size will always add a part number at the end of the file name
min_file_size = "50MB" # ex: FILE_TO_SAVE_TO-1.json, FILE_TO_SAVE_TO-2.json, ...
# Setting this to None will concat all files into a single file
# min_file_size = None ex: FILE_TO_SAVE_TO.json
# Init the job
job = S3Concat(bucket, concatenated_file, min_file_size,
               content_type="application/json",
               # source_bucket="SOURCE_BUCKET_NAME",  # For copying files from another bucket
               # session=boto3.session.Session(),  # For a custom AWS session (requires `import boto3`)
               # s3_client_kwargs={},  # Use to pass arguments allowed by the s3 client: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
               # delimiter="\n",  # Inserted between each file when concatenating. Warning: this requires downloading all files, no matter their size, to add the delimiter
               )
# Add files, can call multiple times to add files from other directories
job.add_files(path_to_concat)
# Add a single file at a time
job.add_file("some/file_key.json")
# Only use small_parts_threads if you need to. See Advanced Usage section below.
job.concat(small_parts_threads=4, main_threads=2)
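The effect of min_file_size on output naming, as described in the comments above, can be sketched in plain Python. This is only an illustration, not the library's code: parse_size and plan_outputs are hypothetical helpers, binary units are assumed for "MB", and the grouping rule (close a group once its running total reaches the minimum) is an assumption.

```python
import os
import re

# Assumed binary units; the library may interpret sizes differently.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_size(text):
    """Convert a string such as "50MB" to a byte count (hypothetical helper)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMG]?B)\s*", text.upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


def plan_outputs(files, result_key, min_file_size=None):
    """Group (key, size) pairs into output files (hypothetical helper).

    With min_file_size=None everything goes into result_key itself.
    Otherwise each output gets a part number before the extension.
    """
    if min_file_size is None:
        return [(result_key, [key for key, _ in files])]

    threshold = parse_size(min_file_size)
    base, ext = os.path.splitext(result_key)
    groups, current, total = [], [], 0
    for key, size in files:
        current.append(key)
        total += size
        if total >= threshold:  # group is large enough, start a new one
            groups.append(current)
            current, total = [], 0
    if current:  # leftover files form a final, possibly smaller, group
        groups.append(current)
    return [(f"{base}-{i}{ext}", keys) for i, keys in enumerate(groups, start=1)]
```

For example, calling plan_outputs with three files of 30MB, 30MB and 10MB and a "50MB" minimum yields two outputs under this assumed rule: the first holds the two 30MB files and the second holds the remaining 10MB file.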
Advanced Usage
Depending on your use case, you may want to use more than one thread.
main_threads is the number of threads used when uploading files to S3. This helps when many of the files are already larger than the min_file_size that is set.
small_parts_threads is only used when the files you are concatenating are smaller than 5MB. These threads are spawned from inside the main_threads. Due to the limitations of the S3 multipart upload API (see Limitations below), any file smaller than 5MB must be downloaded locally, concatenated with the others, and then re-uploaded. Setting this thread count downloads those parts in parallel, which speeds up the concatenation.
The right values for these arguments depend on your use case and the system you are running on.
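The 5MB rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: split_by_part_size and concat_small_parts are hypothetical names, and fetch stands in for whatever downloads an object (with boto3 it could wrap a get_object call).

```python
from concurrent.futures import ThreadPoolExecutor

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 multipart minimum part size (5 MiB)


def split_by_part_size(files):
    """Split (key, size) pairs into parts usable directly and parts to download."""
    copyable = [f for f in files if f[1] >= MIN_PART_SIZE]
    small = [f for f in files if f[1] < MIN_PART_SIZE]
    return copyable, small


def concat_small_parts(keys, fetch, threads=4):
    """Download small objects in parallel and join them in their original order.

    `fetch` is a hypothetical callable mapping a key to its bytes.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map preserves input order, so the result order is stable
        return b"".join(pool.map(fetch, keys))
```

The threads argument here plays the role that small_parts_threads plays in the text: it only matters for objects below the minimum part size, since larger ones do not need to be downloaded.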
Limitations
This uses the S3 multipart upload API, so that API's limits apply: https://docs.aws.amazon.com/AmazonS3/latest/dev/qfacts.html
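As a rough illustration of what those limits mean for a concatenation job, the sketch below checks a planned list of part sizes against the documented S3 multipart quotas (at most 10,000 parts, each part between 5 MiB and 5 GiB except the last, and a 5 TiB maximum object size). check_multipart_plan is a hypothetical helper and is not part of s3-concat.

```python
MIN_PART = 5 * 1024**2    # 5 MiB, minimum for every part except the last
MAX_PART = 5 * 1024**3    # 5 GiB, maximum size of a single part
MAX_PARTS = 10_000        # maximum number of parts per upload
MAX_OBJECT = 5 * 1024**4  # 5 TiB, maximum size of the final object


def check_multipart_plan(part_sizes):
    """Return a list of problems with a planned upload; an empty list means none found."""
    problems = []
    if not part_sizes:
        problems.append("no parts")
    if len(part_sizes) > MAX_PARTS:
        problems.append(f"too many parts: {len(part_sizes)}")
    for number, size in enumerate(part_sizes, start=1):
        is_last = number == len(part_sizes)
        if size < MIN_PART and not is_last:
            problems.append(f"part {number} is below the minimum part size")
        if size > MAX_PART:
            problems.append(f"part {number} is above the maximum part size")
    if sum(part_sizes) > MAX_OBJECT:
        problems.append("object exceeds the maximum object size")
    return problems
```

A plan such as [MIN_PART, 100] passes because only the last part may be undersized, while [100, MIN_PART] is rejected. This is why small files have to be merged locally before they can serve as parts.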