aws-dynamodb-parallel-scan
Amazon DynamoDB parallel scan paginator for boto3.
Installation
Install from PyPI with pip
pip install aws-dynamodb-parallel-scan
or with the package manager of choice.
Usage
The library is a drop-in replacement for the boto3 DynamoDB Scan paginator. Example:
import aws_dynamodb_parallel_scan
import boto3
client = boto3.resource("dynamodb").meta.client
paginator = aws_dynamodb_parallel_scan.get_paginator(client)
for page in paginator.paginate(TableName="mytable", TotalSegments=5):
    items = page.get("Items", [])
Notes:
- paginate() accepts the same arguments as the boto3 DynamoDB.Client.scan() method. Arguments are passed to DynamoDB.Client.scan() as-is.
- paginate() uses the value of the TotalSegments argument as the parallelism level. Each segment is scanned in parallel in a separate thread.
- paginate() yields DynamoDB Scan API responses in the same format as the boto3 DynamoDB.Client.scan() method.
See boto3 DynamoDB.Client.scan() documentation
for details on supported arguments and the response format.
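To illustrate the notes above, here is a minimal sketch of the general parallel scan technique: one thread per segment, each following LastEvaluatedKey until its segment is exhausted. It is not this library's implementation. The function names scan_segment and parallel_scan are hypothetical, and client is assumed to be any object with a scan() method that behaves like boto3 DynamoDB.Client.scan().

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(client, segment, total_segments, **scan_kwargs):
    """Collect every page of one segment (hypothetical helper)."""
    pages = []
    # Segment and TotalSegments are the Scan API's parallel scan parameters.
    kwargs = dict(scan_kwargs, Segment=segment, TotalSegments=total_segments)
    while True:
        page = client.scan(**kwargs)
        pages.append(page)
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means this segment has no more pages.
            return pages
        kwargs["ExclusiveStartKey"] = last_key


def parallel_scan(client, total_segments, **scan_kwargs):
    """Scan all segments concurrently, one worker thread per segment."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, client, segment, total_segments, **scan_kwargs)
            for segment in range(total_segments)
        ]
        # Pages are returned grouped by segment, in segment order.
        return [page for future in futures for page in future.result()]
```

With a real boto3 client this could be called as parallel_scan(client, 5, TableName="mytable"). Unlike paginate(), this sketch buffers all pages in memory before returning them, which keeps the example short but is unsuitable for large tables.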
CLI
This package also provides a CLI tool (aws-dynamodb-parallel-scan) to scan a DynamoDB table with parallel scan. The tool supports all non-deprecated arguments of the DynamoDB Scan API. Run aws-dynamodb-parallel-scan -h for details.
Here are some examples:
$ aws-dynamodb-parallel-scan --table-name mytable
{"Items": [...], "Count": 10256, "ScannedCount": 10256, "ResponseMetadata": {}}
{"Items": [...], "Count": 12, "ScannedCount": 12, "ResponseMetadata": {}}
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5
{"Items": [...], "Count":32, "ScannedCount":32, "ResponseMetadata": {}}
{"Items": [...], "Count":47, "ScannedCount":47, "ResponseMetadata": {}}
{"Items": [...], "Count":52, "ScannedCount":52, "ResponseMetadata": {}}
{"Items": [...], "Count":34, "ScannedCount":34, "ResponseMetadata": {}}
{"Items": [...], "Count":40, "ScannedCount":40, "ResponseMetadata": {}}
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--output-items
{"pk": {"S": "item1"}, "quantity": {"N": "99"}}
{"pk": {"S": "item24"}, "quantity": {"N": "25"}}
...
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--output-items --use-document-client
{"pk": "item1", "quantity": 99}
{"pk": "item24", "quantity": 25}
...
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--filter-expression "quantity < :value" \
--expression-attribute-values '{":value": {"N": "5"}}' \
--output-items
{"pk": {"S": "item142"}, "quantity": {"N": "4"}}
{"pk": {"S": "item874"}, "quantity": {"N": "1"}}
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--filter-expression "quantity < :value" \
--expression-attribute-values '{":value": 5}' \
--use-document-client --output-items
{"pk": "item142", "quantity": 4}
{"pk": "item874", "quantity": 1}
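The examples above suggest that, without --output-items, the tool prints one JSON document per Scan response, one per line. Assuming that format holds, the output can be post-processed with a few lines of Python. The sketch below totals Count and ScannedCount across pages; summarize_pages is a hypothetical helper, not part of this package.

```python
import json


def summarize_pages(lines):
    """Sum Count and ScannedCount over scan responses given as JSON lines."""
    totals = {"Count": 0, "ScannedCount": 0, "Pages": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between responses
        page = json.loads(line)
        totals["Count"] += page.get("Count", 0)
        totals["ScannedCount"] += page.get("ScannedCount", 0)
        totals["Pages"] += 1
    return totals
```

Because the function accepts any iterable of strings, it could be fed sys.stdin when the CLI output is piped into a script.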
Development
Requires Python 3 and uv. Useful commands:
make test
make -k lint
make format
License
MIT
Credits