Changelog
[v12.0.0] 2022-05-20
CUMULUS-2903
cumuluss/cumulus-ecs-task Docker image must be updated to cumuluss/cumulus-ecs-task:1.8.0. This can be done by updating the image property of any tasks defined using the cumulus_ecs_service Terraform module.
CUMULUS-2932
SyncGranule task to include disableOrDefaultAcl function that uses the configuration ACL parameter to set ACL to private by default or disable ACL.
@cumulus/sync-granule download() function to take in ACL parameter
@cumulus/ingest proceed() function to take in ACL parameter
@cumulus/ingest addLock() function to take in an optional ACL parameter
SyncGranule example workflow config example/cumulus-tf/sync_granule_workflow.asl.json to include ACL parameter.
[v11.1.1] 2022-04-26
@cumulus/aws-client to use new AWS SDK v3 packages for S3 requests:
@aws-sdk/client-s3
@aws-sdk/lib-storage
@aws-sdk/s3-request-presigner
@cumulus/aws-client and AWS SDK v3 S3 packages:
@cumulus/api
@cumulus/async-operations
@cumulus/cmrjs
@cumulus/common
@cumulus/collection-config-store
@cumulus/ingest
@cumulus/launchpad-auth
@cumulus/sftp-client
@cumulus/tf-inventory
lambdas/data-migration2
tasks/add-missing-file-checksums
tasks/hyrax-metadata-updates
tasks/lzards-backup
tasks/sync-granule
@cumulus/aws-client to use new AWS SDK v3 packages for API Gateway requests:
@aws-sdk/client-api-gateway
@cumulus/example-lib package to example project to allow unit tests example/script/lib dependency.
audit-ci v6.2.0
[v11.1.0] 2022-04-07
11.1.0 is an amendment release and supersedes 11.0.0. However, follow the migration steps for 11.0.0.
CUMULUS-2905
migrateAndOverwrite and migrateOnlyFiles options.
skipMetadataValidation to hyrax-metadata-updates task
last_modified_date as output to all tasks in Terraform ingest module.
deployment/choosing_configuring_rds.
ORCA Backup reconciliation report to report cumulusFilesCount and orcaFilesCount
@cumulus/aws-client to use new AWS SDK v3 packages for DynamoDB requests:
@aws-sdk/client-dynamodb
@aws-sdk/lib-dynamodb
@aws-sdk/util-dynamodb
@cumulus/api
@cumulus/errors
@cumulus/tf-inventory
lambdas/data-migration2
packages/api/ecs/async-operation
@cumulus/cmr-client/ingestUMMGranule and @cumulus/cmr-client/ingestConcept functions to not perform separate validation request
hello_world_service module to pass in lastModified parameter in command list to trigger a Terraform state change when the hello_world_task is modified.
@cumulus/aws-client
[v11.0.0] 2022-03-24 [STABLE]
Release v11.0 is a maintenance release series, replacing v9.9. If you are upgrading from v9.9.x to v11 or later, please pay attention to the following migration notes from prior releases:
data-persistence module, but before deploying the main cumulus module
/rules/<name> endpoint, the rule records in PostgreSQL may be out of sync with records in DynamoDB. In order to bring the records into sync, re-deploy and re-run the data-migration1 Lambda with a payload of {"forceRulesMigration": true}:
aws lambda invoke --function-name $PREFIX-data-migration1 \
--payload $(echo '{"forceRulesMigration": true}' | base64) $OUTFILE
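Before invoking the Lambda, it can help to confirm that the encoded payload decodes back to the intended JSON. The following is a minimal sketch, not part of the official migration steps: the variable names PAYLOAD_JSON and DECODED are illustrative, and it assumes a base64 utility that accepts -d for decoding (as GNU coreutils does). The aws call is left commented out so the check can be run on its own.

```shell
# JSON payload for the rules re-migration, as given in the note above.
PAYLOAD_JSON='{"forceRulesMigration": true}'

# printf avoids the trailing newline that echo would append to the payload.
PAYLOAD=$(printf '%s' "$PAYLOAD_JSON" | base64)

# Decode the value again to confirm the round trip before using it.
DECODED=$(printf '%s' "$PAYLOAD" | base64 -d)
echo "$DECODED"

# Once the payload looks right, pass it to the Lambda (quoted):
# aws lambda invoke --function-name "$PREFIX-data-migration1" \
#   --payload "$PAYLOAD" "$OUTFILE"
```

AWS CLI v2 expects blob parameters such as --payload to be base64-encoded by default, which is consistent with the encoding step used in the command above.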
cumulus deployment
task-config for all workflows that use the sync-granule task to include workflowStartTime set to {$.cumulus_meta.workflow_start_time}. See here for an example.
cumulus deployment
As part of the work on the RDS Phase 2 feature, it was decided to re-add the granule file type property on the file table (detailed reasoning: https://wiki.earthdata.nasa.gov/pages/viewpage.action?pageId=219186829). This change was implemented as part of CUMULUS-2672/CUMULUS-2673; however, granule records ingested prior to v11 will not have the file.type property stored in the PostgreSQL database, and on installation of v11, API calls to get granule.files will not return this value. We anticipate most users are impacted by this issue.
Users that are impacted by these changes should re-run the granule migration lambda to only migrate granule file records:
PAYLOAD=$(echo '{"migrationsList": ["granules"], "granuleMigrationParams": {"migrateOnlyFiles": "true"}}' | base64)
aws lambda invoke --function-name $PREFIX-postgres-migration-async-operation \
--payload $PAYLOAD $OUTFILE
Note that this will only move files for granule records in PostgreSQL. If you have not completed the phase 1 data migration, or have granule records in DynamoDB that are not in PostgreSQL, the migration will report failure for both the DynamoDB granule and all of its associated files, and the file records will not be updated.
If you prefer to do a full granule and file migration, you may opt to run
the migration with the migrateAndOverwrite option instead. This will re-run a
full granule/files migration and overwrite all values in the PostgreSQL database with
what is in DynamoDB for both granules and associated files:
PAYLOAD=$(echo '{"migrationsList": ["granules"], "granuleMigrationParams": {"migrateAndOverwrite": "true"}}' | base64)
aws lambda invoke --function-name $PREFIX-postgres-migration-async-operation \
--payload $PAYLOAD $OUTFILE
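The two granule migration payloads above differ only in the option name, so they can be generated from one helper. This is a hedged sketch: build_granule_migration_payload is a hypothetical helper name, not part of Cumulus, and it accepts only the two options named in this changelog. It also strips newlines from the encoded value, because GNU base64 wraps its output at 76 characters by default and a wrapped value would be split into several arguments if expanded unquoted.

```shell
# Hypothetical helper: print the base64-encoded payload for one migration mode.
build_granule_migration_payload() {
  mode="$1"
  case "$mode" in
    migrateOnlyFiles|migrateAndOverwrite) ;;
    *) echo "unknown migration mode: $mode" >&2; return 1 ;;
  esac
  # tr removes the line wrapping that some base64 implementations add.
  printf '{"migrationsList": ["granules"], "granuleMigrationParams": {"%s": "true"}}' "$mode" \
    | base64 | tr -d '\n'
}

PAYLOAD=$(build_granule_migration_payload migrateOnlyFiles)

# Show the JSON that the Lambda would receive.
printf '%s' "$PAYLOAD" | base64 -d
echo

# aws lambda invoke --function-name "$PREFIX-postgres-migration-async-operation" \
#   --payload "$PAYLOAD" "$OUTFILE"
```

Quoting "$PAYLOAD" and "$OUTFILE" in the final command keeps the shell from word-splitting either value.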
Please note: Since this data migration is copying all of your granule data
from DynamoDB to PostgreSQL, it can take multiple hours (or even days) to run,
depending on how much data you have and how much parallelism you configure the
migration to use. In general, the more parallelism you configure the migration
to use, the faster it will go, but the higher load it will put on your
PostgreSQL database. Excessive database load can cause database outages and
result in data loss/recovery scenarios. Thus, the parallelism settings for the
migration are intentionally set by default to conservative values but are
configurable. If this impacts only some of your data products you may want
to consider using other granuleMigrationParams.
Please see the second data migration docs for more on this tool if you are unfamiliar with the various options.
ORCA Backup is now a supported reportType for the POST /reconciliationReports endpoint
@cumulus/message/utils.parseException to parse exception objects
@cumulus/message/Granules:
getGranuleProductVolume
getGranuleTimeToPreprocess
getGranuleTimeToArchive
generateGranuleApiRecord
@cumulus/message/PDRs/generatePdrApiRecordFromMessage to generate PDR from Cumulus workflow message
@cumulus/es-client/indexer:
deleteAsyncOperation to delete async operation records from Elasticsearch
updateAsyncOperation to update an async operation record in Elasticsearch
PUT endpoint to Cumulus API for updating a granule. Requests to this endpoint should be submitted without an action attribute in the request body.
@cumulus/api-client/granules.updateGranule to update granule via the API
@cumulus/db/translate/providers
@cumulus/db/translate/collections
searchWithUpdatedAtRange method to @cumulus/db/models/collections
deletePdr to @cumulus/api-client/pdrs
move action to update granules in the index and utilize postgres as the authoritative datastore
@cumulus/db/base.deleteExcluding method to allow for deletion of a record set with an exclusion list of cumulus_ids
@cumulus/db/getFilesAndGranuleInfoQuery() to build a query for searching file records in PostgreSQL and return specified granule information for each file
@cumulus/db/QuerySearchClient library to handle sequentially fetching and paging through results for an arbitrary PostgreSQL query
insert method to all @cumulus/db models to handle inserting multiple records into the database at once
@cumulus/db/translatePostgresGranuleResultToApiGranule helper to translate custom PostgreSQL granule result to API granule
type text column to Postgres database files table
@cumulus/es-client/indexer.upsertExecution to upsert an execution
@cumulus/es-client/indexer.upsertPdr to upsert a PDR
@cumulus/es-client/indexer.upsertGranule to upsert a granule
execution_sns_topic_arn environment variable to sf_event_sqs_to_db_records lambda TF definition.
sf_event_sqs_to_db_records_lambda IAM policy to include permissions for SNS publish for report_executions_topic
collection_sns_topic_arn environment variable to PrivateApiLambda and ApiEndpoints lambdas.
updateCollection to @cumulus/api-client.
ecs_cluster IAM policy to include permissions for SNS publish for report_executions_sns_topic_arn, report_pdrs_sns_topic_arn, report_granules_sns_topic_arn
process_dead_letter_archive.tf
bulk_operation.tf
pdr_sns_topic_arn environment variable to sf_event_sqs_to_db_records lambda TF definition.
publishSnsMessageByDataType in @cumulus/api to publish SNS messages to the report topics to PDRs, Collections, and Executions.
publishSnsMessageUtils to handle publishing SNS messages for specific data and event types:
publishCollectionUpdateSnsMessage
publishCollectionCreateSnsMessage
publishCollectionDeleteSnsMessage
publishGranuleUpdateSnsMessage
publishGranuleDeleteSnsMessage
publishGranuleCreateSnsMessage
publishExecutionSnsMessage
publishPdrSnsMessage
publishGranuleSnsMessageByEventType
ecs_cluster IAM policy to include permissions for SNS publish for report_executions_topic and report_pdrs_topic.
paginateByCumulusId to @cumulus/db BasePgModel to allow for paginated full-table select queries in support of elasticsearch indexing.
getMaxCumulusId to @cumulus/db BasePgModel to allow all derived table classes to support querying the current max cumulus_id.
ES_HOST environment variable to postgres-migration-async-operation Lambda using value of elasticsearch_hostname Terraform variable.
elasticsearch_security_group_id to security groups for postgres-migration-async-operation lambda.
DynamoDb:DeleteItem to postgres-migration-async-operation lambda.
async_operation_image in tf-modules/cumulus/variables.tf to cumuluss/async-operation:41
ES_HOST environment variable to async operation ECS task definition to ensure that async operation tasks write to the correct Elasticsearch domain
@cumulus/api/lambdas/reports/orca-backup-reconciliation-report to create ORCA Backup reconciliation report
dbIndexer Lambda for DynamoDB tables:
<prefix>-AsyncOperationsTable
<prefix>-CollectionsTable
<prefix>-ExecutionsTable
<prefix>-GranulesTable
<prefix>-PdrsTable
<prefix>-ProvidersTable
<prefix>-RulesTable
@ingest/granule.moveGranuleFiles
waitForModelStatus from example/spec/helpers/apiUtils integration test helpers
stream_enabled and stream_view_type from executions_table TF definition.
aws_lambda_event_source_mapping TF definition on executions DynamoDB table.
stream_enabled and stream_view_type from collections_table TF definition.
aws_lambda_event_source_mapping TF definition on collections DynamoDB table.
publish_collections TF resource.
aws_lambda_event_source_mapping TF definition on granules
stream_enabled and stream_view_type from pdrs_table TF definition.
aws_lambda_event_source_mapping TF definition on PDRs DynamoDB table.
@cumulus/api/models/granules.storeGranulesFromCumulusMessage() method
addToLocalES in POST /granules endpoint since it is redundant.
addToLocalES in POST and PUT /executions endpoints since it is redundant.
addToLocalES from es-client package since it is no longer used.
_updateGranuleStatus to update granule to "running" from @cumulus/api/lib/ingest.reingestGranule and @cumulus/api/lib/ingest.applyWorkflow
@cumulus/common to address CVE-2022-2477
CUMULUS-2641
@cumulus/message package to set productVolume as string (calculated with file.size as a BigInt) to match API schema
@cumulus/db granule translation to translate granule objects to match the updated API schema
CUMULUS-2714
CUMULUS-2672
data-migration2 lambda to migrate Dynamo granule.files[].type instead of dropping it.
@cumulus/db translateApiFiletoPostgresFile to retain type
@cumulus/db translatePostgresFileToApiFile to retain type
@cumulus/types.api.file to add type to the typing.
CUMULUS-2315
index-from-database lambda/ECS task and elasticsearch endpoint to read from PostgreSQL database
index-from-database endpoint to add the following configuration tuning parameters:
CUMULUS-2308
/granules/<granule_id> GET endpoint to return PostgreSQL Granules instead of DynamoDB Granules
/granules/<granule_id> PUT endpoint to use PostgreSQL Granule as source rather than DynamoDB Granule
unpublishGranule (used in /granules PUT) to use PostgreSQL Granule as source rather than DynamoDB Granule
waitForApiStatus instead of waitForModelStatus
CUMULUS-2302
reingest endpoint to read collection from PostgreSQL database instead of DynamoDB
@cumulus/api unit helper from ./lib to .test/helpers
CUMULUS-2208
@cumulus/api/es/* code to new @cumulus/es-client package
sfEventSqsToDbRecords Lambda now writes following data directly to Elasticsearch in parallel with writes to DynamoDB/PostgreSQL:
packages/api/lib/granules.getGranuleProductVolume -> @cumulus/message/Granules.getGranuleProductVolume
packages/api/lib/granules.getGranuleTimeToPreprocess -> @cumulus/message/Granules.getGranuleTimeToPreprocess
packages/api/lib/granules.getGranuleTimeToArchive -> @cumulus/message/Granules.getGranuleTimeToArchive
packages/api/models/Granule.generateGranuleRecord -> @cumulus/message/Granules.generateGranuleApiRecord
CUMULUS-2306
(api/bin/serve.js) setup code to add cleanup/executions related records
waitForModelStatus
CUMULUS-2303
CUMULUS-2301
getAsyncOperation to read from PostgreSQL database instead of DynamoDB.
translatePostgresAsyncOperationToApiAsyncOperation function in @cumulus/db/translate/async-operation.
translateApiAsyncOperationToPostgresAsyncOperation function to ensure that output is properly translated to an object for the PostgreSQL record for the following cases of output on the incoming API record:
record.output is a JSON stringified object
record.output is a JSON stringified array
record.output is a JSON stringified string
record.output is a string
CUMULUS-2317
CUMULUS-2304
CUMULUS-2634
sfEventSqsToDbRecords Lambda to use new upsert helpers for executions, granules, and PDRs to ensure out-of-order writes are handled correctly when writing to Elasticsearch
CUMULUS-2510
@cumulus/api/lib/writeRecords/write-execution to publish SNS messages after a successful write to Postgres, DynamoDB, and ES.
create and upsert in the db model for Executions to return an array of objects containing all columns of the created or updated records.
@cumulus/api/endpoints/collections to publish an SNS message after a successful collection delete, update (PUT), create (POST).
create and upsert in the db model for Collections to return an array of objects containing all columns for the created or updated records.
create and upsert in the db model for Granules to return an array of objects containing all columns for the created or updated records.
@cumulus/api/lib/writeRecords/write-granules to publish SNS messages after a successful write to Postgres, DynamoDB, and ES.
@cumulus/api/lib/writeRecords/write-pdr to publish SNS messages after a successful write to Postgres, DynamoDB, and ES.
CUMULUS-2733
_writeGranuleFiles function creates an aggregate error which contains the workflow error, if any, as well as any error that may occur from writing granule files.
CUMULUS-2674
DELETE endpoints for the following data types to check that record exists in PostgreSQL or Elasticsearch before proceeding with deletion:
provider
async operations
collections
granules
executions
PDRs
rules
CUMULUS-2294
CUMULUS-2642
collectionIds, granuleIds, and providers to allow targeting/filtering of the results.
CUMULUS-2694
sfEventSqsToDbRecords to log message if Cumulus workflow message is from pre-RDS deployment but still attempt parallel writing to DynamoDB and PostgreSQL
sfEventSqsToDbRecords to throw error if requirements to write execution to PostgreSQL cannot be met
CUMULUS-2660
/executions endpoint to publish SNS message of created record to executions SNS topic
CUMULUS-2661
/executions/<arn> endpoint to publish SNS message of updated record to executions SNS topic
CUMULUS-2765
updateGranuleStatusToQueued in write-granules to write to Elasticsearch and publish SNS message to granules topic.
CUMULUS-2774
constructGranuleSnsMessage and constructCollectionSnsMessage to throw error if eventType is invalid or undefined.
CUMULUS-2776
getTableIndexDetails in db-indexer to use correct deleteFnName for reconciliation reports.
CUMULUS-2780
CUMULUS-2778
async_operation_image in tf-modules/cumulus/variables.tf to cumuluss/async-operation:38
CUMULUS-2854
createRuleTrigger from create.
rulesModel.createRuleTrigger directly to create rule trigger.
rulesModel.createRuleTrigger if update fails and reversion needs to occur.
GET /executions/status response
[v10.1.2] 2022-03-11
postgres-db-migration lambda timeout to default 900 seconds
db_migration_lambda_timeout variable to data-persistence module to allow this timeout to be user configurable
iam:PassRole permission to step_policy in tf-modules/ingest/iam.tf
[v10.1.1] 2022-03-04
/rules/<name> endpoint, the rule records in PostgreSQL may be out of sync with records in DynamoDB. In order to bring the records into sync, re-run the previously deployed data-migration1 Lambda with a payload of {"forceRulesMigration": true}:
aws lambda invoke --function-name $PREFIX-data-migration1 \
--payload $(echo '{"forceRulesMigration": true}' | base64) $OUTFILE
CUMULUS-2841
CUMULUS-2846
@cumulus/db/translate/rule.translateApiRuleToPostgresRuleRaw to translate API rule to PostgreSQL rules and keep undefined fields
CUMULUS-NONE
CUMULUS-2845
createRuleTrigger from create.
rulesModel.createRuleTrigger directly to create rule trigger.
rulesModel.createRuleTrigger if update fails and reversion needs to occur.
CUMULUS-2846
localstack/localstack used in local unit testing to 0.11.5
/rules endpoint causing rule records to be created inconsistently in DynamoDB and PostgreSQL
PUT /rules/<name> endpoint causing rules to be saved inconsistently between DynamoDB and PostgreSQL
[v10.1.0] 2022-02-23
tf-modules/rds-cluster-tf. The allowed parameters for the parameter group can be found in the AWS documentation of allowed parameters for an Aurora PostgreSQL cluster. By default, the following parameters are specified:
shared_preload_libraries: pg_stat_statements,auto_explain
log_min_duration_statement: 250
auto_explain.log_min_duration: 250
granule_cumulus_id to the RDS files table.
shortName____
timeout_action to ForceApplyCapacityChange by default for the RDS serverless database cluster tf-modules/rds-cluster-tf
<prefix>-steprole in tf-modules/ingest/iam.tf to address the Error: error creating Step Function State Machine (xxx): AccessDeniedException: 'arn:aws:iam::XXX:role/xxx-steprole' is not authorized to create managed-rule error in non-NGAP accounts:
events:PutTargets
events:PutRule
events:DescribeRule