
My DynamoDB table has around 100 million items (about 30 GB), and I provisioned it with 10k RCUs. I'm using a Data Pipeline job to export the data.

The Data Pipeline Read Throughput Ratio is set to 0.9.

How do I calculate the time for the export to complete? (The pipeline is taking more than 4 hours to finish the export.)

How can I optimize this so that the export completes in less time?

How does the Read Throughput Ratio relate to DynamoDB export?

  • If you have point in time recovery activated, there is a much easier solution available now, see the [news blog](https://aws.amazon.com/blogs/aws/new-export-amazon-dynamodb-table-data-to-data-lake-amazon-s3/) – Maurice Mar 09 '21 at 18:14

1 Answer


The answer to this question addresses most of your points about estimating how long the Data Pipeline job will take. In short, the Read Throughput Ratio is the fraction of the table's provisioned read capacity that the pipeline's scan is allowed to consume, so with 10k RCUs and a ratio of 0.9 the export can use up to 9,000 RCUs.
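As a rough back-of-the-envelope estimate (a sketch only; it assumes the export runs eventually consistent scans at the full allowed throughput and ignores EMR cluster startup and task-scheduling overhead, which is usually where the extra hours come from):

```python
# Back-of-the-envelope export-time estimate for the numbers in the question.
TABLE_SIZE_GB = 30
PROVISIONED_RCU = 10_000
THROUGHPUT_RATIO = 0.9              # Data Pipeline "Read Throughput Ratio"

effective_rcu = PROVISIONED_RCU * THROUGHPUT_RATIO   # RCUs the job may consume

# An eventually consistent read (what Scan uses by default) gets
# two 4 KB chunks per RCU per second, i.e. 8 KB/s per RCU.
kb_per_second = effective_rcu * 2 * 4

table_size_kb = TABLE_SIZE_GB * 1024 * 1024
seconds = table_size_kb / kb_per_second

print(f"Theoretical floor: {seconds / 60:.1f} minutes")
# ~7.3 minutes. The gap between this floor and the observed 4+ hours
# is EMR startup, scheduling, and limited mapper parallelism, not RCUs.
```

Since the RCU-based floor is minutes while the job takes hours, throwing more RCUs (or a higher ratio) at it will not speed things up much; the EMR side is the bottleneck.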

There is now a much better way to export data from DynamoDB to S3, announced in November 2020: DynamoDB can export a table to S3 directly, without provisioning an EMR cluster or tons of RCUs. The export reads from the table's point-in-time-recovery backup, so it consumes no read capacity at all (it does require point-in-time recovery to be enabled, as the comment above mentions).

Check out the documentation for: Exporting DynamoDB table data to Amazon S3
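If you want to kick the export off programmatically rather than from the console, a minimal sketch with boto3 could look like this (the table ARN, bucket, and prefix below are placeholders, and the table must have point-in-time recovery enabled):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Native export to S3: no EMR cluster, and it reads from the PITR
# backup, so it consumes none of the table's provisioned RCUs.
response = dynamodb.export_table_to_point_in_time(
    TableArn="arn:aws:dynamodb:us-east-1:123456789012:table/my-table",  # placeholder
    S3Bucket="my-export-bucket",      # placeholder
    S3Prefix="dynamodb-exports/",     # placeholder
    ExportFormat="DYNAMODB_JSON",     # or "ION"
)

print(response["ExportDescription"]["ExportArn"])
```

The call is asynchronous; you can poll `describe_export` with the returned ARN until the export status is `COMPLETED`.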
