This is a quick note about the advantage of using DynamoDB's newly introduced BatchWriteItem functionality, which lets you write multiple items to a table in a single request, with the write operation parallelized behind the scenes by DynamoDB. Currently a single batch is limited to 25 items, whether they are being written or deleted.
I was glad to see that the boto library already supports this new feature -- the fact that Mitch Garnaat is now an employee of Amazon probably helps too ;-) You do have to git pull the latest boto code from GitHub, since BatchWriteItem is not available in the latest boto release 2.3.0.
I tested this feature inside a script that parses mail logs and uploads the lines matching certain regular expressions as items to a DynamoDB table. With the standard item-at-a-time method, it took 7 hours to write 2 million items into the table. With BatchWriteItem, it took only 26 minutes, roughly a 16x improvement.
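For the curious, here is the arithmetic behind that number as a small sketch. It uses only the figures quoted above (2 million items, 7 hours, 26 minutes); the variable names are mine, and the per-second rates are derived values, not something I measured separately.

```python
# Figures quoted in the post; everything else is derived from them.
items = 2000000
single_seconds = 7 * 60 * 60   # item-at-a-time run: 7 hours
batch_seconds = 26 * 60        # BatchWriteItem run: 26 minutes

# float() keeps the division exact under both Python 2 and Python 3.
speedup = single_seconds / float(batch_seconds)
single_rate = items / float(single_seconds)   # items written per second
batch_rate = items / float(batch_seconds)

print("speedup: %.1fx" % speedup)
print("item-at-a-time: %.0f items/s" % single_rate)
print("batch: %.0f items/s" % batch_rate)
```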
Here's how I used this new functionality with boto:
1) I created a DynamoDB connection object and a table object:
import boto
dynamodb_conn = boto.connect_dynamodb(aws_access_key_id=MY_ACCESS_KEY_ID, aws_secret_access_key=MY_SECRET_ACCESS_KEY)
mytable = dynamodb_conn.get_table('mytable')
2) I created a batch_list object:
batch_list = dynamodb_conn.new_batch_write_list()
3) I populated this object with a list of DynamoDB items:
where items is a Python list containing item objects obtained via
4) I used the batch_write_item method from boto's layer2 module to write the batch list:
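Steps 2 through 4 can be pulled together into one helper that also respects the 25-item limit. This is a minimal sketch, not the exact code from my script: the function name `write_in_batches` and its `batch_size` parameter are made up for illustration, and it assumes the boto layer2 calls `new_batch_write_list()`, `add_batch(table, puts=...)` and `batch_write_item(batch_list)` behave as they do in the development version of boto I pulled.

```python
def write_in_batches(conn, table, items, batch_size=25):
    """Write items to a DynamoDB table in chunks of at most batch_size.

    Assumptions: conn is a boto layer2 connection (boto.connect_dynamodb),
    table comes from conn.get_table(), and items is a list of item objects.
    The function name and batch_size parameter are hypothetical.
    """
    responses = []
    for start in range(0, len(items), batch_size):
        # Slice out the next chunk; the last one may be shorter.
        chunk = items[start:start + batch_size]
        # One fresh batch list per request keeps each request under the limit.
        batch_list = conn.new_batch_write_list()
        batch_list.add_batch(table, puts=chunk)
        # Keep whatever the connection returns so the caller can inspect it.
        responses.append(conn.batch_write_item(batch_list))
    return responses
```

With the objects from step 1, the call would look like `write_in_batches(dynamodb_conn, mytable, items)`, where each element of `items` is created with something like `mytable.new_item(attrs=item_attributes)` and `item_attributes` is a hypothetical dict of attribute names and values. Note that the sketch does not retry anything: DynamoDB can report some items in a batch as unprocessed, so a real script should check the returned responses.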
That was about it. I definitely recommend using BatchWriteItem whenever you can, for the speedup it provides.