Showing posts from May, 2017

Importing data into DynamoDB from S3 (using AWS Data Pipeline)

You first need an S3 location, let's say a directory 'X'.
The directory 'X' from which the import will happen should contain the files below:
a. manifest
b. your-file-here.txt (the one containing the actual data)
your-file-here.txt contains the data in JSON format, one object per line.
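As a rough illustration of the "one JSON object per line" layout, here is a minimal Python sketch that serializes a list of records into that shape. The function name to_json_lines and the sample records are hypothetical, and the attribute layout the import actually expects is not shown in this post, so treat the record contents as placeholders.

```python
import json


def to_json_lines(records):
    """Serialize each record as one JSON object per line.

    json.dumps escapes backslashes and quotes, never emits a raw newline,
    and does not add a trailing comma after the object.
    """
    return "".join(json.dumps(record) + "\n" for record in records)


if __name__ == "__main__":
    # Hypothetical sample data; attribute names are made up for illustration.
    sample = [
        {"id": "1", "path": "C:\\temp\\a.txt"},
        {"id": "2", "path": "C:\\temp\\b.txt"},
    ]
    with open("your-file-here.txt", "w", encoding="utf-8") as handle:
        handle.write(to_json_lines(sample))
```

Generating the file with a JSON library, rather than by string concatenation, is one way to avoid hand-escaping backslashes.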
Go to DynamoDB and select your table by clicking on it. Under 'Actions', hit 'import data'. Create a pipeline and activate it, but before activating, consider the lessons below about your data.
Lessons learned when importing data to DynamoDB (from an S3 file, using Data Pipeline):
1. Replace \ with \\
2. No field value should be empty.
3. Each line should independently be a valid JSON object. No line should end in a comma.
4. The file should be verified as valid JSON using the bash command: cat <file-name> | python -m json.tool
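The per-line rules in the list above can also be checked with a short script before activating the pipeline. This is a minimal sketch, not part of Data Pipeline itself: the names check_line, check_file and _find_empty are hypothetical, and "empty" is assumed here to mean null, an empty string, an empty list or an empty object. An unescaped backslash often shows up as a parse error, but not always (for example, \t is a legal JSON escape), so this does not replace rule 1.

```python
import json


def _find_empty(value, path, problems):
    """Recursively record the location of every empty value."""
    if value is None or value == "" or value == [] or value == {}:
        problems.append("empty value at " + (path or "/"))
    elif isinstance(value, dict):
        for key, child in value.items():
            _find_empty(child, path + "/" + key, problems)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            _find_empty(child, path + "/" + str(index), problems)


def check_line(line):
    """Return a list of problems found in one line of the data file."""
    text = line.strip()
    if text.endswith(","):
        return ["line ends in a comma"]
    try:
        obj = json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        return ["not valid JSON: %s" % exc]
    if not isinstance(obj, dict):
        return ["line is not a JSON object"]
    problems = []
    _find_empty(obj, "", problems)
    return problems


def check_file(path):
    """Return (line_number, problem) pairs for every bad line in the file."""
    found = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            for problem in check_line(line):
                found.append((number, problem))
    return found
```

Calling check_file("your-file-here.txt") returns an empty list when every line passes these checks.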
Note that for the json.tool check the file may need to be converted to a full JSON object first, by appending a comma at the end of each line, and appending {"object": [ at the beginning of the…