DynamoDb batchGetItem and Partition Key and Sort Key

I tried to use batchGetItem to return the attributes of more than one item from a table, but it seems to work only when each key combines the partition key and the range key. What if I want to identify the requested items by the partition (hash) key alone? Is the only way to create the table without a range key?

    // Adding items
    $client->putItem(array(
        'TableName' => $table,
        'Item' => array(
            'id'     => array('S' => '2a49ab04b1534574e578a08b8f9d7441'),
            'name'   => array('S' => 'test1'),
            'user_name'   => array('S' => 'aaa.bbb')
        )
    ));

    // Adding items
    $client->putItem(array(
        'TableName' => $table,
        'Item' => array(
            'id'     => array('S' => '4fd70b72cc21fab4f745a6073326234d'),
            'name'   => array('S' => 'test2'),
            'user_name'   => array('S' => 'aaaa.bbbb'),
            'user_name1'   => array('S' => 'aaaaa.bbbbb')
        )
    ));

    $client->batchGetItem(array(
        "RequestItems" => array(
            $table => array(
                "Keys" => array(
                    array(
                        // hash key
                        "id"   => array('S' => "2a49ab04b1534574e578a08b8f9d7441"),
                        // range key
                        "name" => array('S' => "test1"),
                    ),
                    array(
                        // hash key
                        "id"   => array('S' => "4fd70b72cc21fab4f745a6073326234d"),
                        // range key
                        "name" => array('S' => "test2"),
                    ),
                )
            )
        )
    ));

As per the official documentation:

http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.Partitions.html

If the table has a composite primary key (partition key and sort key), DynamoDB calculates the hash value of the partition key in the same way as described in Data Distribution: Partition Key—but it stores all of the items with the same partition key value physically close together, ordered by sort key value.

What are the advantages of using a partition key and a sort key, besides the fact that all items with the same partition key value are stored physically close together?

As per the official documentation:

A single operation can retrieve up to 16 MB of data, which can contain as many as 100 items. BatchGetItem will return a partial result if the response size limit is exceeded, the table’s provisioned throughput is exceeded, or an internal processing failure occurs.

How should I handle the request if I need more than 100 items? Do I just loop through all the keys in my code and request 100 items at a time, or is there another way to achieve this with the AWS SDK for DynamoDB?

Example of table creation:

    $client->createTable(array(
        'TableName' => $table,
        'AttributeDefinitions' => array(
            array(
                'AttributeName' => 'id',
                // String type, to match the 'S' id values used in putItem above
                'AttributeType' => 'S'
            ),
            array(
                'AttributeName' => 'name',
                'AttributeType' => 'S'
            )
        ),
        'KeySchema' => array(
            array(
                'AttributeName' => 'id',
                'KeyType'       => 'HASH'
            ),
            array(
                'AttributeName' => 'name',
                'KeyType'       => 'RANGE'
            )
        ),
        'ProvisionedThroughput' => array(
            'ReadCapacityUnits'  => 5,
            'WriteCapacityUnits' => 5
        )
    ));

Thanks

UPDATE (question about Mark B's answer):

Yes you can create an index without a range key. The range key is entirely optional. However, even if you have a range key defined it is optional to include it in your query. You can simply specify the hash key in your query to get all items with the hash key, which will be returned in an order based on the range key.

If I specify only the hash key in a batchGetItem request on a table with both a hash key and a range key, I get the error below. If I do the same on a table without a range key, it works. Note that the table has no indexes.

An uncaught Exception was encountered

Type:        Aws\DynamoDb\Exception\DynamoDbException
Message:     Error executing "BatchGetItem" on "https://dynamodb.eu-central-1.amazonaws.com"; AWS HTTP error: Client error: `POST https://dynamodb.eu-central-1.amazonaws.com` resulted in a `400 Bad Request` response:
{"__type":"com.amazon.coral.validate#ValidationException","message":"The provided key element does not match the schema" (truncated...)
 ValidationException (client): The provided key element does not match the schema - {"__type":"com.amazon.coral.validate#ValidationException","message":"The provided key element does not match the schema"}
Filename:    /var/app/vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php

Answer

What if I want to identify the requested items by the partition (hash) key alone? Is the only way to create the table without a range key?

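Yes, you can create a table without a range key; the range key is entirely optional. Even if you have a range key defined, it is optional to include it in a Query: you can specify just the hash key to get all items with that hash key, returned in an order based on the range key. GetItem and BatchGetItem are different. They look up individual items, so every key you pass must contain the full primary key (hash key and range key). Leaving out the range key there is what produces the "The provided key element does not match the schema" error.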
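Reading by hash key alone is done with the Query operation rather than BatchGetItem. The following is a minimal sketch of the parameters for such a query, assuming the `id` hash key and string values from the question. The helper name `buildQueryByHashKey` and the table name are hypothetical; the resulting array is meant to be passed to the SDK's `query` method.

```php
<?php
// Hypothetical helper: builds the parameter array for a Query that matches
// every item sharing one partition (hash) key value.
function buildQueryByHashKey(string $table, string $hashKeyName, string $hashKeyValue): array
{
    return [
        'TableName'                 => $table,
        'KeyConditionExpression'    => '#pk = :pk',
        // A name placeholder avoids clashes with DynamoDB reserved words.
        'ExpressionAttributeNames'  => ['#pk' => $hashKeyName],
        // Assumes the hash key is a string ('S') attribute.
        'ExpressionAttributeValues' => [':pk' => ['S' => $hashKeyValue]],
    ];
}

$params = buildQueryByHashKey('my_table', 'id', '2a49ab04b1534574e578a08b8f9d7441');
// With a configured client: $result = $client->query($params);
// The matching items are then available in $result['Items'].
```

A Query reads one hash key value per call, so fetching items for several hash keys this way means one query per value.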

What are the advantages of using a partition key and a sort key, besides the fact that all items with the same partition key value are stored physically close together?

The two fields combined are your primary key, which guarantees uniqueness. The range/sort key also determines the order that results are returned in.
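Because items are ordered by the sort key within a partition, a Query can also narrow and order results by it. This is a rough sketch under the question's schema (`id` as hash key, `name` as range key); the helper name `buildQueryByHashAndPrefix` is hypothetical, while `begins_with` and `ScanIndexForward` are standard Query features.

```php
<?php
// Hypothetical helper: all items for one id whose name starts with a prefix,
// returned in descending sort-key order.
function buildQueryByHashAndPrefix(string $table, string $id, string $namePrefix): array
{
    return [
        'TableName'                 => $table,
        'KeyConditionExpression'    => '#id = :id AND begins_with(#name, :prefix)',
        // "name" is a DynamoDB reserved word, hence the placeholders.
        'ExpressionAttributeNames'  => ['#id' => 'id', '#name' => 'name'],
        'ExpressionAttributeValues' => [
            ':id'     => ['S' => $id],
            ':prefix' => ['S' => $namePrefix],
        ],
        // false = descending by range key; the default (true) is ascending.
        'ScanIndexForward'          => false,
    ];
}

$params = buildQueryByHashAndPrefix('my_table', '2a49ab04b1534574e578a08b8f9d7441', 'test');
// With a configured client: $result = $client->query($params);
```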

How should I handle the request if I need more than 100 items?

From the documentation (an older revision; it cites a 1 MB response limit where the current text quoted in the question says 16 MB):

The maximum number of item attributes that can be retrieved for a single operation is 100. Also, the number of items retrieved is constrained by a 1 MB size limit. If the response size limit is exceeded or a partial result is returned due to an internal processing failure, Amazon DynamoDB returns an UnprocessedKeys value so you can retry the operation starting with the next item to get.

For example, even if you ask to retrieve 100 items, but each individual item is 50k in size, the system returns 20 items and an appropriate UnprocessedKeys value so you can get the next page of results. If necessary, your application needs its own logic to assemble the pages of results into one set.

So you would need to check the UnprocessedKeys value of the result and continue making requests in your application until there are no more UnprocessedKeys.
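One way to put this together is sketched below. A single BatchGetItem request accepts at most 100 keys, so the key list is first split into chunks of 100, and each chunk is retried until UnprocessedKeys comes back empty. The function name `batchGetAll`, the `$batchGet` callable, the attempt cap, and the backoff timings are all assumptions for illustration; with the SDK, the callable could be something like `fn(array $p) => $client->batchGetItem($p)`.

```php
<?php
/**
 * Hypothetical helper. $batchGet is any callable that takes a BatchGetItem
 * parameter array and returns an array-like result containing 'Responses'
 * and 'UnprocessedKeys'.
 */
function batchGetAll(callable $batchGet, string $table, array $keys, int $maxAttempts = 10): array
{
    $items = [];
    // A single request accepts at most 100 keys, so split the list first.
    foreach (array_chunk($keys, 100) as $chunk) {
        $request = [$table => ['Keys' => $chunk]];
        $attempt = 0;
        while (!empty($request)) {
            if (++$attempt > $maxAttempts) {
                throw new RuntimeException('Unprocessed keys remain after retries');
            }
            $result = $batchGet(['RequestItems' => $request]);
            foreach ($result['Responses'][$table] ?? [] as $item) {
                $items[] = $item;
            }
            // UnprocessedKeys has the same shape as RequestItems,
            // so it can be sent again as the next request.
            $request = $result['UnprocessedKeys'] ?? [];
            if (!empty($request)) {
                // Simple exponential backoff, capped at one second.
                usleep(min(1000000, 50000 * (2 ** $attempt)));
            }
        }
    }
    return $items;
}
```

Items come back in no guaranteed order, so the caller needs to match results to keys by their key attributes rather than by position.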

User contributions licensed under: CC BY-SA