How to implement Slack’s cursor based pagination spec with DynamoDB

Here’s how I implemented Slack’s cursor based pagination spec with DynamoDB, and added support for traveling backwards.

Rcls
5 min read · Feb 10, 2021

Recently I got to work with DynamoDB once again. The service I had to build was rather simple: fetching data from a third party API and using DynamoDB as a cache. If we get a cache hit for the same type of request, we could simply return that and not have to request it from the third party.

For this service we had two identifiable access patterns:

  • A user can request data for a single entity. For this we used a specific request identifier as the partition key and the request timestamp as the sort key. We also stored the UserID as an attribute.
  • A user can request a collection of all the requests they previously made (back to a certain point in time, let’s say a maximum of 6 months), with the data attached. For this I had to enable pagination.

To enable the second access pattern I created a Global Secondary Index (GSI) where the UserID attribute acts as the partition key, while timestamp still remained as the sort key. This way I could query all the data for a user while using the timestamp for sorting and pagination.

I had previous experience implementing cursor based pagination with GraphQL, but some time had passed since then. I had to jog my memory and read the specification Relay has published for it, again.

As I browsed through the specs again I realized how complex it was. Relay’s specification uses terms such as Edge, Node and PageInfo. They also separate the browsing direction with a different set of parameters (before and after, first and last).

I started looking for other solutions and came across Slack’s evolution of cursor based pagination. The more detailed specification can be found in their API docs and is relatively easy to follow. In short, Slack removed the ability to browse backwards along with the concepts of edges, nodes and page info, and exposed only two request parameters, cursor and limit. The response carries only the next cursor, inside its metadata object.

The specification makes implementing cursor based pagination simpler. You know which direction the client wants to travel in (or, in Slack’s case, you force it), and you return the cursor for the next page. You also don’t have to structure your response around edges and nodes or attach a cursor to each object.

I did need backwards browsing though, and I implemented it in a very simple manner: telling the API which direction I was traveling in the collection.

Specification

I find it much easier to start with the specifications instead of implementation, when one is available. Often when people talk about cursor based pagination, they link to Relay’s documentation, but in this instance I will refer you to Slack’s API docs with minor modifications and additions:

  • Cursor-paginated methods accept cursor, limit and dir parameters.
  • If you don’t pass a dir parameter, the default travel direction is forward.
  • The dir parameter only accepts the values forward and backwards. With forward, results are returned in descending order. With backwards, results are returned in ascending order.

You can modify this to suit your needs. For instance, if the strings forward and backwards feel like too much, you can use desc: bool or just the letters F and B. I recommend picking whatever you or your team is most comfortable with.
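The parameter rules above could be enforced with a small normalizing function before the query is built. This is only a sketch: parsePaginationParams, DEFAULT_LIMIT and MAX_LIMIT are hypothetical names and values chosen for illustration, not something Slack’s spec or my service prescribes.

```javascript
// Hypothetical helper: normalizes the cursor, limit and dir query parameters.
// DEFAULT_LIMIT and MAX_LIMIT are assumed values, pick whatever suits your API.
const DEFAULT_LIMIT = 100;
const MAX_LIMIT = 1000;
const DIRECTIONS = ['forward', 'backwards'];

function parsePaginationParams(query = {}) {
  // A missing dir means the default travel direction, forward.
  const dir = query.dir === undefined ? 'forward' : query.dir;
  if (!DIRECTIONS.includes(dir)) {
    throw new Error(`dir must be one of: ${DIRECTIONS.join(', ')}`);
  }

  // Query string values arrive as strings, so convert before validating.
  const limit = query.limit === undefined ? DEFAULT_LIMIT : Number(query.limit);
  if (!Number.isInteger(limit) || limit < 1 || limit > MAX_LIMIT) {
    throw new Error(`limit must be an integer between 1 and ${MAX_LIMIT}`);
  }

  // An absent or empty cursor means "start from the first page".
  const cursor = query.cursor ? String(query.cursor) : undefined;

  return { cursor, limit, dir };
}
```

Calling parsePaginationParams({ limit: '25', dir: 'backwards' }) would give you a numeric limit and a validated direction to hand to the query function.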

The response looks as follows. The only part worth noting is the response_metadata object and its next_cursor property: next_cursor is the base64 encoded timestamp of the last item in the collection (here it decodes to 2020-01-02T14:48:00.000Z).

{
  "results": [
    {
      "UUID": "3b5d50c3-8df7-43f5-b2cf-73a265e435f9",
      "UserId": "1238932",
      "Timestamp": "2020-01-01T14:48:00.000Z",
      "Data": {}
    },
    {
      "UUID": "d86a3f0c-9aeb-4ee9-9bd7-c90d86b348c8",
      "UserId": "1238932",
      "Timestamp": "2020-01-02T14:48:00.000Z",
      "Data": {}
    }
  ],
  "response_metadata": {
    "next_cursor": "MjAyMC0wMS0wMlQxNDo0ODowMC4wMDBa"
  }
}
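The next_cursor value above can be reproduced with a pair of small helpers. Treat this as a sketch: encodeCursor and decodeCursor are hypothetical names, and rejecting cursors that do not decode to a parseable date is an extra assumption on my part, not something the spec requires.

```javascript
// Hypothetical helpers for turning a sort key timestamp into an opaque cursor
// and back again.
function encodeCursor(timestamp) {
  return Buffer.from(timestamp, 'utf8').toString('base64');
}

function decodeCursor(cursor) {
  const timestamp = Buffer.from(cursor, 'base64').toString('utf8');
  // Assumed safeguard: the cursor comes from the client, so refuse anything
  // that does not decode to a parseable date.
  if (Number.isNaN(Date.parse(timestamp))) {
    throw new Error('Invalid cursor');
  }
  return timestamp;
}

console.log(encodeCursor('2020-01-02T14:48:00.000Z'));
// prints "MjAyMC0wMS0wMlQxNDo0ODowMC4wMDBa"
```

The encoding adds no security. It only keeps clients from treating the cursor as anything other than an opaque token.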

Using the words forward and backwards has a somewhat funny meaning in this scenario once you realize that the sort key is a timestamp. When you’re moving forward, you’re actually moving backwards in time, from the newest item to the oldest. When moving backwards, it’s the opposite. I simply associated the words with browsing forward through your collection, where the default sort order is descending, from newest to oldest, as it most often is. You can change this if you like or just do what I do. Just don’t overthink it.

Implementation

Let it be noted that in the example below I use the DocumentClient from the Node.js SDK of DynamoDB inside a Lambda function, and more specifically the query() method. I will not go into detail on how to model your DynamoDB table. That’s up to you.

Implementing the pagination itself is rather simple and follows these rules.

  • When no cursor is provided, the sort key is omitted from the KeyConditionExpression.
  • If dir is omitted, the sort key condition in KeyConditionExpression is #timestamp < :cursor, since by default we travel forward, from the newest result to the oldest.
  • When a dir value is provided, the sort key condition is either #timestamp < :cursor (forward) or #timestamp > :cursor (backwards), depending on the travel direction.
  • The ScanIndexForward option is set to false when dir is forward, to return the collection in descending order (newest to oldest). The value is true for backwards browsing, and the collection is returned in ascending order (oldest to newest).
  • Set Limit to limit + 1. If the Count returned by `query()` is greater than the requested limit, there is another page and you need to set a next_cursor value in the response.

A small piece of code as an example:

const isEmpty = require('lodash.isempty');

// Shown on its own here; it relies on `this.table` and `this.ddb`
// (a DocumentClient instance) being available on the surrounding class.
async function queryCollection(args) {
  const { userId, cursor, dir, limit } = args;

  // Without a cursor, only the partition key of the GSI is used
  let expression = 'UserID = :userId';
  let expressionAttributeNames = {};

  let expressionAttributeValues = {
    ':userId': userId
  };

  if (cursor) {
    // Set cursor: decode the base64 value back into a timestamp
    expressionAttributeValues[':cursor'] = Buffer.from(cursor, 'base64').toString('ascii');
    // Adjust expression to contain sorting based on dir:
    // backwards reads newer items (>), forward reads older items (<)
    expression = `UserID = :userId AND #timestamp ${dir === 'backwards' ? '>' : '<'} :cursor`;
    // Set expression attribute name since we use sort key
    expressionAttributeNames['#timestamp'] = 'timestamp';
  }

  const queryArgs = {
    TableName: this.table,
    IndexName: 'userIdIndex',
    KeyConditionExpression: expression,
    ExpressionAttributeValues: expressionAttributeValues,
    // Fetch one extra item to find out whether there is a next page
    Limit: (limit + 1),
    // false = descending (forward), true = ascending (backwards)
    ScanIndexForward: (dir === 'backwards')
  };

  if (!isEmpty(expressionAttributeNames)) {
    queryArgs.ExpressionAttributeNames = expressionAttributeNames;
  }

  return this.ddb.query(queryArgs).promise();
}

In this example the function is inside a class, which is why you might notice me using instance properties like this.table inside of it. The returned value is a Promise. I’m not using callbacks here.
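The query above asks for limit + 1 items, so something still has to trim the result and decide whether to return a next_cursor. One way to do that is sketched below. buildResponse is a hypothetical helper, it assumes each item carries its sort key in a timestamp attribute, and returning an empty string when there is no next page is my assumption about the convention you want.

```javascript
// Hypothetical helper: turns the Items returned by query() into the response
// shape from the specification. Assumes the query ran with Limit = limit + 1.
function buildResponse(items, limit) {
  const hasMore = items.length > limit;
  // Drop the extra look-ahead item so the client gets at most `limit` results.
  const results = hasMore ? items.slice(0, limit) : items;

  const response = { results, response_metadata: { next_cursor: '' } };
  if (hasMore) {
    // The cursor is the base64 encoded sort key of the last returned item.
    const last = results[results.length - 1];
    response.response_metadata.next_cursor =
      Buffer.from(last.timestamp, 'utf8').toString('base64');
  }
  return response;
}
```

With the result of queryCollection in hand, this would be called as buildResponse(result.Items, limit).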

Wrapping up

That’s pretty much it! Very simple to understand and execute now that concepts like edges, nodes and pageInfo are out the door. Now you can browse through the pages of your collection by simply providing a limit, an optional cursor and a direction of travel.

GET /collection?limit=100&cursor=MjAyMC0wMS0wMlQxNDo0ODowMC4wMDBa&dir=forward
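On the client side, walking the whole collection comes down to feeding each next_cursor back into the next request until none is returned. A minimal sketch, assuming a hypothetical fetchPage function that performs the GET request above for one page and resolves to the parsed JSON body:

```javascript
// Hypothetical client helper. `fetchPage` is an assumed function that takes
// { limit, dir, cursor } and resolves to one parsed response page.
async function fetchAllPages(fetchPage, { limit = 100, dir = 'forward' } = {}) {
  const all = [];
  let cursor; // undefined on the first request, so no cursor is sent

  do {
    const page = await fetchPage({ limit, dir, cursor });
    all.push(...page.results);

    // An empty or missing next_cursor means there are no more pages.
    const meta = page.response_metadata || {};
    cursor = meta.next_cursor || undefined;
  } while (cursor);

  return all;
}
```

Injecting fetchPage keeps the loop independent of the HTTP client you use and makes it easy to test with canned pages.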
