Pagination in CouchDB Apps
I’ve been working on some fun little node.js / couchdb projects of late. Given the fact I don’t use either as part of my work, I’ve spent some downtime experimenting and slowly iterating my approaches as I learn best practice.
I hit a fairly frustrating CouchDB hurdle that I’d been blissfully unaware of through all my CouchDB development. When it came to pagination, it turns out I’ve always been doing it the “bad” way. Oh, well that’s upsetting.
The Wrong Way
My “slow” approach has always been to take the page number as an argument in the url and generate “skip” and “limit” variables to pass as parameters to my store. So for example, if I wanted the 2nd page of my app, showing 10 items per page:
// page 1 skips nothing; page 2 skips the first 10 rows, and so on
var skip = (pageno == 1) ? 0 : ((pageno - 1) * 10);
curl -X GET 'http://127.0.0.1:5984/stuff/_design/stuff/_view/by-name?skip=10&limit=10'
It turns out that although you might think you’re starting at a particular result, CouchDB still starts at the first result. Because of the way the view is read from the b-tree index, CouchDB walks past every row you skip and simply suppresses it in the output. This isn’t good news when you’re trying to skip, say, 10000 results.
The Suggested Way
The suggested solution is to drop the “skip” parameter and instead keep track of the startkey at which the next page begins. This is possible by requesting one more item than fits on a page and using the key of that extra result as the startkey of the next request. So now, for a first page, my query is:
curl -X GET http://127.0.0.1:5984/stuff/_design/stuff/_view/by-name?limit=11
Returning something like:
{"total_rows":17,"offset":0,"rows":[
{"id":"8177bf155b952652129836a5d354b30e","key":"Ian Wootten","value":null},
{"id":"bae2c490c70480aec7096d79e1e3bfc3","key":"Isambard Kingdom Brunel","value":null},
{"id":"eaae74cfbe5cd13ea6b50dfd090827ca","key":"Christopher Columbus","value":null},
{"id":"491e68b08d73256f060ebf4b8e063e1c","key":"Elizabeth Fry","value":null},
{"id":"b45d8a7b9edee9ca66ac0860196f4504","key":"Edward Jenner","value":null},
{"id":"8a4d3f46885701ffcc7532aeac7a5ae9","key":"Florence Nightingale","value":null},
{"id":"71e6534c17429eca2cd9450cfc95c6bb","key":"Samuel Pepys","value":null},
{"id":"6cbad847f0ae959b281b471a72d60587","key":"Pocahontas","value":null},
{"id":"e5b026ec5c92c20f1575a2901defe14e","key":"Mary Seacole","value":null},
{"id":"84a371a7b8414237fad1b6aaf68cd16a","key":"George Stephenson","value":null},
{"id":"321aeb36e20d62660eb0d03c9fcd27b2","key":"Joe Bloggs","value":null}
]}
From the 11th returned result, I have the key “Joe Bloggs”, which can be used as a startkey arg to couch to obtain my second page. If we have duplicate keys, it is also necessary to keep tabs on that last document’s id and supply it as a startkey_docid arg in order to correctly page through everything.
What I personally dislike about the suggested approach is the inability to create simple requests to arbitrary pages, even with low numbers. We always need to follow a path of links from the first page in order to view particular results. CouchDB’s response is “Not even Google is doing that!”, which is kind of weak to me. I want nice clean urls à la myapp.com/page/2 or myapp.com?page=2.
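As a rough sketch of the bookkeeping involved, the snippet below splits an 11-row response into the 10 rows to show and a cursor for the following page, then builds the next query string. The helper names (buildPageQuery, splitPage) and the cursor object are made up for illustration; only limit, startkey and startkey_docid are actual CouchDB view parameters. It assumes the response has the same shape as the output above.

```javascript
// Hypothetical helpers, not part of any CouchDB client library.
const PAGE_SIZE = 10;

// Build the view query string for one page. `cursor` is null for the first
// page, otherwise { startkey, startkey_docid } from the previous response.
function buildPageQuery(cursor) {
  const parts = ["limit=" + (PAGE_SIZE + 1)]; // ask for one extra row
  if (cursor) {
    // View keys are JSON values, so the startkey is JSON-encoded (quotes included).
    parts.push("startkey=" + encodeURIComponent(JSON.stringify(cursor.startkey)));
    parts.push("startkey_docid=" + encodeURIComponent(cursor.startkey_docid));
  }
  return parts.join("&");
}

// Split a view response into the rows to display and the next page's cursor.
function splitPage(response) {
  const rows = response.rows.slice(0, PAGE_SIZE);
  const extra = response.rows[PAGE_SIZE]; // undefined when this is the last page
  const next = extra ? { startkey: extra.key, startkey_docid: extra.id } : null;
  return { rows, next };
}
```

The extra row is never displayed on the current page; it only marks where the next page starts, so it comes back as the first row of the next response.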
In fact, the suggested approach only really allows us to have a single “more” type link in order to fetch results. Passing a startkey as part of a url param, e.g. /page/321aeb36e20d62660eb0d03c9fcd27b2, just sounds (and looks) plain nasty and isn’t very good from a UX point of view for any users we may have.
At the moment, clean tangible page urls (the right way) are only possible using custom middleware. I’ve yet to find anything suitable for node.js. I intend to investigate how to cache document keys for low numbered pages as a separate db in order to produce a solution for my current project and I hope to write a later post detailing how I’ve got on.
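To make the caching idea a little more concrete, here is one possible shape for it, offered purely as a sketch and not as the eventual solution: walk the first few pages once with limit = page size + 1 and remember where each page starts, so that /page/2 can be translated into a startkey lookup. Everything here is an assumption for illustration: the in-memory Map stands in for the separate db, fakeView stands in for a real view request, and buildPageIndex is a made-up name.

```javascript
const PAGE_SIZE = 10;

// Stand-in for a sorted CouchDB view (hypothetical): returns up to `limit`
// rows starting at the row matching the cursor, or from the top if cursor is null.
function fakeView(allRows, cursor, limit) {
  let start = 0;
  if (cursor) {
    start = allRows.findIndex(
      (r) => r.key === cursor.startkey && r.id === cursor.startkey_docid
    );
    if (start < 0) return { rows: [] }; // cursor no longer exists
  }
  return { rows: allRows.slice(start, start + limit) };
}

// Walk at most `maxPages` pages once, recording the cursor each page starts at.
// Page 1 maps to null (no startkey needed).
function buildPageIndex(fetchRows, maxPages) {
  const index = new Map();
  let cursor = null;
  for (let page = 1; page <= maxPages; page++) {
    index.set(page, cursor);
    const rows = fetchRows(cursor, PAGE_SIZE + 1).rows;
    if (rows.length <= PAGE_SIZE) break; // no extra row, so no further page
    const extra = rows[PAGE_SIZE];
    cursor = { startkey: extra.key, startkey_docid: extra.id };
  }
  return index;
}
```

An index like this goes stale whenever documents are added or removed, so it would need rebuilding or invalidating; how best to do that is exactly the open question.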