incubator-cassandra-user mailing list archives

From Mike Peters <cassan...@softwareprojects.com>
Subject Re: skip + limit support in GetSlice
Date Sun, 05 Sep 2010 19:57:18 GMT
Hi Michal,

Did you read the PDF Stu sent over, start to finish?  There are several 
different approaches described there.

With Cassandra, what we found works best for pagination:
* Keep a separate 'total_records' count and increment/decrement it on 
every insert/delete
* When getting slices, pass 'last seen' as the 'from' and keep the 'to' 
empty.  Pass the number of records you want to show per page in the 'count'.
* Avoid letting the user skip to page X; offer Next/Prev/First/Last only 
(same way GMail does it)
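
A minimal sketch of the 'last seen' approach above, in Python. It uses an in-memory sorted list as a stand-in for one row's columns. `get_slice` and `next_page` here are hypothetical helpers for illustration, not the actual Thrift call. The sketch assumes the slice start is inclusive, so every page after the first asks for one extra column and drops the repeated one.

```python
import bisect

def get_slice(columns, start="", count=100):
    """Hypothetical stand-in for a slice query: return up to `count`
    column names >= `start` from a sorted list ("" means from the beginning)."""
    i = bisect.bisect_left(columns, start) if start else 0
    return columns[i:i + count]

def next_page(columns, last_seen, page_size):
    """Fetch the page that follows `last_seen` (None for the first page)."""
    if last_seen is None:
        return get_slice(columns, "", page_size)
    # The start is inclusive, so fetch one extra column and drop `last_seen`.
    page = get_slice(columns, last_seen, page_size + 1)
    if page and page[0] == last_seen:
        page = page[1:]
    return page[:page_size]

# Walk the whole row four columns at a time.
columns = sorted("col%03d" % n for n in range(10))
pages, last_seen = [], None
while True:
    page = next_page(columns, last_seen, 4)
    if not page:
        break
    pages.append(page)
    last_seen = page[-1]
```

Each request fetches at most one column more than the page size, no matter how deep into the row the reader already is.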


Michal Augustýn wrote:
> I know that "Prev/Next" is good solution for web applications. But 
> when I want to access data from another application or when I want to 
> access pages randomly...
>
> I don't know the internal structure of memtables etc., so I don't know 
> if columns in a row are indexable. If not, then I just want to move 
> my workaround to the server (to avoid huge network traffic)...
>
> 2010/9/5 Stu Hood <stu.hood@rackspace.com <mailto:stu.hood@rackspace.com>>
>
>     Cassandra supports the recommended approach from:
>     http://www.percona.com/ppc2009/PPC2009_mysql_pagination.pdf
>
>     For large numbers of items, skip + limit is extremely inefficient.
>
>     -----Original Message-----
>     From: "Michal Augustýn" <augustyn.michal@gmail.com
>     <mailto:augustyn.michal@gmail.com>>
>     Sent: Sunday, September 5, 2010 5:39am
>     To: user@cassandra.apache.org <mailto:user@cassandra.apache.org>
>     Subject: skip + limit support in GetSlice
>
>     Hello,
>
>     This is probably a feature request. I would like to have support for
>     standard pagination (skip + limit) in the GetSlice Thrift method. Is
>     this feature on the road map?
>
>     Currently I have to perform a GetSlice call that starts at "" with
>     "limit" set to the "skip" value. Then I read the last column name
>     returned and perform the final GetSlice call, using that column name
>     as "start" and setting "limit" to the "limit" value.
>
>     This workaround is not very efficient when I need to skip a lot of
>     columns (so "skip" is high), because a lot of data must be transferred
>     over the network. So I think that support for skip in GetSlice would
>     be very useful (to avoid high network traffic).
>
>     The implementation could be very straightforward (same as the
>     workaround), or maybe it could be more efficient: I think that the
>     whole row (so all columns) must fit into memory, so if we have all
>     columns in memory...
>
>     Thank you!
>
>     Augi
>
>
>
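
For illustration, the two-call workaround described in the quoted message can be sketched in Python against an in-memory sorted list. `get_slice` and `skip_limit` are hypothetical helpers, not the Thrift API, and the sketch assumes an inclusive slice start (so the second call asks for one extra column). It also counts how many columns cross the wire, which is the cost the original poster is worried about.

```python
import bisect

def get_slice(columns, start="", count=100):
    """Hypothetical stand-in for a slice query over a sorted list of
    column names; `start` is inclusive and "" means from the beginning."""
    i = bisect.bisect_left(columns, start) if start else 0
    return columns[i:i + count]

def skip_limit(columns, skip, limit):
    """Two-call workaround: return (page, number of columns fetched)."""
    if skip == 0:
        page = get_slice(columns, "", limit)
        return page, len(page)
    # Call 1: fetch the `skip` columns only to learn the last column name.
    skipped = get_slice(columns, "", skip)
    if len(skipped) < skip:
        return [], len(skipped)  # the row has fewer columns than `skip`
    # Call 2: start at that name; it is returned again, so drop it.
    fetched = get_slice(columns, skipped[-1], limit + 1)
    return fetched[1:], len(skipped) + len(fetched)

columns = sorted("col%05d" % n for n in range(10000))
page, transferred = skip_limit(columns, 9000, 10)
```

In this sketch, showing 10 columns at offset 9000 means fetching 9011 columns in total, so the cost grows linearly with the skip value.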

