incubator-cassandra-user mailing list archives

From "Dop Sun" <>
Subject RE: keyrange for get_range_slices
Date Thu, 10 Jun 2010 20:56:41 GMT
Thanks for your quick and detailed explanation of the key scan. This is really




From: Philip Stanhope [] 
Sent: Thursday, June 10, 2010 10:40 PM
Subject: Re: keyrange for get_range_slices


No ... and I personally don't have a problem with this if you think about
what is actually going on under the covers.


Note, however, that this is an expensive operation, and if there are
parallel updates to the indexes while you are performing a full keyscan
(rowscan), you may miss keys that are inserted at a position in the index
earlier than the one you are currently processing.
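
To make the missed-key case concrete, here is a small toy simulation. It does
not talk to Cassandra at all: the "store" is just a sorted Python list standing
in for the key index, and the function name and parameters are made up for
illustration. It assumes keys are returned in sorted order and that the start
key of each page is inclusive.

```python
from bisect import bisect_left, insort


def scan_with_concurrent_insert(keys, page_size, insert_after_page, new_key):
    """Toy model: page through a sorted key list while one key is inserted mid-scan.

    Returns (keys seen by the scan, final contents of the store).
    """
    if page_size < 2:
        raise ValueError("page_size must be at least 2 to make progress")
    store = sorted(keys)
    seen = []
    start = ""  # empty string means "from the beginning"
    pages = 0
    while True:
        lo = bisect_left(store, start) if start else 0
        page = store[lo:lo + page_size]
        # The start key is inclusive, so drop it if it opens the page again.
        fresh = page[1:] if page and page[0] == start else page
        seen.extend(fresh)
        pages += 1
        if pages == insert_after_page:
            # A "parallel" writer adds a key while the scan is in progress.
            insort(store, new_key)
        if len(page) < page_size:
            break
        start = page[-1]
    return seen, store


seen, store = scan_with_concurrent_insert(
    ["b", "d", "f", "h", "j"], page_size=2, insert_after_page=2, new_key="c"
)
print("c" in store, "c" in seen)
```

Here "c" is written after the scan has already moved past "d", so it ends up in
the store but is never returned by the scan.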


A further concern is that the keys (and indexes) are spread around a
cluster. Unless R=N you will be hitting the network during this type of


Lastly, be careful about how you specify the SlicePredicate. A keyscan can
easily turn into a "dump the entire datastore" if you aren't careful.


On Jun 10, 2010, at 10:03 AM, Dop Sun wrote:



As documented, the key range for get_range_slices is inclusive at both
ends.


As discussed in this thread:
de067d3, there is a case where a user wants to discover all keys (a huge
number) in a column family.


What I have in mind is doing it in batches: use the empty string as both
start and finish for the first query, then use the last key returned as the
start of the second query.


My question is: with this method, the last key returned by the first query
will be returned again as the first key of the second query, which is a
duplicate. Is there any other API to discover keys without duplicates in the
current implementation?
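
For reference, the batching approach described above, with the duplicate
handled on the client side, might look like the sketch below. This is not the
Cassandra client API: the get_range_slices function here is a toy stand-in
over a sorted in-memory list, its signature is hypothetical, and it only
mimics the "both ends inclusive" behaviour. The point is the loop in
scan_all_keys, which drops the first key of every page after the first.

```python
from bisect import bisect_left


def get_range_slices(sorted_keys, start, finish, count):
    """Toy stand-in: up to `count` keys in [start, finish], both inclusive.

    An empty string for start or finish means "unbounded" on that side.
    """
    lo = bisect_left(sorted_keys, start) if start else 0
    out = []
    for key in sorted_keys[lo:]:
        if finish and key > finish:
            break
        out.append(key)
        if len(out) == count:
            break
    return out


def scan_all_keys(sorted_keys, page_size=3):
    """Yield every key once, paging with the last key of a page as the next start."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2 to make progress")
    start = ""
    while True:
        page = get_range_slices(sorted_keys, start, "", page_size)
        # start is inclusive, so a later page begins with a key already seen.
        fresh = page[1:] if page and page[0] == start else page
        for key in fresh:
            yield key
        if len(page) < page_size:
            return  # a short page means the end of the range was reached
        start = page[-1]


keys = ["a", "b", "c", "d", "e", "f", "g"]
print(list(scan_all_keys(keys, page_size=3)))
```

Because one key per page is a repeat, a page size of N yields at most N-1 new
keys per query after the first, which is why the sketch rejects a page size
of 1.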





