cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Effect of RangeQuery with RandomPartitioner
Date Sat, 07 Jul 2012 16:17:06 GMT
On Sat, Jul 7, 2012 at 11:17 AM, prasenjit mukherjee
<prasen.bea@gmail.com> wrote:
> Have 2 questions :
>
> 1. In RP on a given node, are the rows ordered by hash(key) or key ?
> If the rows on a node are ordered by hash(key) then essentially it has
> to be implemented by a full-scan on that node.
>
> 2. In RP, How does a cassandra node route a client's range-query
> request ? The range is distributed across the ring, so essentially
> either it has to send the request to all nodes in the ring or
> just do a local processing.
>
> On Sat, Jul 7, 2012 at 7:47 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>> On Sat, Jul 7, 2012 at 9:26 AM, prasenjit mukherjee
>> <prasen.bea@gmail.com> wrote:
>>> Wondering how a rangequery request is handled if RP is used.  Will the
>>> receiving node do a fan-out to all the nodes in the ring or it will
>>> just execute the rangequery on its own local partition ?
>>>
>>> -Prasenjit
>>
>> With RP the data is still ordered. It is ordered pseudo-randomly. As
>> with any range scan, you can start with the null start row key for
>> your first range scan. Then for the next range scan use the last row
>> key from the results of the first scan.
1)
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/dht/RandomPartitioner.java?view=markup

http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/dht/AbstractPartitioner.java?revision=1208993&view=markup

2) A single range slice is not handled by all nodes in the cluster.
The request is routed to one or more of the natural endpoints for the
range. The exception is a range slice that crosses a node's token
boundary, which involves the endpoints on both sides of that boundary.
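
To make the routing idea concrete, here is a rough sketch of a token
ring lookup. It is not Cassandra's code: make_ring, owner_index and
primary_owners_for_range are hypothetical names, the ring is assumed
to have evenly spaced tokens in a 0 to 2^127 token space, and replicas
are ignored (only the primary owner of each range is shown).

```python
import bisect

RING_SIZE = 2 ** 127  # assumed size of the RandomPartitioner token space


def make_ring(node_count):
    """Return evenly spaced tokens, one per hypothetical node."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def owner_index(ring, token):
    """Index of the node that owns `token`.

    A node with token T is assumed to own (previous_token, T], so the
    owner is the first node whose token is >= `token`, wrapping around
    to node 0 past the largest token.
    """
    return bisect.bisect_left(ring, token) % len(ring)


def primary_owners_for_range(ring, start, end):
    """Nodes whose primary range overlaps (start, end], for start < end."""
    lo = bisect.bisect_left(ring, start + 1)
    hi = bisect.bisect_left(ring, end)
    indexes = [i % len(ring) for i in range(lo, hi + 1)]
    return list(dict.fromkeys(indexes))  # drop duplicates, keep order


ring = make_ring(10)
# A slice that stays inside one node's range involves one primary owner.
print(primary_owners_for_range(ring, ring[1], ring[2]))      # [2]
# A slice that crosses a token boundary involves two.
print(primary_owners_for_range(ring, ring[1], ring[2] + 5))  # [2, 3]
```

In a real cluster each of those ranges also has replicas, so "one
owner" here corresponds to one set of natural endpoints.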

Random Partitioner is not actually random: the data is ordered by the
hash of the key. Thus data is in a predictable location and repeated
range scans return the same order. However, because md5 generates
drastically different hashes for similar keys, like data will not
clump together.
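
A small sketch of that ordering, assuming the token is the absolute
value of the md5 digest read as a signed big-endian integer (my
reading of the partitioner source linked above; rp_token is a
hypothetical helper name, not a Cassandra API):

```python
import hashlib


def rp_token(key: bytes) -> int:
    """Approximate a RandomPartitioner token for `key` (assumption:
    abs() of the md5 digest interpreted as a signed 128-bit integer)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


keys = [b"user1", b"user2", b"user3", b"user4"]

# Similar keys get unrelated tokens, so they do not sit next to each other.
for key in keys:
    print(key, rp_token(key))

# The scan order is the token order. It looks random but is the same
# every time, whatever order the keys were written in.
scan_order = sorted(keys, key=rp_token)
print(scan_order)
```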

To put it another way, if you have a 10 node cluster with RP and you
wish to range scan the entire dataset, 0 to 2^127 (or whatever that
big number is), you will notice that the range scans first make three
of the nodes busy, then a fourth node starts taking requests as the
first node starts getting fewer requests, finally the first node gets
no more requests, and so on.
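
Here is a toy simulation of such a full scan, paging by resuming
after the last token of the previous page. Everything in it is an
assumption for illustration: the 10 evenly spaced tokens, the page
size, the made-up keys, and the helper names rp_token, primary_node
and scan_pages. It tracks only the primary owner of each page and
ignores replicas, so it shows one node at a time rather than three.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127   # assumed token space
NODES = 10             # hypothetical cluster size
PAGE = 100             # hypothetical page size


def rp_token(key: bytes) -> int:
    """Assumed token: abs() of the md5 digest as a signed integer."""
    return abs(int.from_bytes(hashlib.md5(key).digest(), "big", signed=True))


ring = [i * RING_SIZE // NODES for i in range(NODES)]


def primary_node(token):
    """First node whose token is >= `token`, wrapping to node 0."""
    return bisect.bisect_left(ring, token) % NODES


# The whole dataset as (token, key) pairs in token order.
rows = sorted((rp_token(f"key{i}".encode()), f"key{i}") for i in range(5000))


def scan_pages(rows, page_size):
    """Yield pages in token order, each starting after the last token seen."""
    tokens = [t for t, _ in rows]
    last = -1  # stands in for the null start key
    while True:
        i = bisect.bisect_right(tokens, last)
        page = rows[i:i + page_size]
        if not page:
            return
        yield page
        last = page[-1][0]


# Record which primary node serves the scan as it moves around the ring.
node_sequence = []
for page in scan_pages(rows, PAGE):
    node = primary_node(page[0][0])
    if not node_sequence or node_sequence[-1] != node:
        node_sequence.append(node)

print(node_sequence)
```

The printed sequence walks around the ring one owner at a time, which
is the same "busy nodes move along" effect described above.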

Another option is that row keys can now be composite, and Cassandra
will use the first part of the composite to locate the node and the
second part of the composite to order the data. Sweet!
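
A minimal in-memory sketch of that idea, not Cassandra's storage
engine: CompositeStore and its methods are hypothetical names, and
the sensor keys are made-up sample data. The first part of the key
selects a partition (which is what would be hashed to find the node),
and the second part keeps entries sorted inside it, so a range query
on the second part is a cheap ordered slice.

```python
import bisect
from collections import defaultdict


class CompositeStore:
    """Toy store keyed by (first_part, second_part)."""

    def __init__(self):
        # first_part -> list of (second_part, value), kept sorted
        self._partitions = defaultdict(list)

    def insert(self, first_part, second_part, value):
        """Add a value, keeping the partition ordered by second_part."""
        bisect.insort(self._partitions[first_part], (second_part, value))

    def slice(self, first_part, start, end):
        """Entries of one partition with start <= second_part <= end."""
        entries = self._partitions.get(first_part, [])
        seconds = [s for s, _ in entries]
        lo = bisect.bisect_left(seconds, start)
        hi = bisect.bisect_right(seconds, end)
        return entries[lo:hi]


store = CompositeStore()
store.insert("sensor1", "2012-07-07T12:00", "c")
store.insert("sensor1", "2012-07-07T10:00", "a")
store.insert("sensor1", "2012-07-07T11:00", "b")
store.insert("sensor2", "2012-07-07T10:30", "x")

# Ordered range query within a single partition.
print(store.slice("sensor1", "2012-07-07T10:30", "2012-07-07T12:00"))
# [('2012-07-07T11:00', 'b'), ('2012-07-07T12:00', 'c')]
```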
