incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: loading all rows from cassandra using multiple (python) clients in parallel
Date Wed, 24 Apr 2013 02:50:55 GMT
> 
>  EDIT: works after switching to testing against the latest version of the Cassandra
database (doh!), and also updating the syntax per notes below:
http://stackoverflow.com/questions/16137944/loading-all-rows-from-cassandra-using-multiple-python-clients-in-parallel

Is this still a problem?

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/04/2013, at 12:15 AM, John R. Frank <jrf@mit.edu> wrote:

> Cassandra Experts,
> 
> I understand that when using Cassandra's recommended RandomPartitioner (or Murmur3Partitioner),
it is not possible to do meaningful range queries on keys, because the rows are distributed
around the cluster using a hash of the key (MD5 for RandomPartitioner, MurmurHash3 for
Murmur3Partitioner).  These hashes are called "tokens."
> 
> Nonetheless, it would be very useful to split up a large table amongst many compute workers
by assigning each a range of tokens.  Using CQL3, it appears possible to issue queries directly
against the tokens; however, the Python code in the following question does not work:
> 
> http://stackoverflow.com/questions/16137944/loading-all-rows-from-cassandra-using-multiple-python-clients-in-parallel
> 
> I would ideally like to make this work with pycassa, because I prefer its more pythonic
interface.
> 
> Am I just not invoking CQL3 correctly through the cql package?
> 
> Is there a better way to do this?
> 
> 
> Thanks for any pointers!
> 
> John
> 
> 
> 
> 
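
The token-range split described in the quoted message can be sketched roughly as follows. This is a minimal illustration, not the code from the Stack Overflow question: it assumes Murmur3Partitioner, whose tokens are signed 64-bit integers (RandomPartitioner would use 0 to 2**127 - 1 instead), and the names split_token_range, build_query, my_table and key are hypothetical. It only builds CQL3 query strings using the token() function and has not been run against a live cluster; executing them is left to whichever client is in use.

```python
# Token bounds, assuming Murmur3Partitioner (signed 64-bit tokens).
# For RandomPartitioner, pass min_token=0 and max_token=2**127 - 1 instead.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_range(num_workers, min_token=MIN_TOKEN, max_token=MAX_TOKEN):
    """Return num_workers contiguous, non-overlapping (start, end) pairs.

    Both bounds of each pair are inclusive, and together the pairs cover
    every token from min_token to max_token exactly once.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    total = max_token - min_token + 1
    base, extra = divmod(total, num_workers)
    ranges = []
    start = min_token
    for i in range(num_workers):
        # The first `extra` workers take one additional token each so that
        # the remainder of the division is not lost.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def build_query(table, key_column, start, end):
    """Build a CQL3 query selecting rows whose token lies in [start, end].

    The table and key_column arguments are hypothetical placeholders.
    """
    return (
        "SELECT * FROM %s WHERE token(%s) >= %d AND token(%s) <= %d"
        % (table, key_column, start, key_column, end)
    )


if __name__ == "__main__":
    # Each worker would run exactly one of these queries.
    for start, end in split_token_range(4):
        print(build_query("my_table", "key", start, end))
```

Each worker then reads only its own slice of the ring. In practice a single slice may still hold many rows, so a worker would likely need to page through its range rather than fetch it in one query.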

