incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Iterating through large numbers of rows with JDBC
Date Tue, 14 May 2013 18:39:11 GMT
You can iterate over them, just make sure to set a sensible row count to chunk things up.
See http://www.datastax.com/docs/1.2/cql_cli/using/paging#non-ordered-partitioner-paging
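
A minimal sketch of that kind of chunked iteration over plain JDBC, using the token() paging pattern from the page linked above. The class and method names (TokenPager, scan, RowHandler) are made up for illustration, and it assumes a table whose primary key is a single text partition key, and a driver that supports bind markers in prepared CQL3 statements:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, not part of any driver: pages through a table by token order.
final class TokenPager {

    /** Callback invoked once per row key (illustrative interface). */
    interface RowHandler {
        void handle(String key) throws SQLException;
    }

    // First page: no starting point yet, just cap the number of rows.
    static String firstPageQuery(String table, String keyColumn, int pageSize) {
        return "SELECT " + keyColumn + " FROM " + table + " LIMIT " + pageSize;
    }

    // Later pages: resume strictly after the token of the last key seen.
    static String nextPageQuery(String table, String keyColumn, int pageSize) {
        return "SELECT " + keyColumn + " FROM " + table
                + " WHERE token(" + keyColumn + ") > token(?) LIMIT " + pageSize;
    }

    /** Reads the whole table in chunks of pageSize rows; returns the number of rows seen. */
    static long scan(Connection conn, String table, String keyColumn,
                     int pageSize, RowHandler handler) throws SQLException {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        long total = 0;
        String lastKey = null;
        while (true) {
            String cql = (lastKey == null)
                    ? firstPageQuery(table, keyColumn, pageSize)
                    : nextPageQuery(table, keyColumn, pageSize);
            int rowsInPage = 0;
            try (PreparedStatement ps = conn.prepareStatement(cql)) {
                if (lastKey != null) {
                    ps.setString(1, lastKey);
                }
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        lastKey = rs.getString(1);
                        handler.handle(lastKey);
                        rowsInPage++;
                    }
                }
            }
            total += rowsInPage;
            // A short (or empty) page means the end of the ring was reached.
            if (rowsInPage < pageSize) {
                return total;
            }
        }
    }
}
```

Each query returns at most pageSize rows, so no single request has to cover the whole column family. Tables with clustering columns return several rows per partition and would need extra handling that this sketch leaves out.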

You can also break up the processing so that only one worker reads the token ranges for a given node.
That allows you to process the rows in parallel and keeps workers from processing the same rows.
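
A rough sketch of carving the ring into non-overlapping slices, one per worker. It assumes the Murmur3Partitioner (tokens are signed 64-bit values) and simply splits the full range evenly. The real per-node ranges come from the ring itself, so treat the even split and the names TokenRanges, split and rangeQuery as illustrative assumptions:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits the Murmur3 token space into contiguous slices.
final class TokenRanges {

    private static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /**
     * Returns one {startExclusive, endInclusive} pair per worker.
     * The slices are contiguous and together cover (Long.MIN_VALUE, Long.MAX_VALUE].
     */
    static List<long[]> split(int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be positive");
        }
        // BigInteger avoids overflow: the span is 2^64 - 1, too large for a long.
        BigInteger span = MAX.subtract(MIN);
        BigInteger count = BigInteger.valueOf(workers);
        List<long[]> ranges = new ArrayList<long[]>();
        BigInteger start = MIN;
        for (int i = 1; i <= workers; i++) {
            BigInteger end = (i == workers)
                    ? MAX
                    : MIN.add(span.multiply(BigInteger.valueOf(i)).divide(count));
            ranges.add(new long[] { start.longValue(), end.longValue() });
            start = end;
        }
        return ranges;
    }

    // Query a worker would run for its slice; bind the two bounds as longs.
    static String rangeQuery(String table, String keyColumn) {
        return "SELECT " + keyColumn + " FROM " + table
                + " WHERE token(" + keyColumn + ") > ?"
                + " AND token(" + keyColumn + ") <= ?";
    }
}
```

Because every slice is start-exclusive and end-inclusive, no row falls into two slices, so workers never see the same row. A row whose token equals the minimum value would be skipped by the first slice; the sketch assumes no row has that token.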

Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/05/2013, at 2:51 AM, Robert Wille <rwille@footnote.com> wrote:

> Iterating through lots of records is not a primary use of my data.
> However, there are a number of scenarios where scanning the entire contents
> of a column family is an interesting and useful exercise. Here are a few:
> removal of orphaned records, checking the integrity of a data set, and
> analytics.
> 
> On 5/12/13 3:41 AM, "Oleg Dulin" <oleg.dulin@gmail.com> wrote:
> 
>> On 2013-05-11 14:42:32 +0000, Robert Wille said:
>> 
>>> I'm using the JDBC driver to access Cassandra. I'm wondering if it's
>>> possible to iterate through a large number of records (e.g. to perform
>>> maintenance on a large column family). I tried calling
>>> Connection.createStatement(ResultSet.TYPE_FORWARD_ONLY,
>>> ResultSet.CONCUR_READ_ONLY), but it times out, so I'm guessing that
>>> cursors aren't supported. Is there another way to do this, or do I need
>>> to use a different API?
>>> 
>>> Thanks in advance
>>> 
>>> Robert
>> 
>> If you feel that you need to iterate through a large number of rows
>> then you are probably not using a correct data model.
>> 
>> Can you describe your use case ?
>> 
>> -- 
>> Regards,
>> Oleg Dulin
>> NYC Java Big Data Engineer
>> http://www.olegdulin.com/
>> 
>> 
> 
> 

