incubator-cassandra-user mailing list archives

From aaron morton <>
Subject Re: Iterating through large numbers of rows with JDBC
Date Tue, 14 May 2013 18:39:11 GMT
You can iterate over them; just make sure to set a sensible row count so the results come back in manageable chunks.
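For illustration, here's a minimal sketch of that pattern with the cassandra-jdbc
driver: page through the table with LIMIT and resume each chunk with token(). The
keyspace, table, and column names (ks, users, id, name) are made up for the example,
and the connection URL assumes the default Thrift port:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class ChunkedScan {
        public static void main(String[] args) throws Exception {
            Connection conn =
                DriverManager.getConnection("jdbc:cassandra://localhost:9160/ks");

            // Fetch 1000 rows at a time; token() lets each chunk resume
            // where the previous one ended, so no server-side cursor is needed.
            String lastId = null;
            while (true) {
                PreparedStatement ps = (lastId == null)
                    ? conn.prepareStatement("SELECT id, name FROM users LIMIT 1000")
                    : conn.prepareStatement(
                          "SELECT id, name FROM users WHERE token(id) > token(?) LIMIT 1000");
                if (lastId != null) ps.setString(1, lastId);

                ResultSet rs = ps.executeQuery();
                int rows = 0;
                while (rs.next()) {
                    lastId = rs.getString("id");
                    rows++;
                    // ... process the row here ...
                }
                rs.close();
                ps.close();
                if (rows < 1000) break;  // short final chunk: we've seen everything
            }
            conn.close();
        }
    }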

You can also break up the processing so that only one worker reads the token ranges for a given node. That lets you process the rows in parallel while ensuring no two workers process the same rows.
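If it helps, here's a rough sketch of carving the full token range into one slice
per worker, assuming the Murmur3 partitioner (tokens run from -2^63 to 2^63 - 1);
each worker then pages within its own slice exactly as in the sketch above:

    import java.math.BigInteger;

    public class TokenRangeSplit {
        public static void main(String[] args) {
            // Full token range of the Murmur3 partitioner.
            BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
            BigInteger max = BigInteger.valueOf(Long.MAX_VALUE);
            int workers = 4;  // e.g. one worker per node

            BigInteger span = max.subtract(min).divide(BigInteger.valueOf(workers));
            for (int w = 0; w < workers; w++) {
                BigInteger start = min.add(span.multiply(BigInteger.valueOf(w)));
                BigInteger end = (w == workers - 1) ? max : start.add(span);
                // Each worker would scan only its slice, e.g.:
                //   SELECT id, name FROM users
                //   WHERE token(id) > <start> AND token(id) <= <end> LIMIT 1000
                System.out.printf("worker %d scans tokens (%s, %s]%n", w, start, end);
            }
        }
    }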

Aaron Morton
Freelance Cassandra Consultant
New Zealand


On 13/05/2013, at 2:51 AM, Robert Wille <> wrote:

> Iterating through lots of records is not a primary use of my data.
> However, there are a number of scenarios where scanning the entire contents
> of a column family is an interesting and useful exercise. Here are a few:
> removal of orphaned records, checking the integrity of a data set, and
> analytics.
> On 5/12/13 3:41 AM, "Oleg Dulin" <> wrote:
>> On 2013-05-11 14:42:32 +0000, Robert Wille said:
>>> I'm using the JDBC driver to access Cassandra. I'm wondering if it's
>>> possible to iterate through a large number of records (e.g. to perform
>>> maintenance on a large column family). I tried calling
>>> Connection.createStatement(ResultSet.TYPE_FORWARD_ONLY,
>>> ResultSet.CONCUR_READ_ONLY), but it times out, so I'm guessing that
>>> cursors aren't supported. Is there another way to do this, or do I need
>>> to use a different API?
>>> Thanks in advance,
>>> Robert
>> If you feel that you need to iterate through a large number of rows
>> then you are probably not using a correct data model.
>> Can you describe your use case?
>> -- 
>> Regards,
>> Oleg Dulin
>> NYC Java Big Data Engineer
