incubator-cassandra-user mailing list archives

From David McNelis <dmcne...@gmail.com>
Subject Re: Iterating through large numbers of rows with JDBC
Date Tue, 14 May 2013 18:42:27 GMT
Another thing to keep in mind when doing this with CQL is whether or not you
are using an order-preserving partitioner. If you are, and a single partition
key has more rows than your query limit, you can end up stuck in a loop.
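
One way that loop can arise, sketched in Java against an in-memory stand-in
rather than a live cluster. Every name here (PagingSketch, Row, byToken,
withinPartition, scanAll) is hypothetical, and the comments show the kind of
CQL each helper is assumed to stand for. The work-around in scanAll (finish
the current partition before moving to the next token) is one possible
approach, not something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a table; no driver or cluster involved.
class PagingSketch {
    // One CQL row: the token of its partition key plus a clustering value.
    static final class Row {
        final long token;
        final int clustering;
        Row(long token, int clustering) { this.token = token; this.clustering = clustering; }
    }

    // Stand-in for: SELECT ... WHERE token(pk) >  ? LIMIT ?  (inclusive = false)
    //           or: SELECT ... WHERE token(pk) >= ? LIMIT ?  (inclusive = true)
    // Assumes `table` is already sorted by token, then by clustering value.
    static List<Row> byToken(List<Row> table, long token, boolean inclusive, int limit) {
        List<Row> page = new ArrayList<>();
        for (Row r : table) {
            boolean match = inclusive ? r.token >= token : r.token > token;
            if (match && page.size() < limit) page.add(r);
        }
        return page;
    }

    // Stand-in for: SELECT ... WHERE pk = ? AND clustering > ? LIMIT ?
    static List<Row> withinPartition(List<Row> table, long token, int afterClustering, int limit) {
        List<Row> page = new ArrayList<>();
        for (Row r : table) {
            if (r.token == token && r.clustering > afterClustering && page.size() < limit) page.add(r);
        }
        return page;
    }

    // Visits every row once: drain the partition the last page ended in,
    // and only then ask for tokens strictly greater than it.
    static int scanAll(List<Row> table, int limit) {
        int seen = 0;
        List<Row> page = byToken(table, Long.MIN_VALUE, false, limit);
        while (!page.isEmpty()) {
            seen += page.size();
            Row last = page.get(page.size() - 1);
            page = withinPartition(table, last.token, last.clustering, limit);
            if (page.isEmpty()) {
                page = byToken(table, last.token, false, limit);
            }
        }
        return seen;
    }

    // Sample data: the partition with token 10 has 5 rows, the one with token 20 has 2.
    static List<Row> sampleTable() {
        List<Row> table = new ArrayList<>();
        for (int c = 1; c <= 5; c++) table.add(new Row(10L, c));
        for (int c = 1; c <= 2; c++) table.add(new Row(20L, c));
        return table;
    }

    public static void main(String[] args) {
        List<Row> table = sampleTable();
        List<Row> first = byToken(table, Long.MIN_VALUE, false, 2);
        long lastToken = first.get(first.size() - 1).token;
        // Naive continuation: ">=" on the last token returns the same rows again.
        List<Row> again = byToken(table, lastToken, true, 2);
        System.out.println("naive second page starts at clustering " + again.get(0).clustering);
        System.out.println("rows seen by scanAll: " + scanAll(table, 2));
    }
}
```

With a limit of 2 and a five-row partition, the naive ">=" continuation hands
back the page it just returned, which is the loop. Switching to a strict ">"
alone would avoid the loop but skip the remaining rows of that partition, so
the sketch pages on the clustering value first.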


On Tue, May 14, 2013 at 1:39 PM, aaron morton <aaron@thelastpickle.com> wrote:

> You can iterate over them, just make sure to set a sensible row count to
> chunk things up.
> See
> http://www.datastax.com/docs/1.2/cql_cli/using/paging#non-ordered-partitioner-paging
>
> You can also break up the processing so only one worker reads the token
> ranges for a node. That allows you to
> process the rows in parallel and avoid workers processing the same rows.
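
A rough Java sketch of dividing the token space among workers so that no two
workers read the same rows. It assumes the Murmur3Partitioner token range of
Long.MIN_VALUE to Long.MAX_VALUE and cuts it into equal slices; a real setup
would more likely use the ranges each node actually owns, which this sketch
does not look up. TokenRangeSplitter, split and query are hypothetical names,
and the table and key names in the usage are placeholders.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: equal-width token slices, one per worker.
class TokenRangeSplitter {
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    // Returns `workers` pairs {startExclusive, endInclusive} that together
    // cover (MIN, MAX]. A token exactly equal to MIN is not covered here.
    static List<long[]> split(int workers) {
        if (workers < 1) throw new IllegalArgumentException("workers must be >= 1");
        BigInteger width = MAX.subtract(MIN); // BigInteger avoids long overflow
        List<long[]> ranges = new ArrayList<>();
        BigInteger start = MIN;
        for (int i = 1; i <= workers; i++) {
            BigInteger end = (i == workers)
                    ? MAX
                    : MIN.add(width.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(workers)));
            ranges.add(new long[] {start.longValueExact(), end.longValueExact()});
            start = end;
        }
        return ranges;
    }

    // Builds the CQL a worker would run for its slice (still needs a LIMIT and paging).
    static String query(String table, String key, long[] range) {
        return "SELECT * FROM " + table
                + " WHERE token(" + key + ") > " + range[0]
                + " AND token(" + key + ") <= " + range[1];
    }

    public static void main(String[] args) {
        for (long[] range : split(4)) {
            System.out.println(query("my_table", "id", range));
        }
    }
}
```

Each slice is open at the start and closed at the end, so adjacent slices
share a boundary value without overlapping.
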
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 13/05/2013, at 2:51 AM, Robert Wille <rwille@footnote.com> wrote:
>
> Iterating through lots of records is not a primary use of my data.
> However, there are a number of scenarios where scanning the entire contents
> of a column family is an interesting and useful exercise. Here are a few:
> removal of orphaned records, checking the integrity of a data set, and
> analytics.
>
> On 5/12/13 3:41 AM, "Oleg Dulin" <oleg.dulin@gmail.com> wrote:
>
> On 2013-05-11 14:42:32 +0000, Robert Wille said:
>
> I'm using the JDBC driver to access Cassandra. I'm wondering if it's
> possible to iterate through a large number of records (e.g. to perform
> maintenance on a large column family). I tried calling
> Connection.createStatement(ResultSet.TYPE_FORWARD_ONLY,
> ResultSet.CONCUR_READ_ONLY), but it times out, so I'm guessing that
> cursors aren't supported. Is there another way to do this, or do I need
> to use a different API?
>
> Thanks in advance
>
> Robert
>
>
> If you feel that you need to iterate through a large number of rows
> then you are probably not using a correct data model.
>
> Can you describe your use case?
>
> --
> Regards,
> Oleg Dulin
> NYC Java Big Data Engineer
> http://www.olegdulin.com/
>
