cassandra-user mailing list archives

From Kevin Burton <bur...@spinn3r.com>
Subject Re: paging through an entire table in chunks?
Date Sat, 27 Sep 2014 22:57:46 GMT
Agreed, but I’d like to parallelize it. Eventually I’ll have too much
data to do it on one server. Plus, I need suspend/resume, and if I process
about 10MB at a time I’ll be able to suspend and resume as well as
track progress.
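
For what it’s worth, here is a rough sketch of the token() chunking I have in mind. It assumes the cluster uses Murmur3Partitioner (tokens are signed 64-bit values); the class name TokenRanges, the table ks.tbl and the partition key column id are placeholders I made up for illustration, not anything from a real schema or driver API. It only computes the ranges and builds the CQL text; executing each query is left to whatever driver you use.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits the Murmur3 token ring into contiguous chunks. */
final class TokenRanges {
    // Assumption: Murmur3Partitioner, whose tokens span the signed 64-bit range.
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Returns {start, end} pairs, both inclusive, that cover the whole ring. */
    static List<long[]> split(int chunks) {
        if (chunks < 1) {
            throw new IllegalArgumentException("chunks must be >= 1");
        }
        // Total number of tokens is 2^64, which does not fit in a long.
        BigInteger total = MAX.subtract(MIN).add(BigInteger.ONE);
        BigInteger n = BigInteger.valueOf(chunks);
        List<long[]> ranges = new ArrayList<>(chunks);
        for (int i = 0; i < chunks; i++) {
            // Chunk i starts at MIN + floor(total * i / n) and ends just
            // before the start of chunk i + 1, so there are no gaps or overlaps.
            BigInteger start = MIN.add(total.multiply(BigInteger.valueOf(i)).divide(n));
            BigInteger next = MIN.add(total.multiply(BigInteger.valueOf(i + 1L)).divide(n));
            ranges.add(new long[] {
                start.longValueExact(),
                next.subtract(BigInteger.ONE).longValueExact()
            });
        }
        return ranges;
    }

    /** Builds the CQL text for one chunk; table and key names are caller-supplied. */
    static String query(String table, String partitionKey, long[] range) {
        return "SELECT * FROM " + table
            + " WHERE token(" + partitionKey + ") >= " + range[0]
            + " AND token(" + partitionKey + ") <= " + range[1];
    }
}
```

Each task would then take one entry from TokenRanges.split(1024) and run TokenRanges.query("ks.tbl", "id", range). Recording which chunk indexes have completed should be enough for suspend/resume and progress tracking, since the chunks are independent of each other.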

On Sat, Sep 27, 2014 at 2:52 PM, DuyHai Doan <doanduyhai@gmail.com> wrote:

> Use the java driver and paging feature:
> http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/Statement.html#setFetchSize(int)
>
> 1) Run "SELECT * FROM" on the table without any WHERE clause
> 2) Set fetchSize to a sensible value
> 3) Execute the query and get an iterator from the ResultSet
> 4) Iterate
>
>
>
> On Sat, Sep 27, 2014 at 11:42 PM, Kevin Burton <burton@spinn3r.com> wrote:
>
>> I need a way to do a full table scan across all of our data.
>>
>> Can’t I just use token() for this?
>>
>> This way I could split up our entire keyspace into say 1024 chunks, and
>> then have one activemq task work with range 0, then range 1, etc… that way
>> I can easily just map() my whole table.
>>
>> and since it’s token() I should (generally) read a contiguous range from
>> a given table.
>>
>> --
>>
>> Founder/CEO Spinn3r.com
>> Location: *San Francisco, CA*
>> blog: http://burtonator.wordpress.com
>> … or check out my Google+ profile
>> <https://plus.google.com/102718274791889610666/posts>
>> <http://spinn3r.com>
>>
>>
>


