cassandra-user mailing list archives

From Benjamin Roth <benjamin.r...@jaumo.com>
Subject Re: parallel processing - splitting data
Date Thu, 19 Jan 2017 12:31:19 GMT
I meant the whole global token range, which is -(2^64 / 2) to (2^64 / 2) - 1,
i.e. -2^63 to 2^63 - 1 with the default Murmur3Partitioner.
I remember there are classes that already generate the right slices, but I
don't know offhand which one it was.
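
To illustrate the slicing, here is a minimal sketch rather than the driver's own helper. It assumes the default Murmur3Partitioner (signed 64-bit tokens); the class name TokenSlices, the table placeholder, and the partition key column pk are hypothetical. BigInteger is used because the range width (2^64) does not fit in a long.

```java
import java.math.BigInteger;

public class TokenSlices {
    // Token bounds under the assumed Murmur3Partitioner: -2^63 .. 2^63 - 1
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Returns n contiguous slices, each as an inclusive {start, end} pair. */
    static long[][] split(int n) {
        BigInteger width = MAX.subtract(MIN).add(BigInteger.ONE); // 2^64 tokens
        BigInteger parts = BigInteger.valueOf(n);
        long[][] slices = new long[n][2];
        for (int i = 0; i < n; i++) {
            // Slice i starts at MIN + width * i / n; the next slice's start minus 1 is its end.
            BigInteger start = MIN.add(width.multiply(BigInteger.valueOf(i)).divide(parts));
            BigInteger next = MIN.add(width.multiply(BigInteger.valueOf(i + 1)).divide(parts));
            slices[i][0] = start.longValueExact();
            slices[i][1] = next.subtract(BigInteger.ONE).longValueExact();
        }
        return slices;
    }

    public static void main(String[] args) {
        // One slice per node; "pk" is a hypothetical partition key column.
        for (long[] s : split(4)) {
            System.out.printf("... WHERE token(pk) >= %d AND token(pk) <= %d%n", s[0], s[1]);
        }
    }
}
```

Each node would then page through one slice with a token() range predicate like the one printed above. Because the slices are computed from the global range, they are identical on every node.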

2017-01-19 13:29 GMT+01:00 Frank Hughes <frankhughes782@gmail.com>:

> I have tried to retrieve the token ranges and slice them into 4, but the
> response I get from the following code is different on each node:
>
> TokenRange[] tokenRanges =
>     unwrapTokenRanges(metadata.getTokenRanges(keyspaceName, localHost))
>         .toArray(new TokenRange[0]);
>
> On each node, the 1024 token ranges are different, so I'm not sure how to
> do the split.
>
> e.g. from node 1
>
> Token ranges - start:-5144720537407094184 end:-5129226025397315327
>
> This token range isn't returned by node 2, 3 or 4.
>
> Thanks again
>
> Frank
>
> On 19 January 2017 at 12:19, Benjamin Roth <benjamin.roth@jaumo.com>
> wrote:
>
>> If you have 4 nodes with RF 4, then all data is on every node. So you can
>> just slice the whole token range into 4 pieces and let each node process 1
>> slice.
>> Determining local ranges also only helps if you read with consistency
>> level ONE.
>>
>> 2017-01-19 13:05 GMT+01:00 Frank Hughes <frankhughes782@gmail.com>:
>>
>>> Hello there,
>>>
>>> I'm running a 4 node cluster of Cassandra 3.9 with a replication factor
>>> of 4.
>>>
>>> I want to be able to run a Java process on each node, selecting only
>>> 25% of the data on each node,
>>> so I can process all of the data in parallel across the nodes.
>>>
>>> What is the best way to do this with the Java driver?
>>>
>>> I was assuming I could retrieve the token ranges for each node and page
>>> through the data using these ranges, but this includes the replicated data.
>>> I was hoping there was a way of selecting only the data that a node is
>>> responsible for and avoiding the replicated data.
>>>
>>> Many thanks for any help and guidance,
>>>
>>> Frank Hughes
>>>
>>
>>
>>
>> --
>> Benjamin Roth
>> Prokurist
>>
>> Jaumo GmbH · www.jaumo.com
>> Wehrstraße 46 · 73035 Göppingen · Germany
>> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
>> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>>
>
>


