incubator-cassandra-user mailing list archives

From Michael Kjellman <mkjell...@barracuda.com>
Subject Re: hadoop consistency level
Date Thu, 18 Oct 2012 20:34:24 GMT
Not sure I understand your question (if there is one..)

You are more than welcome to use CL ONE, and assuming you have Hadoop nodes
in the right places on your ring, things could work out very nicely. If you
need to guarantee that your job sees all the data, then you'll need to use
QUORUM.
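
To put numbers on the ONE vs. QUORUM trade-off, here is a minimal sketch of the usual replica arithmetic: QUORUM is floor(RF / 2) + 1 replicas, and a read is only guaranteed to see the latest write when the read and write replica counts together exceed the replication factor. The class name, method names, and the RF of 3 are hypothetical, chosen for illustration; none of this is Cassandra code.

```java
public class QuorumSketch {
    // Number of replicas that must respond for a QUORUM operation.
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1; // integer division = floor(RF / 2)
    }

    // True when the set of replicas read must overlap the set of replicas
    // that acknowledged the write, so the read sees the latest value.
    public static boolean readsSeeLatestWrite(int readReplicas,
                                              int writeReplicas,
                                              int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor, for illustration only
        int q = quorum(rf);
        System.out.println("QUORUM for RF=" + rf + " is " + q);
        System.out.println("read ONE, write ONE overlaps:       "
                + readsSeeLatestWrite(1, 1, rf));
        System.out.println("read QUORUM, write QUORUM overlaps: "
                + readsSeeLatestWrite(q, q, rf));
    }
}
```

With RF=3, reading at ONE after writing at ONE can miss data because a single local replica may be stale, while QUORUM on both sides always overlaps on at least one replica.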

If you don't specify a CL in your job config, it will default to ONE (at
least that's what my read of the ConfigHelper source for 1.1.6 shows).
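
The "default to ONE" behaviour can be pictured as a lookup with a fallback value. The sketch below imitates that pattern with a plain java.util.Properties object so it runs without Hadoop or Cassandra on the classpath. The property key string and the class and method names are assumptions made for illustration, not confirmed against the 1.1.6 source; in a real job the value would be set through ConfigHelper on the Hadoop Configuration, so check the ConfigHelper source for your version for the exact setter and key.

```java
import java.util.Properties;

public class ReadConsistencyDefaultSketch {
    // Assumed property key, for illustration only; verify against ConfigHelper.
    public static final String READ_CL_KEY = "cassandra.consistencylevel.read";

    // Returns the configured read consistency level, or "ONE" when unset.
    public static String readConsistencyLevel(Properties jobConf) {
        return jobConf.getProperty(READ_CL_KEY, "ONE");
    }

    public static void main(String[] args) {
        Properties jobConf = new Properties(); // stand-in for the job config

        // Nothing set yet, so the lookup falls back to the default.
        System.out.println(readConsistencyLevel(jobConf));

        // Explicitly asking for QUORUM overrides the default.
        jobConf.setProperty(READ_CL_KEY, "QUORUM");
        System.out.println(readConsistencyLevel(jobConf));
    }
}
```

The point of the pattern is that a job which never mentions a CL silently reads at ONE, so a job that needs QUORUM has to set it explicitly.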

On 10/18/12 1:29 PM, "Andrey Ilinykh" <ailinykh@gmail.com> wrote:

>On Thu, Oct 18, 2012 at 1:24 PM, Michael Kjellman
><mkjellman@barracuda.com> wrote:
>> Well there is *some* data locality, it's just not guaranteed. My
>> understanding (and someone correct me if I'm wrong) is that
>> ColumnFamilyInputFormat implements InputSplit and the getLocations()
>> method.
>>
>> 
>> http://hadoop.apache.org/docs/mapreduce/current/api/org/apache/hadoop/mapreduce/InputSplit.html
>>
>> ColumnFamilySplit.java contains logic that does its best to determine
>> which node holds the data for that particular mapper.
>>
>But there is no guarantee that local data is in sync with other nodes,
>which means you have CL ONE. If you want CL QUORUM you have to make a
>remote call, no matter whether the data is local or not.

