incubator-cassandra-user mailing list archives

From Patrik Modesto <patrik.mode...@gmail.com>
Subject Re: newer Cassandra + Hadoop = TimedOutException()
Date Tue, 06 Mar 2012 12:03:53 GMT
I've tried cassandra-all 0.8.10 with the rpc_endpoints == "0.0.0.0" bug
fixed, but the result is the same: there are still tasks over 1000%.
The only change is that the debug output now shows real host names
instead of 0.0.0.0.
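
For anyone following along, here is a minimal sketch of the kind of bug in question. It is not the actual ColumnFamilyInputFormat code: the class and method names below are made up for illustration, and the fallback to the node's own endpoint is only my assumption about what the comparison is there for.

```java
// Hypothetical illustration only; NOT the real ColumnFamilyInputFormat source.
final class RpcEndpointCheck {
    static final String WILDCARD = "0.0.0.0";

    // Buggy variant: == compares object references, so a string created at
    // runtime (e.g. deserialized from a Thrift response) does not match the
    // literal even when the characters are identical.
    static boolean isWildcardBuggy(String rpcEndpoint) {
        return rpcEndpoint == WILDCARD;
    }

    // Fixed variant: equals() compares the character content.
    static boolean isWildcardFixed(String rpcEndpoint) {
        return WILDCARD.equals(rpcEndpoint);
    }

    // Assumed behaviour: if the node reports the wildcard RPC address,
    // use its regular endpoint as the host name instead.
    static String hostFor(String endpoint, String rpcEndpoint) {
        return isWildcardFixed(rpcEndpoint) ? endpoint : rpcEndpoint;
    }

    public static void main(String[] args) {
        // new String(...) forces a distinct object, like a value read off the wire.
        String fromWire = new String("0.0.0.0");
        System.out.println("buggy check: " + isWildcardBuggy(fromWire));
        System.out.println("fixed check: " + isWildcardFixed(fromWire));
        System.out.println("host used:   " + hostFor("10.0.0.5", fromWire));
    }
}
```

With the fixed comparison the wildcard is detected and a real host name is used, which matches what I now see in the debug output.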

Reconfiguring the whole cluster is not possible, so I can't test with
"rpc_address" commented out.

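On the progress figures themselves: if the explanation quoted below is right and progress is estimated against cassandra.input.split.size, the percentages tell us roughly how many rows each task really reads. This is only a back-of-the-envelope sketch; the formula is my assumption based on that explanation, not taken from the Hadoop or Cassandra source, and the class and method names are made up.

```java
// Hypothetical helper; the progress formula is an assumption, not Hadoop's code.
final class SplitProgress {
    // Progress as a percentage of the number of rows the split was expected to hold.
    static double progressPercent(long rowsRead, long expectedRows) {
        return 100.0 * rowsRead / expectedRows;
    }

    // Inverse: how many rows a reported percentage would correspond to.
    static long impliedRows(double percent, long expectedRows) {
        return Math.round(percent / 100.0 * expectedRows);
    }

    public static void main(String[] args) {
        long splitSize = 16384; // our cassandra.input.split.size
        double[] reported = {9076.81, 9639.04, 10538.60, 9364.17};
        for (double p : reported) {
            System.out.printf("%.2f%% -> about %d rows%n", p, impliedRows(p, splitSize));
        }
    }
}
```

Under that assumption, 9076.81% would mean a single task reading roughly 1.49 million rows instead of about 16384.
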
Regards,
P.


On Tue, Mar 6, 2012 at 12:26, Florent Lefillâtre <flefilla@gmail.com> wrote:
> I remember a bug in the ColumnFamilyInputFormat class in 0.8.10.
> It tested rpc_endpoints == "0.0.0.0" instead of
> rpc_endpoint.equals("0.0.0.0"); maybe that can help you.
>
> On 6 March 2012 at 12:18, Florent Lefillâtre <flefilla@gmail.com> wrote:
>
>> Excuse me, I had misunderstood.
>> So, for me, the problem comes from the change in the ColumnFamilyInputFormat
>> class between 0.8.7 and 0.8.10, where the splits are created (0.8.7 uses
>> endpoints and 0.8.10 uses rpc_endpoints).
>> With your config, the split fails, so Hadoop doesn't run a map task on
>> approximately 16384 rows (your cassandra.input.split.size) but on all the
>> rows of a node (certainly more than 16384).
>> However, Hadoop estimates the task progress against 16384 inputs, which is
>> why you see something like 9076.81%.
>>
>> If you can't change the rpc_address configuration, I don't know how you can
>> solve your problem :/, sorry.
>>
>> On 6 March 2012 at 11:53, Patrik Modesto <patrik.modesto@gmail.com> wrote:
>>
>>> Hi Florent,
>>>
>>> I haven't changed the server version; it is Cassandra 0.8.10. I
>>> only change the version of cassandra-all in the pom.xml of the
>>> mapreduce job.
>>>
>>> I have 'rpc_address: 0.0.0.0' in cassandra.yaml because I want
>>> Cassandra to bind RPC to all interfaces.
>>>
>>> Regards,
>>> P.
>>>
>>> On Tue, Mar 6, 2012 at 09:44, Florent Lefillâtre <flefilla@gmail.com>
>>> wrote:
>>> > Hi, I had the same problem on Hadoop 0.20.2 and Cassandra 1.0.5.
>>> > In my case the split of the token range failed.
>>> > I commented out the line 'rpc_address: 0.0.0.0' in cassandra.yaml.
>>> > Maybe check whether your configuration changed between 0.8.7 and
>>> > 0.8.10.
>>> >
>>> >
>>> > On 6 March 2012 at 09:32, Patrik Modesto <patrik.modesto@gmail.com>
>>> > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I was recently trying a Hadoop job with cassandra-all 0.8.10 again,
>>> >> and the timeouts I get are not because Cassandra can't handle the
>>> >> requests. I've noticed several tasks that show progress of
>>> >> several thousand percent. It seems they are looping over their range
>>> >> of keys. I've run the job with debug enabled and the ranges look OK, see
>>> >> http://pastebin.com/stVsFzLM
>>> >>
>>> >> Another difference between cassandra-all 0.8.7 and 0.8.10 is the
>>> >> number of mappers the job creates:
>>> >> 0.8.7: 4680
>>> >> 0.8.10: 595
>>> >>
>>> >> Task       Complete
>>> >> task_201202281457_2027_m_000041 9076.81%
>>> >> task_201202281457_2027_m_000073 9639.04%
>>> >> task_201202281457_2027_m_000105 10538.60%
>>> >> task_201202281457_2027_m_000108 9364.17%
>>> >>
>>> >> None of this happens with cassandra-all 0.8.7.
>>> >>
>>> >> Regards,
>>> >> P.
>>> >>
>>> >>
>>> >>
>>> >> On Tue, Feb 28, 2012 at 12:29, Patrik Modesto
>>> >> <patrik.modesto@gmail.com> wrote:
>>> >> > I'll alter these settings and will let you know.
>>> >> >
>>> >> > Regards,
>>> >> > P.
>>> >> >
>>> >> > On Tue, Feb 28, 2012 at 09:23, aaron morton
>>> >> > <aaron@thelastpickle.com> wrote:
>>> >> >> Have you tried lowering the batch size and increasing the timeout?
>>> >> >> Even just to get it to work.
>>> >> >>
>>> >> >> If you get a TimedOutException, it means the number of servers
>>> >> >> required by the consistency level (CL) did not respond in time.
>>> >> >>
>>> >> >> Cheers
>>> >> >>
>>> >> >> -----------------
>>> >> >> Aaron Morton
>>> >> >> Freelance Developer
>>> >> >> @aaronmorton
>>> >> >> http://www.thelastpickle.com
>>> >> >>
>>> >> >> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
>>> >> >>
>>> >> >> Hi aaron,
>>> >> >>
>>> >> >> these are our current settings:
>>> >> >>
>>> >> >>      <property>
>>> >> >>          <name>cassandra.range.batch.size</name>
>>> >> >>          <value>1024</value>
>>> >> >>      </property>
>>> >> >>
>>> >> >>      <property>
>>> >> >>          <name>cassandra.input.split.size</name>
>>> >> >>          <value>16384</value>
>>> >> >>      </property>
>>> >> >>
>>> >> >> rpc_timeout_in_ms: 30000
>>> >> >>
>>> >> >> Regards,
>>> >> >> P.
>>> >> >>
>>> >> >> On Mon, Feb 27, 2012 at 21:54, aaron morton
>>> >> >> <aaron@thelastpickle.com> wrote:
>>> >> >>
>>> >> >> What settings do you have for cassandra.range.batch.size and
>>> >> >> rpc_timeout_in_ms? Have you tried reducing the first and/or
>>> >> >> increasing the second?
>>> >> >>
>>> >> >>
>>> >> >> Cheers
>>> >> >>
>>> >> >> -----------------
>>> >> >> Aaron Morton
>>> >> >> Freelance Developer
>>> >> >> @aaronmorton
>>> >> >> http://www.thelastpickle.com
>>> >> >>
>>> >> >>
>>> >> >> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
>>> >> >>
>>> >> >>
>>> >> >> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo
>>> >> >> <edlinuxguru@gmail.com> wrote:
>>> >> >>
>>> >> >> Did you see the notes here?
>>> >> >>
>>> >> >> I'm not sure what you mean by the notes.
>>> >> >>
>>> >> >> I'm using the mapred.* settings suggested there:
>>> >> >>
>>> >> >>
>>> >> >>     <property>
>>> >> >>         <name>mapred.max.tracker.failures</name>
>>> >> >>         <value>20</value>
>>> >> >>     </property>
>>> >> >>
>>> >> >>     <property>
>>> >> >>         <name>mapred.map.max.attempts</name>
>>> >> >>         <value>20</value>
>>> >> >>     </property>
>>> >> >>
>>> >> >>     <property>
>>> >> >>         <name>mapred.reduce.max.attempts</name>
>>> >> >>         <value>20</value>
>>> >> >>     </property>
>>> >> >>
>>> >> >>
>>> >> >> But I still see timeouts that I don't get with cassandra-all
>>> >> >> 0.8.7.
>>> >> >>
>>> >> >>
>>> >> >> P.
>>> >> >>
>>> >> >>
>>> >> >> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >
>>> >
>>
>>
>
