incubator-cassandra-user mailing list archives

From Jeremy Hanna <jeremy.hanna1...@gmail.com>
Subject Re: newer Cassandra + Hadoop = TimedOutException()
Date Tue, 06 Mar 2012 08:49:22 GMT
You may be running into https://issues.apache.org/jira/browse/CASSANDRA-3942, though I'm not
sure it really affects the execution of the job itself.

On Mar 6, 2012, at 2:32 AM, Patrik Modesto wrote:

> Hi,
> 
> I recently tried a Hadoop job with cassandra-all 0.8.10 again, and the
> timeouts I get are not because Cassandra can't handle the requests.
> I've noticed several tasks that show progress of several thousand
> percent. It seems they are looping over their range of keys. I ran the
> job with debug enabled and the ranges look OK, see
> http://pastebin.com/stVsFzLM
> 
> Another difference between cassandra-all 0.8.7 and 0.8.10 is the
> number of mappers the job creates:
> 0.8.7: 4680
> 0.8.10: 595
> 
> Task                               Complete
> task_201202281457_2027_m_000041    9076.81%
> task_201202281457_2027_m_000073    9639.04%
> task_201202281457_2027_m_000105    10538.60%
> task_201202281457_2027_m_000108    9364.17%
> 
> None of this happens with cassandra-all 0.8.7.
> 
> Regards,
> P.
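
The progress figures above can be read as a multiple of the expected split size. The sketch below assumes (this is not stated in the thread) that a map task reports progress as rows read divided by the configured cassandra.input.split.size, which is 16384 in the settings quoted further down. Under that assumption, 9076.81% to 10538.60% means each task read roughly 90 to 105 times the rows its split was expected to hold. The function name is hypothetical.

```python
# Assumption: progress = rows_read / expected_rows_per_split.
# SPLIT_SIZE is cassandra.input.split.size from the quoted settings.
SPLIT_SIZE = 16384

# Progress values copied from the task list in the message above.
reported_progress = {
    "task_201202281457_2027_m_000041": 9076.81,
    "task_201202281457_2027_m_000073": 9639.04,
    "task_201202281457_2027_m_000105": 10538.60,
    "task_201202281457_2027_m_000108": 9364.17,
}


def estimated_rows_read(progress_percent, split_size=SPLIT_SIZE):
    """Estimate rows iterated by a task from its reported progress."""
    return round(progress_percent / 100.0 * split_size)


for task, percent in reported_progress.items():
    print(task, estimated_rows_read(percent))

# Ratio of mapper counts between cassandra-all 0.8.7 and 0.8.10.
mapper_ratio = round(4680 / 595, 2)
print("mapper ratio:", mapper_ratio)
```

If the assumption holds, a task well above 100% is reading far more rows than its split should contain, which would fit the suspicion that tasks are looping over their key range.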
> 
> 
> 
> On Tue, Feb 28, 2012 at 12:29, Patrik Modesto <patrik.modesto@gmail.com> wrote:
>> I'll alter these settings and will let you know.
>> 
>> Regards,
>> P.
>> 
>> On Tue, Feb 28, 2012 at 09:23, aaron morton <aaron@thelastpickle.com> wrote:
>>> Have you tried lowering the batch size and increasing the timeout? Even
>>> just to get it to work.
>>> 
>>> If you get a TimedOutException it means that the number of servers required
>>> by the consistency level (CL) did not respond in time.
>>> 
>>> Cheers
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
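
For reference, the number of replicas that must answer in time depends on the consistency level and the replication factor. A minimal sketch of the standard arithmetic for ONE, QUORUM, and ALL follows. The function name is hypothetical, and the thread does not say which consistency level or replication factor the job uses.

```python
def required_responses(consistency_level, replication_factor):
    """Return how many replicas must respond for the request to succeed."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: " + consistency_level)


for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_responses(level, 3))
```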
>>> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
>>> 
>>> Hi aaron,
>>> 
>>> these are our current settings:
>>> 
>>>      <property>
>>>          <name>cassandra.range.batch.size</name>
>>>          <value>1024</value>
>>>      </property>
>>> 
>>>      <property>
>>>          <name>cassandra.input.split.size</name>
>>>          <value>16384</value>
>>>      </property>
>>> 
>>> rpc_timeout_in_ms: 30000
>>> 
>>> Regards,
>>> P.
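
To illustrate the trade-off behind lowering cassandra.range.batch.size, the sketch below counts how many range requests one split needs. It assumes that each request returns at most cassandra.range.batch.size rows and that a split holds about cassandra.input.split.size rows. Both are simplifications, and the function name is hypothetical.

```python
import math


def requests_per_split(split_size, batch_size):
    """Count the range requests needed to page through one split."""
    if split_size < 0 or batch_size < 1:
        raise ValueError("split_size must be >= 0 and batch_size >= 1")
    return math.ceil(split_size / batch_size)


# Values from the settings quoted above.
current = requests_per_split(16384, 1024)
# A smaller batch means more requests, each covering fewer rows.
smaller_batch = requests_per_split(16384, 256)
print(current, smaller_batch)
```

A smaller batch gives each request less work to finish within rpc_timeout_in_ms, at the cost of more round trips per split.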
>>> 
>>> On Mon, Feb 27, 2012 at 21:54, aaron morton <aaron@thelastpickle.com> wrote:
>>> 
>>> What settings do you have for cassandra.range.batch.size
>>> and rpc_timeout_in_ms? Have you tried reducing the first and/or increasing
>>> the second?
>>> 
>>> Cheers
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
>>> 
>>> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo <edlinuxguru@gmail.com>
>>> wrote:
>>> 
>>> Did you see the notes here?
>>> 
>>> I'm not sure what you mean by the notes.
>>> 
>>> I'm using the mapred.* settings suggested there:
>>> 
>>>     <property>
>>>         <name>mapred.max.tracker.failures</name>
>>>         <value>20</value>
>>>     </property>
>>>     <property>
>>>         <name>mapred.map.max.attempts</name>
>>>         <value>20</value>
>>>     </property>
>>>     <property>
>>>         <name>mapred.reduce.max.attempts</name>
>>>         <value>20</value>
>>>     </property>
>>> 
>>> But I still see the timeouts that I didn't get with cassandra-all 0.8.7.
>>> 
>>> P.
>>> 
>>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
>>> 
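
When comparing settings between job runs, it can help to read the property blocks programmatically. The sketch below parses Hadoop-style property XML into a dictionary. The enclosing configuration element is an assumption, since the thread quotes only the property fragments, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Property values copied from the thread; the <configuration> root is assumed.
SNIPPET = """
<configuration>
    <property>
        <name>mapred.max.tracker.failures</name>
        <value>20</value>
    </property>
    <property>
        <name>mapred.map.max.attempts</name>
        <value>20</value>
    </property>
    <property>
        <name>mapred.reduce.max.attempts</name>
        <value>20</value>
    </property>
</configuration>
"""


def read_properties(xml_text):
    """Map each property name to its value, both as stripped strings."""
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            properties[name.strip()] = (value or "").strip()
    return properties


settings = read_properties(SNIPPET)
for name, value in sorted(settings.items()):
    print(name, "=", value)
```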

