hadoop-mapreduce-user mailing list archives

From Arko Provo Mukherjee <arkoprovomukher...@gmail.com>
Subject Re: Mappers getting killed
Date Mon, 31 Oct 2011 23:41:02 GMT

Hi,

I used the setStatus method and now my mappers are not getting killed
anymore.

Thanks a lot!

Warm regards
Arko
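
For readers landing on this thread, here is a minimal sketch of the idea behind
the fix: report progress at regular intervals while a long read is running, so
the task is never silent for longer than the timeout. It uses only the JDK.
ProgressReporter, SideFileLoader and linesPerReport are hypothetical names made
up for this illustration; in a real Mapper the callback would be
context.progress() or context.setStatus(...), as suggested in the quoted reply.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the callback; a Mapper would call context.progress() here. */
interface ProgressReporter {
    void progress();
}

final class SideFileLoader {
    /**
     * Reads every line from the source and calls reporter.progress()
     * once per linesPerReport lines, so a slow read keeps reporting.
     */
    static List<String> load(Reader source, int linesPerReport, ProgressReporter reporter) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
                if (lines.size() % linesPerReport == 0) {
                    reporter.progress();
                }
            }
        } catch (IOException e) {
            // Rethrow unchecked to keep the sketch's signature simple.
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

Counting lines rather than measuring elapsed time is a simplification; the
interval only has to be small enough that two reports are never more than the
configured timeout apart.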

On Thu, Oct 27, 2011 at 4:31 AM, Lucian Iordache <
lucian.george.iordache@gmail.com> wrote:

> Hi,
>
> Your map method probably takes too long to process the data. You could call
> context.progress() or context.setStatus("status") in your map method
> from time to time (at least once every 600 seconds, to avoid the timeout).
>
> Regards,
> Lucian
>
>
> On Thu, Oct 27, 2011 at 11:22 AM, Arko Provo Mukherjee <
> arkoprovomukherjee@gmail.com> wrote:
>
>> Hi,
>>
>> I have a situation where I have to read a large file into every mapper.
>>
>> Since it's a large HDFS file that is needed to process each input to the
>> mapper, reading the data from HDFS into memory takes a long time.
>>
>> Thus the system is killing all my Mappers with the following message:
>>
>> 11/10/26 22:54:52 INFO mapred.JobClient: Task Id :
>> attempt_201106271322_12504_m_000000_0, Status : FAILED
>> Task attempt_201106271322_12504_m_000000_0 failed to report status for
>> 601 seconds. Killing!
>>
>> The cluster is not entirely owned by me and hence I cannot change
>> mapred.task.timeout so as to be able to read the entire file.
>>
>> Any suggestions?
>>
>> Also, is there a way for a Mapper instance to read the file once for
>> all the inputs that it receives?
>> Currently, since the file reading code is in the map method, I guess it's
>> reading the entire file for each and every input, leading to a lot of
>> overhead.
>>
>> Please help!
>>
>> Many thanks in advance!!
>>
>> Warm regards
>> Arko
>>
>
>
>
> --
> All the best,
> Lucian
>
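
The second question in the original post (reading the file once per Mapper
instead of once per input) is not answered in the thread. One common approach,
sketched here under assumptions, is to load the data in a method that runs once
per task and to keep it in a field; in Hadoop's org.apache.hadoop.mapreduce API
that role is played by Mapper.setup(Context), which is called once before any
map() call. The class below is a JDK-only stand-in, not Hadoop code:
LookupMapperSketch and sideFileReader are hypothetical names, and the "file" is
modelled as a set of strings.

```java
import java.util.Set;
import java.util.function.Supplier;

final class LookupMapperSketch {
    // Hypothetical: whatever reads the large side file and returns its contents.
    private final Supplier<Set<String>> sideFileReader;
    // Cached contents, filled once in setup() and reused by every map() call.
    private Set<String> lookup;

    LookupMapperSketch(Supplier<Set<String>> sideFileReader) {
        this.sideFileReader = sideFileReader;
    }

    /** Stand-in for Mapper.setup(Context): runs once, before any record is mapped. */
    void setup() {
        lookup = sideFileReader.get();
    }

    /** Stand-in for map(): consults the cached data and never re-reads the file. */
    boolean map(String record) {
        return lookup.contains(record);
    }
}
```

With this layout the expensive read happens once per task, however many inputs
the task receives. If that single read is itself slow, it still needs to report
progress while it runs.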
