Date: Thu, 27 Oct 2011 03:22:26 -0500
Subject: Mappers getting killed
From: Arko Provo Mukherjee <arkoprovomukherjee@gmail.com>
To: mapreduce-user@hadoop.apache.org

Hi,

I have a situation where I have to read a large file into every mapper.

Since it's a large HDFS file that is needed for each input to the mapper, reading the data into memory from HDFS takes a lot of time.

Thus the system is killing all my Mappers with the following message:
11/10/26 22:54:52 INFO mapred.JobClient: Task Id : attempt_201106271322_12504_m_000000_0, Status : FAILED
Task attempt_201106271322_12504_m_000000_0 failed to report status for 601 seconds. Killing!


The cluster is not entirely owned by me, so I cannot change mapred.task.timeout to allow enough time to read the entire file.

Any suggestions?
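[Editor's note: one common workaround that does not require changing mapred.task.timeout is to report progress from inside the long read loop, so the framework knows the task is alive even though no records are being emitted. A minimal sketch, assuming the newer org.apache.hadoop.mapreduce API; the HDFS path and the reporting interval are illustrative, not from the original message:]

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class KeepAliveMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Hypothetical side file; substitute the real HDFS path.
        Path sideFile = new Path("/data/large-side-file.txt");
        FileSystem fs = FileSystem.get(context.getConfiguration());
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(fs.open(sideFile)));
        try {
            long lines = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                // ... use the line ...
                if (++lines % 100000 == 0) {
                    // Ping the framework so the inactivity timer resets
                    // well before the 600-second limit is reached.
                    context.progress();
                }
            }
        } finally {
            reader.close();
        }
        // ... emit output for this input record ...
    }
}
```

[context.setStatus(String) can be used the same way if a human-readable status message is also wanted.]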

Also, is there a way for a Mapper instance to read the file once for all the inputs that it receives?
Currently, since the file-reading code is in the map method, I guess it's reading the entire file for each and every input, leading to a lot of overhead.
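[Editor's note: reading once per mapper is usually done by moving the read out of map() into the Mapper's setup() hook (configure() in the old mapred API), so each task attempt reads the file a single time and reuses the cached contents for every record. A sketch under the same assumptions as above; the path and field names are illustrative:]

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ReadOnceMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Populated once per task attempt in setup(), then reused by every map() call.
    private final List<String> sideData = new ArrayList<String>();

    @Override
    protected void setup(Context context)
            throws IOException, InterruptedException {
        Path sideFile = new Path("/data/large-side-file.txt"); // hypothetical path
        FileSystem fs = FileSystem.get(context.getConfiguration());
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(fs.open(sideFile)));
        try {
            long lines = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                sideData.add(line);
                if (++lines % 100000 == 0) {
                    context.progress(); // keep the task alive during the long read
                }
            }
        } finally {
            reader.close();
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // sideData is already in memory here; no per-record HDFS read.
        // ... combine sideData with the incoming record and write output ...
    }
}
```

[For a large read-only file needed by every task, the DistributedCache is another common option, shipping a local copy to each node instead of re-reading from HDFS.]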

Please help!

Many thanks in advance!!

Warm regards
Arko