Date: Tue, 10 May 2011 13:50:47 +0000 (UTC)
From: "Devaraj K (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Message-ID: <29875759.171.1305035447497.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1334234840.163.1305035327488.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MAPREDUCE-2481) SocketTimeoutException is coming in the reduce task when the data size is very high

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031188#comment-13031188 ]

Devaraj K commented on MAPREDUCE-2481:
--------------------------------------

The reader has already consumed several other fields by this point, so a network disconnect does not look like the cause: the problem always occurs while reading the same field (the isMap property of the TaskCompletionEvent object).

{code:title=TaskCompletionEvent.java|borderStyle=solid}
  public void readFields(DataInput in) throws IOException {
    taskId.readFields(in);
    idWithinJob = WritableUtils.readVInt(in);
    isMap = in.readBoolean();  // the read that times out
{code}

> SocketTimeoutException is coming in the reduce task when the data size is very high
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2481
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2481
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.20.2
>            Reporter: Devaraj K
>
> A SocketTimeoutException occurs when the reduce task tries to read a MapTaskCompletionEventsUpdate object from the task tracker. It reads the reset, TaskCompletionEvent.taskId, and TaskCompletionEvent.idWithinJob properties successfully, then fails while reading the isMap property of TaskCompletionEvent, which is of type boolean. The exception occurs repeatedly.
> {code}
> 2011-04-20 15:58:03,037 FATAL mapred.TaskTracker (TaskTracker.java:fatalError(2812)) - Task: attempt_201104201115_0010_r_000002_0 - Killed : java.io.IOException: Tried for the max ping retries On TimeOut :1
>     at org.apache.hadoop.ipc.Client.checkPingRetries(Client.java:1342)
>     at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:402)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>     at java.io.DataInputStream.readBoolean(DataInputStream.java:225)
>     at org.apache.hadoop.mapred.TaskCompletionEvent.readFields(TaskCompletionEvent.java:230)
>     at org.apache.hadoop.mapred.MapTaskCompletionEventsUpdate.readFields(MapTaskCompletionEventsUpdate.java:64)
>     at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:245)
>     at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:69)
>     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:698)
>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:593)
> Caused by: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/127.0.0.1:45798 remote=/127.0.0.1:35419]
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:165)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>     at java.io.FilterInputStream.read(FilterInputStream.java:116)
>     at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:397)
>     ... 9 more
> {code}
> org.mortbay.jetty.EofException also appears many times in the logs, as described in MAPREDUCE-5.
> {code}
> 2011-04-20 15:57:20,748 WARN mapred.TaskTracker (TaskTracker.java:doGet(3164)) - getMapOutput(attempt_201104201115_0010_m_000038_0,4) failed :
> org.mortbay.jetty.EofException
>     at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:787)
> {code}
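
The point about field order can be illustrated with a small, self-contained sketch. It does not use Hadoop and does not reproduce the timeout. It only shows that, when fields are read sequentially, the field whose read fails tells you how many bytes actually arrived. The class PartialReadSketch, its methods, and the sample values are hypothetical. Both integer fields are written with writeInt for simplicity, whereas the real code reads a composite taskId and a variable-length int.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

public class PartialReadSketch {

    // Hypothetical stand-in for the event: fields are written in the same
    // order in which readFields consumes them (taskId, idWithinJob, isMap).
    static byte[] encode(int taskId, int idWithinJob, boolean isMap) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(taskId);       // simplification: the real taskId is a composite Writable
        out.writeInt(idWithinJob);  // simplification: the real code uses a variable-length int
        out.writeBoolean(isMap);
        out.flush();
        return buf.toByteArray();
    }

    // Reads the fields in order from only the first `available` bytes and
    // returns the name of the field whose read failed, or "none".
    static String firstFailingField(byte[] data, int available) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data, 0, available));
        String current = "taskId";
        try {
            in.readInt();
            current = "idWithinJob";
            in.readInt();
            current = "isMap";
            in.readBoolean();
            return "none";
        } catch (EOFException e) {
            // The stream ended before `current` could be read in full.
            return current;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] full = encode(2, 38, true);  // arbitrary sample values
        System.out.println(firstFailingField(full, full.length));      // prints "none"
        System.out.println(firstFailingField(full, full.length - 1));  // prints "isMap"
    }
}
```

In this sketch a failure at isMap means that every byte up to and including idWithinJob was delivered and only the following byte was missing. That matches the pattern in the stack trace, where the read blocks in readBoolean after the earlier fields were read.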