Return-Path: X-Original-To: apmail-hadoop-mapreduce-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-mapreduce-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id EB4D99AEB for ; Wed, 29 Feb 2012 05:48:24 +0000 (UTC) Received: (qmail 23554 invoked by uid 500); 29 Feb 2012 05:48:24 -0000 Delivered-To: apmail-hadoop-mapreduce-issues-archive@hadoop.apache.org Received: (qmail 23382 invoked by uid 500); 29 Feb 2012 05:48:23 -0000 Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: mapreduce-issues@hadoop.apache.org Delivered-To: mailing list mapreduce-issues@hadoop.apache.org Received: (qmail 23329 invoked by uid 99); 29 Feb 2012 05:48:22 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 29 Feb 2012 05:48:22 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED,T_RP_MATCHES_RCVD X-Spam-Check-By: apache.org Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 29 Feb 2012 05:48:19 +0000 Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id 5CF5854DAE for ; Wed, 29 Feb 2012 05:47:58 +0000 (UTC) Date: Wed, 29 Feb 2012 05:47:58 +0000 (UTC) From: "Hudson (Commented) (JIRA)" To: mapreduce-issues@hadoop.apache.org Message-ID: <479041115.1555.1330494478382.JavaMail.tomcat@hel.zones.apache.org> In-Reply-To: <1868269021.25190.1330381909043.JavaMail.tomcat@hel.zones.apache.org> Subject: [jira] [Commented] (MAPREDUCE-3931) MR tasks failing due to changing timestamps on Resources to download MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/MAPREDUCE-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218931#comment-13218931 ] Hudson commented on MAPREDUCE-3931: ----------------------------------- Integrated in Hadoop-Mapreduce-trunk-Commit #1810 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1810/]) MAPREDUCE-3931. Changed PB implementation of LocalResource to take locks so that race conditions don't fail tasks by inadvertantly changing the timestamps. Contributed by Siddarth Seth. (Revision 1294970) Result = ABORTED vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1294970 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/LocalResourcePBImpl.java > MR tasks failing due to changing timestamps on Resources to download > -------------------------------------------------------------------- > > Key: MAPREDUCE-3931 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3931 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 > Affects Versions: 0.23.0 > Reporter: Vinod Kumar Vavilapalli > Assignee: Siddharth Seth > Fix For: 0.23.2 > > Attachments: MR3931.txt > > > [~karams] reported this offline. Seems that tasks are randomly failing during gridmix runs: > {code} > 2012-02-24 21:03:34,912 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1330116323296_0140_m_003868_0: RemoteTrace: > java.io.IOException: Resource hdfs://hostname.com:8020/user/hadoop15/.staging/job_1330116323296_0140/job.jar changed on src filesystem (expected 2971811411, was 1330116705875 > at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:90) > at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:49) > at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:157) > at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:155) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:153) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:619) > at LocalTrace: > org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Resource hdfs://hostname.com:8020/user/hadoop15/.staging/job_1330116323296_0140/job.jar changed on src filesystem (expected 2971811411, was 1330116705875 > at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217) > at org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147) > at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:827) > at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:497) > at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:222) > at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46) > at org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57) > at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Server.call(ProtoOverHadoopRpcEngine.java:342) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1493) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1489) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1487) > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira