Date: Thu, 16 Feb 2012 00:25:00 +0000 (UTC)
From: "Tsz Wo (Nicholas), SZE (Commented) (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Message-ID: <338999542.43936.1329351900320.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <172468803.31024.1324393771135.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MAPREDUCE-3583) ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209011#comment-13209011 ]

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-3583:
---------------------------------------------------

Sorry that I thought BigInteger was used for checking overflow.  If the range of stime is expected to be larger than Long.MAX_VALUE, it is okay to use BigInteger for the moment.  We may improve it later on.

> ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
> -----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3583
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.205.0
>         Environment: 64-bit Linux:
> asf011.sp2.ygridcore.net
> Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux
>            Reporter: Zhihong Yu
>            Assignee: Zhihong Yu
>            Priority: Critical
>         Attachments: mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, mapreduce-3583-trunk-v4.txt, mapreduce-3583-trunk-v5.txt, mapreduce-3583-trunk-v6.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, mapreduce-3583.txt
>
>
> HBase PreCommit builds frequently gave us NumberFormatException.
> From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/:
> {code}
> 2011-12-20 01:44:01,180 WARN  [main] mapred.JobClient(784): No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> java.lang.NumberFormatException: For input string: "18446743988060683582"
>     at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>     at java.lang.Long.parseLong(Long.java:422)
>     at java.lang.Long.parseLong(Long.java:468)
>     at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413)
>     at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148)
>     at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401)
>     at org.apache.hadoop.mapred.Task.initialize(Task.java:536)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> From hadoop 0.20.205 source code, looks like ppid was 18446743988060683582, causing NFE:
> {code}
>         // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss)
>         pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)),
> {code}
> You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console:
> {code}
> asf011.sp2.ygridcore.net
> Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 20
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 16382
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 60000
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 2048
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
> 60000
> Running in Jenkins mode
> {code}
> From Nicholas Sze:
> {noformat}
> It looks like that the ppid is a 64-bit positive integer but Java long is signed and so only works with 63-bit positive integers.  In your case,
>   2^64 > 18446743988060683582 > 2^63.
> Therefore, there is a NFE.
> {noformat}
> I propose changing allProcessInfo to Map so that we don't encounter this problem by avoiding parsing large integers.
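
The overflow discussed above can be reproduced in isolation. What follows is a minimal sketch, not code taken from any of the attached patches: the class name StimeParseDemo and its helper methods are hypothetical, and the only assumption is that the offending /proc field arrives as a decimal string. It shows that Long.parseLong rejects the value from the stack trace, that BigInteger accepts it, and that the value lies in the range 2^63 to 2^64 quoted in the description.

```java
import java.math.BigInteger;

// Hypothetical demo class; not part of Hadoop or of the attached patches.
class StimeParseDemo {
    // The value reported in the NumberFormatException above.
    static final String SAMPLE = "18446743988060683582";

    // Returns true if the decimal string fits in a signed 64-bit long.
    static boolean fitsInLong(String field) {
        try {
            Long.parseLong(field);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Parses the field without any 64-bit range limit.
    static BigInteger parseUnbounded(String field) {
        return new BigInteger(field);
    }

    // Returns true if 2^63 <= value < 2^64, i.e. the value needs the full
    // unsigned 64-bit range and cannot be held in a Java long.
    static boolean needsUnsigned64(BigInteger value) {
        BigInteger twoTo63 = BigInteger.ONE.shiftLeft(63);
        BigInteger twoTo64 = BigInteger.ONE.shiftLeft(64);
        return value.compareTo(twoTo63) >= 0 && value.compareTo(twoTo64) < 0;
    }

    public static void main(String[] args) {
        System.out.println("fits in long: " + fitsInLong(SAMPLE));
        BigInteger v = parseUnbounded(SAMPLE);
        System.out.println("parsed by BigInteger: " + v);
        System.out.println("needs unsigned 64-bit range: " + needsUnsigned64(v));
    }
}
```

Whether BigInteger is the right long-term representation is left open in the comment above; the sketch only illustrates why the signed parse fails for this input.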