hadoop-mapreduce-dev mailing list archives

From Kihwal Lee <kih...@yahoo-inc.com>
Subject Re: division by zero in getLocalPathForWrite()
Date Tue, 30 Oct 2012 16:29:04 GMT
Ted,

I couldn't reproduce it by just running the test case. When you reproduce
it, look at the stderr/stdout files somewhere under
target/org.apache.hadoop.mapred.MiniMRCluster. Look for the ones under the
directory whose name contains the app id.

I did run into a similar problem and the stderr said:
/bin/bash: /bin/java: No such file or directory

It was because JAVA_HOME was not set. But in that case the exit code was
127 (the shell was not able to locate the command to exec). In the Hudson
job, the exit code was 1, so I think it's something else.
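
To illustrate the exit code distinction, here is a minimal sketch. The
class name, the helper, and the path /no-such-java-home are made up for
illustration, and /bin/sh is assumed to exist. A shell that cannot find the
command it is asked to run exits with 127, while a command that starts and
then fails reports its own status, such as 1.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Hadoop: runs a command line through a
// shell and returns the exit code the shell reports.
class ExitCodeSketch {
    static int exitCodeOf(String shell, String commandLine) {
        try {
            Process p = new ProcessBuilder(shell, "-c", commandLine)
                    .inheritIO() // let the shell's error message reach our stderr
                    .start();
            return p.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Mimics an empty JAVA_HOME: the launcher path does not exist,
        // so the shell itself fails to find the command (127).
        System.out.println(exitCodeOf("/bin/sh", "/no-such-java-home/bin/java -version"));
        // A command that runs and then fails on its own terms (1).
        System.out.println(exitCodeOf("/bin/sh", "exit 1"));
    }
}
```

So an exit code of 1 suggests the command was found and started, and then
failed for some other reason.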

Kihwal

On 10/29/12 11:56 PM, "Ted Yu" <yuzhihong@gmail.com> wrote:

>TestRowCounter still fails:
>https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/244/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterNoColumn/
>
>but there was no 'divide by zero' exception.
>
>Cheers
>
>On Thu, Oct 25, 2012 at 8:04 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> I will try 2.0.2-alpha release.
>>
>> Cheers
>>
>>
>> On Thu, Oct 25, 2012 at 7:54 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>>> Thanks for the quick response, Robert.
>>> Here is the hadoop version being used:
>>>     <hadoop-two.version>2.0.1-alpha</hadoop-two.version>
>>>
>>> If there is a newer release, I am willing to try that before filing a JIRA.
>>>
>>>
>>> On Thu, Oct 25, 2012 at 7:07 AM, Robert Evans <evans@yahoo-inc.com> wrote:
>>>
>>>> It looks like you are running with an older version of 2.0, though
>>>> that does not make much of a difference in this case. The issue
>>>> shows up when getLocalPathForWrite thinks there is no space to write
>>>> to on any of the disks it has configured. This could be because you
>>>> do not have any directories configured. I really don't know for sure
>>>> exactly what is happening. It might be disk fail in place removing
>>>> disks for you because of other issues. Either way, we should file a
>>>> JIRA against Hadoop so that we never get the / by zero error and so
>>>> that there is a better way to handle the possible causes.
>>>>
>>>> --Bobby Evans
>>>>
>>>> On 10/24/12 11:54 PM, "Ted Yu" <yuzhihong@gmail.com> wrote:
>>>>
>>>> >Hi,
>>>> >HBase has a Jenkins build against Hadoop 2.0.
>>>> >I was checking why TestRowCounter sometimes fails:
>>>> >
>>>> >https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/231/testReport/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterExclusiveColumn/
>>>> >
>>>> >I think the following could be the cause:
>>>> >
>>>> >2012-10-22 23:46:32,571 WARN  [AsyncDispatcher event handler]
>>>> >resourcemanager.RMAuditLogger(255): USER=jenkins
>>>> >OPERATION=Application Finished - Failed      TARGET=RMAppManager
>>>> >RESULT=FAILURE      DESCRIPTION=App failed with state: FAILED
>>>> >PERMISSIONS=Application application_1350949562159_0002 failed 1
>>>> >times due to AM Container for
>>>> >appattempt_1350949562159_0002_000001 exited with  exitCode: -1000
>>>> >due to: java.lang.ArithmeticException: / by zero
>>>> >       at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:355)
>>>> >       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
>>>> >       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
>>>> >       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
>>>> >       at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:257)
>>>> >       at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:849)
>>>> >
>>>> >However, I don't seem to find where in getLocalPathForWrite()
>>>> >division by zero could have arisen.
>>>> >
>>>> >Comment / hint is welcome.
>>>> >
>>>> >Thanks
>>>>
>>>>
>>>
>>
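
As a rough illustration of the failure mode Bobby describes in the quoted
thread, the sketch below picks a directory weighted by free space. It is
not the LocalDirAllocator source: the class and method names are
hypothetical, and the selection logic is an assumption chosen only to show
how "no usable space on any configured directory" can turn into a
"/ by zero" rather than a descriptive error.

```java
import java.util.Random;

// Hypothetical sketch, not Hadoop code.
class LocalDirPickSketch {
    static long sum(long[] availableBytes) {
        long total = 0;
        for (long a : availableBytes) {
            total += a;
        }
        return total;
    }

    // Unguarded: when no directory is configured, or none has free space,
    // total is 0 and the % below throws ArithmeticException("/ by zero").
    static int pickUnguarded(long[] availableBytes, Random r) {
        long total = sum(availableBytes);
        long pos = (r.nextLong() >>> 1) % total; // >>> 1 keeps the value non-negative
        int i = 0;
        while (pos >= availableBytes[i]) {
            pos -= availableBytes[i];
            i++;
        }
        return i;
    }

    // Guarded: reports the real condition instead of an arithmetic error.
    static int pickGuarded(long[] availableBytes, Random r) {
        if (sum(availableBytes) <= 0) {
            throw new IllegalStateException(
                "No local directory with free space ("
                + availableBytes.length + " configured)");
        }
        return pickUnguarded(availableBytes, r);
    }

    public static void main(String[] args) {
        Random r = new Random();
        // Only the second directory has space, so index 1 is always chosen.
        System.out.println(pickGuarded(new long[] {0, 5}, r));
        try {
            pickUnguarded(new long[0], r);
        } catch (ArithmeticException e) {
            System.out.println("unguarded: " + e.getMessage());
        }
    }
}
```

A guard of this kind is one way the proposed JIRA could surface the
underlying cause (no directories configured, or all of them removed or
full) instead of the arithmetic exception.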

