hadoop-common-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Reduce Error
Date Thu, 09 Dec 2010 05:05:34 GMT
From Raj earlier:

I have seen this error from time to time, and it has been due either to
space issues, missing directories, or disk errors.

In one case the space issue was caused by the fact that I had mounted
/dev/sdc on /hadoop-dsk and the mount had failed. In another case I had
accidentally deleted hadoop.tmp.dir on a node, and whenever a reduce task
was scheduled on that node the attempt would fail.
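A quick way to rule out those failure modes is to check, on the node that
ran the failing attempt, that every entry in mapred.local.dir exists, is
writable, and has enough free space. A minimal standalone sketch (the class
name and the 1 GB threshold are illustrative choices of mine, not anything
shipped with Hadoop):

import java.io.File;

// Run on each node, passing the mapred.local.dir entries as arguments.
public class LocalDirCheck {
    // Illustrative threshold; use whatever headroom your jobs need.
    private static final long MIN_FREE_BYTES = 1L << 30; // 1 GB

    public static void main(String[] args) {
        for (String path : args) {
            File dir = new File(path);
            if (!dir.isDirectory()) {
                System.out.println(path + ": MISSING (deleted dir or failed mount?)");
            } else if (!dir.canWrite()) {
                System.out.println(path + ": NOT WRITABLE (permission problem)");
            } else if (dir.getUsableSpace() < MIN_FREE_BYTES) {
                System.out.println(path + ": LOW SPACE ("
                        + dir.getUsableSpace() + " bytes free)");
            } else {
                System.out.println(path + ": OK");
            }
        }
    }
}

For example: java LocalDirCheck /home/hadoop/mapred/local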

On Wed, Dec 8, 2010 at 8:21 PM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:

> Raj V wrote:
>
>> Go through the jobtracker, find the relevant node that handled
>> attempt_201012061426_0001_m_000292_0 and figure out
>> if there are FS or permission problems.
>>
>> Raj
>>
>>
>> ________________________________
>> From: Adarsh Sharma <adarsh.sharma@orkash.com>
>> To: common-user@hadoop.apache.org
>> Sent: Wed, December 8, 2010 7:48:47 PM
>> Subject: Re: Reduce Error
>>
>>
>> Ted Yu wrote:
>>
>>> Any chance mapred.local.dir is under /tmp and part of it got cleaned up?
>>>
>>> On Wed, Dec 8, 2010 at 4:17 AM, Adarsh Sharma <adarsh.sharma@orkash.com>
>>> wrote:
>>>
>>>> Dear all,
>>>>
>>>> Did anyone encounter the error below while running a job in Hadoop? It
>>>> occurs during the reduce phase of the job.
>>>>
>>>> attempt_201012061426_0001_m_000292_0:
>>>> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
>>>> any valid local directory for
>>>> taskTracker/jobcache/job_201012061426_0001/attempt_201012061426_0001_m_000292_0/output/file.out
>>>>
>>>> It states that it is not able to locate a file that was created in
>>>> mapred.local.dir.
>>>>
>>>> Thanks in advance for any information regarding this.
>>>>
>>>> Best Regards
>>>>
>>>> Adarsh Sharma
>>>>
>>
>> Hi Ted,
>>
>> My mapred.local.dir is in the /home/hadoop directory. I also checked it
>> in the /hdd2-2 directory, where we have lots of space.
>>
>> Would mapred.map.tasks affect this?
>>
>> I checked with the default and also with 80 maps and 16 reduces, as I
>> have 8 slaves.
>>
>>
>> <property>
>>   <name>mapred.local.dir</name>
>>   <value>/home/hadoop/mapred/local</value>
>>   <description>The local directory where MapReduce stores intermediate
>>   data files. May be a comma-separated list of directories on different
>>   devices in order to spread disk i/o. Directories that do not exist
>>   are ignored.
>>   </description>
>> </property>
>>
>> <property>
>>   <name>mapred.system.dir</name>
>>   <value>/home/hadoop/mapred/system</value>
>>   <description>The shared directory where MapReduce stores control
>>   files.</description>
>> </property>
>>
>> Let me know if you need any further information.
>>
>>
>> Thanks & Regards
>>
>> Adarsh Sharma
>>
>>
> Sir, I read the tasktracker logs several times but was not able to find
> any reason, as they are not very useful. I have attached the tasktracker
> log to this mail; the main portion is listed below.
> 2010-12-06 15:27:04,228 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201012061426_0001_m_000000_1' to tip task_201012061426_0001_m_000000, for tracker 'tracker_ws37-user-lin:127.0.0.1/127.0.0.1:60583'
> 2010-12-06 15:27:04,228 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_201012061426_0001_m_000000
> 2010-12-06 15:27:04,229 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201012061426_0001_m_000000_0' from 'tracker_ws37-user-lin:127.0.0.1/127.0.0.1:60583'
> 2010-12-06 15:27:07,235 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201012061426_0001_m_000328_0: java.io.IOException: Spill failed
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:860)
>   at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
>   at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>   at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:30)
>   at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:19)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201012061426_0001/attempt_201012061426_0001_m_000328_0/output/spill16.out
>   at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>   at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>   at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
>
> 2010-12-06 15:27:07,236 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201012061426_0001_m_000000_1: Error initializing attempt_201012061426_0001_m_000000_1:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201012061426_0001/job.xml
>   at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>   at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>   at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:750)
>   at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1664)
>   at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:97)
>   at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1629)
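For what it's worth, both traces bottom out in the same place:
LocalDirAllocator.getLocalPathForWrite walks the configured mapred.local.dir
entries and throws that DiskErrorException only when no entry can accept a
file of the requested size. Roughly, it behaves like this simplified model
(my own sketch, not the real Hadoop code; the real allocator also
round-robins across directories and caches state):

import java.io.File;
import java.io.IOException;

public class SimpleDirAllocator {
    private final String[] localDirs; // the mapred.local.dir entries

    public SimpleDirAllocator(String[] localDirs) {
        this.localDirs = localDirs;
    }

    // Return a target path in the first directory that exists, is
    // writable, and has room; otherwise fail like the traces above.
    public File getPathForWrite(String relPath, long size) throws IOException {
        for (String dir : localDirs) {
            File candidate = new File(dir);
            if (candidate.isDirectory()
                    && candidate.canWrite()
                    && candidate.getUsableSpace() > size) {
                return new File(candidate, relPath);
            }
        }
        throw new IOException(
            "Could not find any valid local directory for " + relPath);
    }
}

So on the node that ran these attempts, every configured local directory
must have been missing, unwritable, or out of space at that moment, which
lines up with Raj's list above.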
>
> Thanks & Regards
>
> Adarsh Sharma
>
>
