hadoop-common-user mailing list archives

From Adarsh Sharma <adarsh.sha...@orkash.com>
Subject Re: Reduce Error
Date Thu, 09 Dec 2010 08:55:00 GMT
Ted Yu wrote:
> From Raj earlier:
>
> I have seen this error from time to time, and it has been due to either lack
> of space, missing directories, or disk errors.
>
> The space issue was caused by the fact that I had mounted /de/sdc on
> /hadoop-dsk and the mount had failed. In another case I had accidentally
> deleted hadoop.tmp.dir on a node, and whenever the reduce job was scheduled
> on that node, that attempt would fail.
>
> On Wed, Dec 8, 2010 at 8:21 PM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
>
>   
>> Raj V wrote:
>>
>>     
>>> Go through the jobtracker, find the relevant node that handled
>>> attempt_201012061426_0001_m_000292_0, and figure out
>>> if there are FS or permission problems.
>>>
>>> Raj
>>>
>>>
>>> ________________________________
>>> From: Adarsh Sharma <adarsh.sharma@orkash.com>
>>> To: common-user@hadoop.apache.org
>>> Sent: Wed, December 8, 2010 7:48:47 PM
>>> Subject: Re: Reduce Error
>>>
>>>
>>> Ted Yu wrote:
>>>
>>>
>>>       
>>>> Any chance mapred.local.dir is under /tmp and part of it got cleaned up?
>>>>
>>>> On Wed, Dec 8, 2010 at 4:17 AM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
>>>>
>>>>> Dear all,
>>>>>
>>>>> Has anyone encountered the error below while running a job in Hadoop?
>>>>> It occurs in the reduce phase of the job.
>>>>>
>>>>> attempt_201012061426_0001_m_000292_0:
>>>>> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any
>>>>> valid local directory for
>>>>> taskTracker/jobcache/job_201012061426_0001/attempt_201012061426_0001_m_000292_0/output/file.out
>>>>>
>>>>> It states that it is not able to locate a file that is created in
>>>>> Hadoop's mapred.local.dir.
>>>>>
>>>>> Thanks in advance for any information regarding this.
>>>>>
>>>>> Best Regards
>>>>>
>>>>> Adarsh Sharma
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>         
>>> Hi Ted,
>>>
>>> My mapred.local.dir is in the /home/hadoop directory. I also tried it in
>>> the /hdd2-2 directory, where we have lots of space.
>>>
>>> Would mapred.map.tasks affect this?
>>>
>>> I tried the default and also 80 maps and 16 reduces, as I have 8
>>> slaves.
>>>
>>>
>>> <property>
>>> <name>mapred.local.dir</name>
>>> <value>/home/hadoop/mapred/local</value>
>>> <description>The local directory where MapReduce stores intermediate
>>> data files.  May be a comma-separated list of directories on different
>>> devices in order to spread disk i/o.
>>> Directories that do not exist are ignored.
>>> </description>
>>> </property>
>>>
>>> <property>
>>> <name>mapred.system.dir</name>
>>> <value>/home/hadoop/mapred/system</value>
>>> <description>The shared directory where MapReduce stores control files.
>>> </description>
>>> </property>
>>>
>>> Let me know if you want any further information.
>>>
>>>
>>> Thanks & Regards
>>>
>>> Adarsh Sharma
>>>
>>>
>>>       
>> Sir, I read the tasktracker logs several times but was not able to find any
>> reason, as they are not very useful. I have attached the tasktracker log to
>> this mail; the main portion is listed below.
>> 2010-12-06 15:27:04,228 INFO org.apache.hadoop.mapred.JobTracker: Adding
>> task 'attempt_201012061426_0001_m_000000_1' to tip
>> task_201012061426_0001_m_000000, for tracker 'tracker_ws37-user-lin:
>> 127.0.0.1/127.0.0.1:60583'
>> 2010-12-06 15:27:04,228 INFO org.apache.hadoop.mapred.JobInProgress:
>> Choosing rack-local task task_201012061426_0001_m_000000
>> 2010-12-06 15:27:04,229 INFO org.apache.hadoop.mapred.JobTracker: Removed
>> completed task 'attempt_201012061426_0001_m_000000_0' from
>> 'tracker_ws37-user-lin:127.0.0.1/127.0.0.1:60583'
>> 2010-12-06 15:27:07,235 INFO org.apache.hadoop.mapred.TaskInProgress: Error
>> from attempt_201012061426_0001_m_000328_0: java.io.IOException: Spill failed
>>   at
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:860)
>>   at
>> org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
>>   at
>> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>   at
>> org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:30)
>>   at
>> org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:19)
>>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not
>> find any valid local directory for
>> taskTracker/jobcache/job_201012061426_0001/attempt_201012061426_0001_m_000328_0/output/spill16.out
>>   at
>> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>>   at
>> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>>   at
>> org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
>>   at
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
>>   at
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
>>   at
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
>>
>> 2010-12-06 15:27:07,236 INFO org.apache.hadoop.mapred.TaskInProgress: Error
>> from attempt_201012061426_0001_m_000000_1: Error initializing
>> attempt_201012061426_0001_m_000000_1:
>> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any
>> valid local directory for taskTracker/jobcache/job_201012061426_0001/job.xml
>>   at
>> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>>   at
>> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>>   at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:750)
>>   at
>> org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1664)
>>   at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:97)
>>   at
>> org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1629)
>>
>> Thanks & Regards
>>
>> Adarsh Sharma
>>
>>
>>     
Thanks to all for your replies. I fixed this issue by setting the property
below to /hdd-1/tmp.

The error occurred because there was too little space in the
mapred.child.tmp directory.

<property>
  <name>mapred.child.tmp</name>
  <value>./tmp</value>
  <description> To set the value of tmp directory for map and reduce tasks.
  If the value is an absolute path, it is directly assigned. Otherwise, it is
  prepended with task's working directory. The java tasks are executed with
  option -Djava.io.tmpdir='the absolute path of the tmp dir'. Pipes and
  streaming are set with environment variable,
   TMPDIR='the absolute path of the tmp dir'
  </description>
</property>
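
The block above shows the property with its default value of ./tmp. A minimal sketch of the override to /hdd-1/tmp might look like the following. It assumes the change is made in conf/mapred-site.xml (the message does not say which file was edited) and that /hdd-1/tmp already exists and is writable by the user running the tasks.

```xml
<!-- Sketch only: file location (conf/mapred-site.xml) is an assumption;
     /hdd-1/tmp is the path given in the message. -->
<property>
  <name>mapred.child.tmp</name>
  <value>/hdd-1/tmp</value>
</property>
```

Because the value is an absolute path, it is used directly rather than being resolved against each task's working directory, so every task on the node would share this temporary directory.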


With Best Regards
Adarsh Sharma


