hadoop-mapreduce-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2656) Map Reduce Tasks are continuously failing, when one among the several harddisks available on the TaskTracker fails.
Date Fri, 09 Sep 2011 04:29:09 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100961#comment-13100961 ]

Arun C Murthy commented on MAPREDUCE-2656:
------------------------------------------

Sorry, I meant MAPREDUCE-2647 vis-a-vis 0.20.205.

My proposal is that we drop this for 0.20.3, which is unlikely to be released now, as far as I can see.

> Map Reduce Tasks are continuously failing, when one among the several harddisks available on the TaskTracker fails.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2656
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2656
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>    Affects Versions: 0.20.2, 0.20.3
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>             Fix For: 0.20.2, 0.20.3
>
>         Attachments: HADOOP-7130.patch, MAPREDUCE-2656.patch
>
>
> 1. Pull out one hard disk from a TaskTracker node (one disk out of 10). Some jobs now fail,
> but processing continues.
> 2. Wait for some time (15 minutes) and pull out one disk from another TaskTracker.
> 3. More jobs now fail, as can be seen in the UI, and processing pauses.
> The exception can be seen in the JobTracker UI for a failed job.
> {code:xml} 
> Error initializing attempt_201010221528_10174_m_000011_0:
> java.io.IOException: Expecting a line not the end of stream
>  at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
>  at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
>  at org.apache.hadoop.util.Shell.run(Shell.java:137)
>  at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
>  at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:385)
>  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:134)
>  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:113)
>  at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:835)
>  at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1790)
>  at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:104)
>  at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1753)
> Error initializing attempt_201010221528_10174_m_000011_1:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201010221528_10174/work
>  at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:454)
>  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:134)
>  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:113)
>  at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:835)
>  at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1790)
>  at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:104)
>  at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1753)
> {code} 
> The TaskTracker log is shown here:
> {code:xml} 
> 2010-10-25 16:36:24,215 ERROR mapred.TaskTracker (TaskTracker.java:offerService(1211)) - Caught exception: java.io.IOException: Expecting a line not the end of stream
>         at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
>         at org.apache.hadoop.util.Shell.run(Shell.java:137)
>         at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
>         at org.apache.hadoop.mapred.TaskTracker.getFreeSpace(TaskTracker.java:1586)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1274)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1106)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1848)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3022)
> 2010-10-25 16:36:24,216 INFO  mapred.TaskTracker (TaskTracker.java:run(1856)) - Lost connection to JobTracker [/192.168.97.1:9001].  Retrying...
> java.lang.Exception: java.io.IOException: Expecting a line not the end of stream
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1212)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1848)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3022)
> Caused by: java.io.IOException: Expecting a line not the end of stream
>         at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
>         at org.apache.hadoop.util.Shell.run(Shell.java:137)
>         at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
>         at org.apache.hadoop.mapred.TaskTracker.getFreeSpace(TaskTracker.java:1586)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1274)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1106)
>         ... 2 more
> 2010-10-25 16:36:29,550 INFO  mapred.TaskTracker (TaskTracker.java:transmitHeartBeat(1256)) - Resending 'status' to '192.168.97.1' with reponseId '18361
> 2010-10-25 16:36:29,550 WARN  mapred.TaskTracker (TaskTracker.java:checkLocalDirs(2982)) - Task Tracker local can not create directory: /hdfsdata/0/mapred/local
> 2010-10-25 16:36:32,656 WARN  mapred.TaskTracker (TaskTracker.java:checkLocalDirs(2982)) - Task Tracker local can not create directory: /hdfsdata/0/mapred/local
> {code} 
> This seems to be fixed in the trunk.
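
The traces in the description suggest that an IOException raised by the free-space check on a single failed disk propagates out of both the local-directory selection and the heartbeat, so one bad disk affects the whole TaskTracker. The following is a minimal sketch of the general remedy (skip a directory whose check fails and continue with the rest). It is not the attached patch and not Hadoop's actual code; the names LocalDirSketch, SpaceProbe, and usableDirs are hypothetical and exist only for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; none of these names come from Hadoop. */
public class LocalDirSketch {

    /** Hypothetical free-space probe, abstracted so a failing disk can be simulated. */
    interface SpaceProbe {
        long availableBytes(File dir) throws IOException;
    }

    /** Returns the directories whose probe succeeds and reports at least neededBytes. */
    static List<File> usableDirs(List<File> dirs, long neededBytes, SpaceProbe probe) {
        List<File> usable = new ArrayList<>();
        for (File dir : dirs) {
            try {
                if (probe.availableBytes(dir) >= neededBytes) {
                    usable.add(dir);
                }
            } catch (IOException e) {
                // A failed disk is skipped instead of aborting the whole scan.
                System.err.println("Skipping " + dir + ": " + e.getMessage());
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        // Hypothetical directory names; "disk0" stands in for the pulled disk.
        List<File> dirs = Arrays.asList(new File("disk0"), new File("disk1"));
        SpaceProbe probe = dir -> {
            if (dir.getName().equals("disk0")) {
                throw new IOException("Expecting a line not the end of stream");
            }
            return 1_000_000L; // pretend the healthy disk has about 1 MB free
        };
        System.out.println(usableDirs(dirs, 1024L, probe));
    }
}
```

With this shape, a caller only fails when the returned list is empty, which corresponds to the "Could not find any valid local directory" case in the second trace.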

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
