hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4148) DiskChecker$DiskErrorException
Date Sat, 20 Sep 2008 21:17:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-4148:
----------------------------------

    Fix Version/s:     (was: 0.17.2)

0.17.2 has already been released.

> DiskChecker$DiskErrorException
> ------------------------------
>
>                 Key: HADOOP-4148
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4148
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.17.2
>         Environment: 2 systems (1 Red Hat, 1 Ubuntu)
>            Reporter: chandravadana
>            Priority: Blocker
>
> Hi,
> 1 Red Hat machine: master (jobtracker + namenode + tasktracker + datanode)
> 1 Ubuntu machine: slave (tasktracker + datanode)
> When I execute
>  bin/hadoop jar word/word.jar org.myorg.WordCount in mn2
> 08/09/10 15:12:56 INFO mapred.FileInputFormat: Total input paths to process : 5
> 08/09/10 15:12:56 INFO mapred.JobClient: Running job: job_200809101511_0003
> 08/09/10 15:12:57 INFO mapred.JobClient:  map 0% reduce 0%
> 08/09/10 15:13:00 INFO mapred.JobClient:  map 20% reduce 0%
> 08/09/10 15:13:01 INFO mapred.JobClient:  map 80% reduce 0%
> 08/09/10 15:13:02 INFO mapred.JobClient:  map 100% reduce 0%
> 08/09/10 15:13:11 INFO mapred.JobClient:  map 100% reduce 13%
> 08/09/10 15:30:41 INFO mapred.JobClient:  map 80% reduce 13%
> 08/09/10 15:30:41 INFO mapred.JobClient: Task Id : task_200809101511_0003_m_000000_0, Status : FAILED
> Too many fetch-failures
> 08/09/10 15:30:42 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=task_200809101511_0003_m_000000_0&filter=stdout
> 08/09/10 15:30:42 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=task_200809101511_0003_m_000000_0&filter=stderr
> 08/09/10 15:30:44 INFO mapred.JobClient:  map 100% reduce 13%
> 08/09/10 15:30:49 INFO mapred.JobClient:  map 100% reduce 20%
> 08/09/10 15:40:52 INFO mapred.JobClient: Task Id : task_200809101511_0003_m_000004_0, Status : FAILED
> Too many fetch-failures
> 08/09/10 15:40:52 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=task_200809101511_0003_m_000004_0&filter=stdout
> 08/09/10 15:40:52 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=task_200809101511_0003_m_000004_0&filter=stderr
> 08/09/10 15:41:03 INFO mapred.JobClient:  map 100% reduce 26%
> The job halts here.
> When I looked at the tasktracker's log, I found:
>  getMapOutput(task_200809101511_0003_m_000004_0,0) failed :
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200809101511_0003/task_200809101511_0003_m_000004_0/output/file.out.index in any of the configured local directories
> 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
> 	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
> 	at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2315)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> 	at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
> 	at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
> 	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
> 	at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
> 	at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
> 	at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
> 	at org.mortbay.http.HttpServer.service(HttpServer.java:954)
> 	at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
> 	at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
> 	at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
> 	at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
> 	at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
> 	at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
> 2008-09-10 15:33:12,915 WARN org.apache.hadoop.mapred.TaskTracker: Unknown child with bad map output: task_200809101511_0003_m_000004_0. Ignored.
> 2008-09-10 15:33:17,425 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:33:23,431 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:33:29,437 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:33:32,439 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:33:38,445 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:33:44,451 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:33:47,454 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:33:53,460 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:33:59,465 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:34:02,469 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:34:08,475 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:34:14,480 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:34:17,484 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:34:23,490 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:34:29,495 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> 2008-09-10 15:34:32,498 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
> The reducer task runs on the master (Red Hat).
> The map task task_200809101511_0003_m_000004_0 named in the log ran on the slave (Ubuntu).
> In the jobtracker's log, I found:
> 2008-09-10 15:35:46,977 INFO org.apache.hadoop.mapred.JobInProgress: Failed fetch notification #2 for task task_200809101511_0003_m_000004_0
> hadoop-site.xml:
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://master:54310/</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>master:54311</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>dfs.replication</name>
>     <value>2</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>hadoop.tmp.dir</name>
>     <value>absolute path</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>mapred.child.java.opts</name>
>     <value>-Xmx512M</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>mapred.speculative.execution</name>
>     <value>false</value>
>     <final>true</final>
>   </property>
> </configuration>
> I don't know where I went wrong.
> Kindly help me solve this.
> Thanks,
> Chandravadana
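
For readers unfamiliar with the error in the stack trace above: the message says that the requested map output index file was not found under any of the tasktracker's configured local directories. The sketch below illustrates that kind of lookup so the same check can be run by hand on the node that ran the map task. It is a minimal sketch, not Hadoop's LocalDirAllocator code; the class name LocalDirLookup and the method findInLocalDirs are hypothetical, and the directory list is supplied by the caller rather than read from any Hadoop configuration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative sketch only; this is NOT Hadoop's LocalDirAllocator.
class LocalDirLookup {

    // Return the first readable copy of relativePath under any of the
    // given local directories, or Optional.empty() if none holds it.
    static Optional<Path> findInLocalDirs(List<String> localDirs, String relativePath) {
        for (String dir : localDirs) {
            Path candidate = Paths.get(dir).resolve(relativePath);
            if (Files.isReadable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: LocalDirLookup <dir1,dir2,...> <relative-path>");
            return;
        }
        // The directory list is whatever the caller passes in (an assumption
        // of this sketch); nothing is read from hadoop-site.xml.
        List<String> localDirs = Arrays.asList(args[0].split(","));
        Optional<Path> hit = findInLocalDirs(localDirs, args[1]);
        if (hit.isPresent()) {
            System.out.println("found: " + hit.get());
        } else {
            System.out.println("Could not find " + args[1]
                    + " in any of the given local directories");
        }
    }
}
```

One possible use is to run it on the slave, passing the tasktracker's local directories as the first argument and the relative path from the exception (taskTracker/jobcache/.../output/file.out.index) as the second, to see whether the file is present on disk at all.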

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

