hadoop-common-user mailing list archives

From: Rob Collins <Rob.Coll...@clickforensics.com>
Subject: What did I do wrong? (Too many fetch-failures)
Date: Thu, 12 Jun 2008 21:04:17 GMT
In a previous life, I had no problems setting up a small cluster. Now I have managed to mess
it up. I see reports of similar symptoms on this mailing list, but I cannot find any solutions
to the problem. I am working with 0.16.4.

Any suggestions you can provide would be appreciated.

Thanks,
Rob

Here is what I am seeing.

[root@NO01 ~]# sudo -u hadoop /hadoop/hadoop/bin/hadoop jar /hadoop/hadoop/hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
08/06/12 14:50:35 INFO mapred.FileInputFormat: Total input paths to process : 1
08/06/12 14:50:35 INFO mapred.JobClient: Running job: job_200806121449_0001
08/06/12 14:50:36 INFO mapred.JobClient:  map 0% reduce 0%
08/06/12 14:50:39 INFO mapred.JobClient:  map 54% reduce 0%
08/06/12 14:50:40 INFO mapred.JobClient:  map 72% reduce 0%
08/06/12 14:50:41 INFO mapred.JobClient:  map 100% reduce 0%
08/06/12 14:50:51 INFO mapred.JobClient:  map 100% reduce 2%
08/06/12 14:53:20 INFO mapred.JobClient:  map 100% reduce 3%
08/06/12 14:55:50 INFO mapred.JobClient:  map 100% reduce 4%
08/06/12 14:58:14 INFO mapred.JobClient:  map 100% reduce 5%
08/06/12 14:58:20 INFO mapred.JobClient:  map 100% reduce 6%
08/06/12 15:05:54 INFO mapred.JobClient:  map 100% reduce 7%
08/06/12 15:05:56 INFO mapred.JobClient:  map 100% reduce 10%
08/06/12 15:13:26 INFO mapred.JobClient:  map 100% reduce 11%
08/06/12 15:23:36 INFO mapred.JobClient:  map 90% reduce 11%
08/06/12 15:23:36 INFO mapred.JobClient: Task Id : task_200806121449_0001_m_000005_0, Status : FAILED
Too many fetch-failures
08/06/12 15:23:36 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:23:36 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:23:38 INFO mapred.JobClient:  map 100% reduce 11%
08/06/12 15:31:07 INFO mapred.JobClient:  map 90% reduce 11%
08/06/12 15:31:07 INFO mapred.JobClient: Task Id : task_200806121449_0001_m_000008_0, Status : FAILED
Too many fetch-failures
08/06/12 15:31:07 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:31:07 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:31:08 INFO mapred.JobClient:  map 100% reduce 11%
08/06/12 15:33:37 INFO mapred.JobClient: Task Id : task_200806121449_0001_m_000001_0, Status : FAILED
Too many fetch-failures
08/06/12 15:33:37 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:33:37 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:36:12 INFO mapred.JobClient:  map 100% reduce 12%
08/06/12 15:38:38 INFO mapred.JobClient: Task Id : task_200806121449_0001_m_000002_0, Status : FAILED
Too many fetch-failures
08/06/12 15:38:38 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:38:38 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:38:40 INFO mapred.JobClient: Task Id : task_200806121449_0001_m_000004_0, Status : FAILED
Too many fetch-failures
08/06/12 15:38:40 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:38:40 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:38:43 INFO mapred.JobClient:  map 100% reduce 13%
08/06/12 15:41:14 INFO mapred.JobClient:  map 90% reduce 13%
08/06/12 15:41:14 INFO mapred.JobClient: Task Id : task_200806121449_0001_m_000010_0, Status : FAILED
Too many fetch-failures
08/06/12 15:41:14 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:41:14 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:41:15 INFO mapred.JobClient:  map 100% reduce 13%
08/06/12 15:43:43 INFO mapred.JobClient:  map 100% reduce 15%
08/06/12 15:46:13 INFO mapred.JobClient: Task Id : task_200806121449_0001_m_000006_0, Status : FAILED
Too many fetch-failures
08/06/12 15:46:13 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:46:13 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:48:49 INFO mapred.JobClient: Task Id : task_200806121449_0001_m_000009_0, Status : FAILED
Too many fetch-failures
08/06/12 15:48:49 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:48:49 WARN mapred.JobClient: Error reading task outputConnection refused
08/06/12 15:53:46 INFO mapred.JobClient:  map 100% reduce 16%
08/06/12 15:56:20 INFO mapred.JobClient:  map 100% reduce 17%

2008-06-12 15:55:19,975 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:55:25,981 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:55:28,983 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:55:34,989 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:55:40,995 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:55:43,997 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:55:50,003 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:55:56,007 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:55:59,010 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:56:05,016 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:56:11,021 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:56:14,024 INFO org.apache.hadoop.mapred.TaskTracker: task_200806121449_0001_r_000002_0 0.21212122% reduce > copy (7 of 11 at 0.00 MB/s) >
2008-06-12 15:56:15,789 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(task_200806121449_0001_m_000006_1,2) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find task_200806121449_0001_m_000006_1/file.out.index in any of the configured local directories
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2253)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
        at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
        at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
        at org.mortbay.http.HttpServer.service(HttpServer.java:954)
        at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
        at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
        at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
        at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
        at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
        at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
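
If it would help with diagnosis, I can run a quick check like the one below on each slave to see
whether the map output ever shows up under mapred.local.dir (which is just /tmp in my config
further down). This is only a rough sketch; the task attempt ID is copied from the stack trace
above, and since I am not sure of the exact directory layout the TaskTracker uses, I just search
for the file names.

  # Run on each TaskTracker node. /tmp is my mapred.local.dir (see hadoop-site.xml below).
  # Looks for the intermediate map output files of the attempt named in the stack trace.
  find /tmp -type f \( -name 'file.out' -o -name 'file.out.index' \) 2>/dev/null \
    | grep task_200806121449_0001_m_000006_1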

[root@NO01 ~]# cat /hadoop/hadoop/conf/hadoop-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/hadoop/hadoop/tmp/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/mnt/dsk1/dfs,/mnt/dsk2/dfs,/mnt/dsk3/dfs,/mnt/dsk4/dfs</value>
  <description>Determines where on the local filesystem a DFS data node
  should store its blocks.  If this is a comma-delimited
  list of directories, then data will be stored in all named
  directories, typically on different devices.
  Directories that do not exist are ignored.
  </description>
</property>

<property>
  <name>dfs.block.size</name>
  <value>67108864</value>
  <description>The default block size for new files.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>192.168.1.94:54310</value>
</property>


<property>
  <name>mapred.job.tracker</name>
  <value>192.168.1.94:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

<property>
  <name>mapred.local.dir</name>
  <!--value>/mnt/dsk1/local,/mnt/dsk2/local,/mnt/dsk3/local,/mnt/dsk4/local,/tmp</value-->
  <value>/tmp</value>
  <description>The local directory where MapReduce stores intermediate
  data files.  May be a comma-separated list of
  directories on different devices in order to spread disk i/o.
  Directories that do not exist are ignored.
  </description>
</property>

<property>
  <name>mapred.map.tasks</name>
  <value>11</value>
  <description>The default number of map tasks per job.  Typically set
  to a prime several times greater than the number of available hosts.
  Ignored when mapred.job.tracker is "local".
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>3</value>
  <description>The default number of reduce tasks per job.  Typically set
  to a prime close to the number of available hosts.  Ignored when
  mapred.job.tracker is "local".
  </description>
</property>

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
  <description>The maximum number of map tasks that will be run
  simultaneously by a task tracker.
  </description>
</property>

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>8</value>
  <description>The maximum number of reduce tasks that will be run
  simultaneously by a task tracker.
  </description>
</property>

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1800m</value>
  <description>Java opts for the task tracker child processes.
  The following symbol, if present, will be interpolated: @taskid@ is replaced
  by the current TaskID. Any other occurrences of '@' will go unchanged.
  For example, to enable verbose gc logging to a file named for the taskid in
  /tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
        -Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
  </description>

</property>

<property>
  <name>mapred.speculative.execution</name>
  <value>false</value>
  <description>If true, then multiple instances of some map and reduce tasks
              may be executed in parallel.
  </description>
</property>


</configuration>
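
The repeated "Connection refused" messages also make me wonder whether the nodes can reach each
other's TaskTracker HTTP port at all. If it is useful I can test that from every node with
something like the sketch below; 50060 is, as far as I know, the default TaskTracker HTTP port
(I have not overridden it), and slave01/slave02 are just placeholders for the hosts in my slaves
file.

  # Rough reachability check, run from each node in the cluster.
  # Reports whether the TaskTracker HTTP port on the other nodes accepts connections.
  for h in slave01 slave02; do
    if nc -z -w 3 "$h" 50060; then echo "$h: reachable"; else echo "$h: refused/unreachable"; fi
  done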
