hadoop-mapreduce-user mailing list archives

From Steve Sonnenberg <steveis...@gmail.com>
Subject Re: Fail to start mapreduce tasks across nodes
Date Mon, 23 Jul 2012 18:19:15 GMT
I will try this.

For the HDFS side:
The M/R Admin web UI on port 50030 (from the Apache example) shows 2 nodes
registered, but all of the jobs show as completed on only one of the nodes.
I will package up a set of clean logs.

Thanks
-s

On Mon, Jul 23, 2012 at 2:08 PM, Harsh J <harsh@cloudera.com> wrote:

> Steve,
>
> If you're going to use NFS, make sure your "hadoop.tmp.dir" property
> points to the NFS mount point. Can you change that property,
> restart the cluster, and retry?
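(For reference, a core-site.xml fragment along those lines; the mount path /mnt/nfs/hadoop below is only a placeholder for whatever your actual NFS mount point is:

```xml
<!-- core-site.xml: point hadoop.tmp.dir at a path that resolves to the
     same NFS mount on every node; /mnt/nfs/hadoop is a placeholder -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/nfs/hadoop/tmp</value>
</property>
```

With file:/// as the default filesystem, the job files under ${hadoop.tmp.dir}/mapred/system are only reachable by every TaskTracker if that path is a shared mount.)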
>
> Regarding the HDFS issue, it's hard to tell without logs. Did you see
> two nodes alive in the Web UI after configuring HDFS for two nodes and
> configuring MR to use HDFS?
>
> On Mon, Jul 23, 2012 at 11:23 PM, Steve Sonnenberg <steveisoft@gmail.com>
> wrote:
> > Thanks Harsh,
> >
> > 1) I was using NFS
> > 2) I don't believe that anything under /tmp is distributed, even when running
> > 3) When I use HDFS, it doesn't attempt to send ANY jobs to my second node
> >
> > Any clues?
> >
> > -steve
> >
> >
> > On Fri, Jul 20, 2012 at 11:52 PM, Harsh J <harsh@cloudera.com> wrote:
> >>
> >> A 2-node cluster is a fully-distributed cluster and cannot use a
> >> file:/// FileSystem, as that's not a distributed filesystem (unless
> >> it's an NFS mount). This explains why some of your tasks aren't able
> >> to locate an earlier-written file under /tmp that's probably
> >> available on the JT node alone, not the TT nodes.
> >>
> >> Use hdfs:// FS for fully-distributed operation.
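(A minimal core-site.xml fragment for that; the NameNode hostname "master" and port 9000 below are placeholders, not values from this thread:

```xml
<!-- core-site.xml: make HDFS the default filesystem so job files are
     visible to all TaskTrackers; "master:9000" is a placeholder for
     your NameNode host and port -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
</property>
```

After changing this, the NameNode/DataNodes must be running and the cluster restarted before resubmitting the job.)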
> >>
> >> On Fri, Jul 20, 2012 at 10:06 PM, Steve Sonnenberg <steveisoft@gmail.com>
> >> wrote:
> >> > I have a 2-node Fedora system and in cluster mode, I have the
> >> > following issue that I can't resolve.
> >> >
> >> > Hadoop 1.0.3
> >> > I'm running with the file:/// filesystem and invoking the simple
> >> > 'grep' example:
> >> >
> >> > hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir simple-pattern
> >> >
> >> > The initiator displays
> >> >
> >> > Error initializing attempt_201207201103_0003_m_000004_0:
> >> >    java.io.FileNotFoundException: File
> >> > file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken
> >> > does not exist.
> >> >      getFileStatus(RawLocalFileSystem.java)
> >> >      localizeJobTokenFile(TaskTracker.java:4268)
> >> >      initializeJob(TaskTracker.java:1177)
> >> >      localizeJob
> >> >      run
> >> >
> >> > The /tmp/hadoop-hadoop/mapred/system directory only contains a
> >> > 'jobtracker.info' file (on all systems)
> >> >
> >> > On the target system, in the tasktracker log file, I get the following:
> >> >
> >> > 2012-07-20 11:35:59,954 DEBUG org.apache.hadoop.mapred.TaskTracker: Got heartbeatResponse from JobTracker with responseId: 641 and 1 actions
> >> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201207201103_0003_m_000006_0 task's state:UNASSIGNED
> >> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201207201103_0003_m_000006_0 which needs 1 slots
> >> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201207201103_0003_m_000006_0 which needs 1 slots
> >> > 2012-07-20 11:35:59,955 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_201207201103_0003_m_000006_0:
> >> > java.io.FileNotFoundException: File file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does not exist.
> >> >         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
> >> >         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> >> >         at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4268)
> >> >         at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1177)
> >> >         at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
> >> >         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
> >> >         at java.lang.Thread.run(Thread.java:636)
> >> >
> >> > 2012-07-20 11:35:59,955 ERROR org.apache.hadoop.mapred.TaskStatus: Trying to set finish time for task attempt_201207201103_0003_m_000006_0 when no start time is set, stackTrace is : java.lang.Exception
> >> >         at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
> >> >         at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3142)
> >> >         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2440)
> >> >         at java.lang.Thread.run(Thread.java:636)
> >> >
> >> > On both systems, all files and directories under /tmp/hadoop-hadoop
> >> > are owned by the user/group hadoop/hadoop.
> >> >
> >> >
> >> > Any ideas?
> >> >
> >> > Thanks
> >> >
> >> >
> >> > --
> >> > Steve Sonnenberg
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
> >
> >
> > --
> > Steve Sonnenberg
> >
>
>
>
> --
> Harsh J
>



-- 
Steve Sonnenberg
