hadoop-common-user mailing list archives

From Mac Yuki <mcmcay...@gmail.com>
Subject Re: Cluster configuration issue
Date Thu, 16 Jul 2009 15:47:15 GMT
Just reporting back that after patching my cluster (a base install of 0.19.1
with the single patch listed below) and using it heavily for more than a
day, the patch seems to have done the trick.  Every node still has all of
its map slots available.

For anyone who has only used the released versions of Hadoop -- i.e., people
like me who use Hadoop but aren't actively working on its development and so
haven't dug into the developer documentation -- it is much simpler to patch
than I had imagined.  There are just a few simple steps.  They are probably
listed somewhere, but I couldn't find a single page for non-developers that
laid them out this way:

1.  Get the patch file.  In my case it was
https://issues.apache.org/jira/secure/attachment/12400587/patch-5269-0.19-0.20.txt.
Place it in the base directory of your Hadoop install (the one that has the
directories src/, bin/, etc.).
2.  From the base directory of your Hadoop install, type "patch -p0 <
name_of_patch_file", where in my case name_of_patch_file was
patch-5269-0.19-0.20.txt (the full sequence of commands is sketched after
step 5).
3.  Repeat steps 1 and 2 if you want to apply more patches.  Be careful: the
order in which you apply them might matter.
4.  Type "ant tar" from the base directory.
5.  You will now have a build/ directory, and inside it will be a file named
"hadoop-0.19.2-dev.tar.gz" (if you are patching 0.19.1; otherwise it will be
named accordingly).  If you installed Hadoop by downloading a tarball, treat
this new tarball as you did the one you originally downloaded to install it.
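
Putting the steps together, the sequence looks roughly like this (a sketch
only -- the install path is a placeholder, and you can just as well download
the patch in a browser instead of using wget):

  # go to the base of the Hadoop install (the directory with src/, bin/, etc.)
  cd /path/to/hadoop-0.19.1

  # step 1: fetch the patch into the base directory
  wget https://issues.apache.org/jira/secure/attachment/12400587/patch-5269-0.19-0.20.txt

  # step 2: apply it against the source tree
  patch -p0 < patch-5269-0.19-0.20.txt

  # step 4: build a new distributable tarball
  ant tar

  # step 5: the rebuilt tarball is under build/
  ls build/hadoop-0.19.2-dev.tar.gz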

Thanks for your help, Tamir!

Mac


On Tue, Jul 14, 2009 at 10:24 PM, Tamir Kamara <tamirkamara@gmail.com> wrote:

> Hi,
>
> In order to patch, use a command like this: "patch -p0 < patchfile", and
> then build with something like "ant" or "ant tar" (which will make it easy
> to distribute to the nodes).
>
> However, I think that version 0.19.2 is close to being released, so if you
> can wait it would be easier and you'll get a few more fixes. Maybe someone
> else can comment on when it will be released...
>
> Tamir
>
> On Wed, Jul 15, 2009 at 3:24 AM, Mac Yuki <mcmcayuki@gmail.com> wrote:
>
> > Thanks for the help!  That bug looks like the one I'm running into, and
> > the code in my src/ directory matches the older one in the patch's svn
> > diff.  Here's a beginner question: how do I apply just that patch to my
> > installation?  Thanks. Mac
> >
> >
> > On Tue, Jul 14, 2009 at 10:03 AM, Tamir Kamara <tamirkamara@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > To restart Hadoop on a specific node, use a command like this:
> > > "hadoop-daemon.sh stop tasktracker" and after that the same command
> > > with start. You can also do the same with the datanode, but it doesn't
> > > look like there's a problem there.
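> > > For example, on the affected slave (assuming the Hadoop bin/ directory
> > > is on your PATH):
> > >
> > >   hadoop-daemon.sh stop tasktracker
> > >   hadoop-daemon.sh start tasktracker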
> > >
> > > I had the same problem with missing slots, but I don't remember if it
> > > happened on maps. The fix in my case was this patch:
> > > https://issues.apache.org/jira/browse/HADOOP-5269
> > >
> > >
> > > Tamir
> > >
> > > On Tue, Jul 14, 2009 at 7:25 PM, KTM <mcmcayuki@gmail.com> wrote:
> > >
> > > > Hi, I'm running Hadoop 0.19.1 on a cluster with 8 machines, 7 of which
> > > > are used as slaves and the other the master, each with 2 dual-core AMD
> > > > CPUs and generous amounts of RAM.  I am running map-only jobs and have
> > > > the slaves set up to have 4 mappers each, for a total of 28 available
> > > > mappers.  When I first start up my cluster, I am able to use all 28
> > > > mappers.  However, after a short bit of time (~12 hours), jobs that I
> > > > submit start using fewer mappers.  I restarted my cluster last night,
> > > > and currently only 19 mappers are running tasks even though more tasks
> > > > are pending, with at least 2 tasks running per machine - so no machine
> > > > has gone down.  I have checked that the unused cores are actually
> > > > sitting idle.  Any ideas for why this is happening?  Is there a way to
> > > > restart Hadoop on the individual slaves?  Thanks! Mac
> > > >
> > >
> >
>
