hadoop-common-user mailing list archives

From jason hadoop <jason.had...@gmail.com>
Subject Re: Different Hadoop Home on Slaves?
Date Wed, 25 Feb 2009 00:25:16 GMT
If you manually start the daemons via hadoop-daemon.sh, the parent of the
bin directory containing the hadoop-daemon.sh script will be used as the
root directory of the hadoop installation.
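
You can see this from the way the launch scripts derive their paths;
roughly (a simplified sketch of the derivation, not the literal 0.18
source):

  # hadoop-daemon.sh resolves its own location and treats bin/.. as
  # the hadoop home
  bin=$(dirname "$0")
  export HADOOP_HOME=$(cd "$bin/.."; pwd)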

I believe, but do not know for certain, that the namenode/jobtracker does
not notice the actual file system location of the tasktrackers and
datanodes.

If you adjust the default PATH for each of your users so that the bin
directory of that user's hadoop installation is on it, you could run

  ssh -n -f -l userForHost host hadoop-daemon.sh {start|stop} {tasktracker|datanode}

for each of your hosts. (Note the options must come before the hostname,
or ssh will treat them as part of the remote command.)
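
As a concrete example, a minimal wrapper over a per-host user list might
look like this (the slave-users.txt file and its one-host-and-user-per-line
layout are made up for illustration):

  # start a datanode on every slave as that slave's own user;
  # assumes each user's PATH already includes their hadoop bin directory
  # (-n keeps ssh from consuming the loop's stdin)
  while read host user; do
    ssh -n -f -l "$user" "$host" hadoop-daemon.sh start datanode
  done < slave-users.txt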


On Tue, Feb 24, 2009 at 3:37 PM, Hargraves, Alyssa <alyssa@wpi.edu> wrote:

> Hello Hadoop users,
>
> I have a question about having a different Hadoop Home directory on every
> slave machine. First, some background: right now, to get around the fact
> that we do not have a dedicated hadoop user, I am specifying an SSH config
> file in the SSH_OPTS of hadoop-env.sh. This config file lists a host, user,
> and key for each slave, so SSH knows what key to use and what user to
> connect as. In our case, every slave is likely to have a different user.
> This, unfortunately, is not something I can change, but the SSH config file
> works and allows the server admin to SSH, without a passphrase, into the
> home folder of the individual user on each client.
>
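> For illustration, the relevant pieces look roughly like this (the host
> and key names here are made up):
>
>   # in conf/hadoop-env.sh
>   export HADOOP_SSH_OPTS="-F /home/UserA/hadoop-0.18.2/sshconfig"
>
>   # in the sshconfig file, one entry per slave
>   Host slave1.example.com
>     User UserB
>     IdentityFile /home/UserA/.ssh/slave1_key
>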
> The issue I'm running into is the fact that each client, since it has a
> unique user, also has a unique hadoop_home directory. For example, UserB's
> hadoop_home directory is likely to be /home/UserB/hadoop-0.18.2, whereas
> the server's is /home/UserA/hadoop-0.18.2. This causes a problem because
> when you run bin/start-all.sh, it tries to start each node by SSHing in
> (which works successfully) and then changing to the same directory as the
> server's hadoop home. The specific SSH command the server runs to start
> each node (as defined in slaves.sh) is:
>
>   ssh -q -F /home/userA/hadoop-0.18.2/bin/../sshconfig hostname "cd
>   /home/userA/hadoop-0.18.2/bin/../conf start datanode"
>
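> To make the failure mechanism concrete, bin/slaves.sh effectively does
> something like the following (a simplified sketch, not the exact 0.18.2
> source):
>
>   # $HADOOP_HOME is expanded on the *master* before ssh ever runs,
>   # so every slave is told to cd into the master's path
>   for slave in $(cat "$HADOOP_SLAVES"); do
>     ssh $HADOOP_SSH_OPTS "$slave" \
>       "cd $HADOOP_HOME; bin/hadoop-daemon.sh start datanode" &
>   done
>   wait
>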
> The ultimate question that arises out of this is: is there a way I can get
> it to read the Hadoop_Home from the slave node and then try to start the
> datanode? Right now it uses the server's Hadoop_Home and assumes it is the
> same on the slaves, which it probably never will be. In the above example,
> changing userA to userB would work, but the users will vary. It seems
> strange to me that there's nothing (that I can find) to account for varying
> Hadoop_Home locations on the nodes.
>
> I'd be happy to hear any suggestions that anybody has. Unfortunately, the
> options of having a dedicated Hadoop user or a consistent Hadoop_Home are
> both out of my control.
>
> Thank you for any feedback,
> Alyssa Hargraves
