hadoop-mapreduce-user mailing list archives

From Alexander Striffeler <a.striffe...@students.unibe.ch>
Subject Re: hadoop installation in pseudo distributed mode regular user vs dedicated user
Date Thu, 06 Aug 2015 06:58:45 GMT
Hi Arvind

I can't fully answer your questions on how to install Hadoop in 
pseudo-distributed mode, but I can address the cons you listed: by 
using sudo su <user> in your shell, you can easily switch users during a 
session. Giving the Hadoop user access to your directories should then 
take two minutes at most.
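
As a rough sketch of what I mean, the script below prints (and with APPLY=1 runs) the commands I would try. Every name in it is an assumption on my side and not taken from your setup: hduser as the dedicated account, ~/hadoop-work as the shared directory, /usr/local/hadoop as the install location. It also assumes the acl tools are installed and that the HDFS daemons were started by the dedicated user, so that user may create and chown directories in HDFS.

```shell
#!/bin/sh
# Sketch only. All names below are assumptions, not taken from this thread.
HADOOP_USER="${HADOOP_USER:-hduser}"            # hypothetical dedicated account
REGULAR_USER="${REGULAR_USER:-$(id -un)}"       # your day-to-day account
SHARED_DIR="${SHARED_DIR:-$HOME/hadoop-work}"   # hypothetical directory for jars and data
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}" # hypothetical install location

# Print each command by default; set APPLY=1 to really run it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Let the Hadoop user read and write a directory owned by the regular user.
grant_local_access() {
    run mkdir -p "$SHARED_DIR"
    # ACL entry for the existing contents; X adds execute only on
    # directories and on files that are already executable.
    run setfacl -R -m "u:${HADOOP_USER}:rwX" "$SHARED_DIR"
    # Default ACL so files created here later inherit the same entry.
    run setfacl -d -m "u:${HADOOP_USER}:rwX" "$SHARED_DIR"
}

# Give the regular user an HDFS home directory so jobs can be submitted
# without switching accounts.
grant_hdfs_access() {
    run sudo -u "$HADOOP_USER" "$HADOOP_HOME/bin/hdfs" dfs -mkdir -p "/user/$REGULAR_USER"
    run sudo -u "$HADOOP_USER" "$HADOOP_HOME/bin/hdfs" dfs -chown "$REGULAR_USER" "/user/$REGULAR_USER"
}

grant_local_access
grant_hdfs_access
# To work as the Hadoop user interactively: sudo su - "$HADOOP_USER"
```

Run it once as is to review the printed commands, then again with APPLY=1 if they look right for your machine. With the ACL in place, copyFromLocal and copyToLocal can use the same folder from either account.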

Cheers,
alex


On 05.08.2015 19:32, Arvind Sundararajan wrote:
>
> Hi All,
>
> I have a laptop running Ubuntu 14.04 LTS and am trying to install 
> hadoop 2.7.1 (current stable version) in pseudo-distributed mode.
>
> I have a regular user account on my laptop, but am unsure whether I 
> should install hadoop using a dedicated hadoop user on my laptop.
> NOTE: By 'regular user', I mean the linux user account that I use for 
> day-to-day personal work.
>
> The current hadoop documentation at [1] does not mention setting up a 
> dedicated user for hadoop installation.
>
> However, the hadoop installation tutorial at [2] mentions setting up a 
> dedicated user for hadoop installation in pseudo-distributed mode on a 
> single machine. This tutorial references an outdated hadoop 
> installation tutorial [3], which also mentions setting up a dedicated 
> user for hadoop installation in pseudo-distributed mode on a single 
> machine.
>
> I found several tutorials online which all seem to mention setting up 
> a dedicated user for hadoop installation in pseudo-distributed mode on a 
> single machine, without mentioning why we should set up a dedicated user.
>
> My questions are as follows:
>
> a) Is it possible for me to execute hadoop programs as a regular user 
> even if hadoop is installed in pseudo-distributed mode via a dedicated 
> 'hadoop' user?
> If yes, what linux filesystem folder permissions and HDFS permissions 
> do I need to give to the regular user for executing hadoop programs?
>
> b) Quoting from the outdated hadoop installation tutorial [3]:
>
>     "We will use a dedicated Hadoop user account for running Hadoop.
>     While that's not required it is recommended because it helps to separate
>     the Hadoop installation from other software applications and
>     user accounts running on the same machine
>     (think: security, permissions, backups, etc)."
>
> Can someone elaborate on this? What are the issues regarding security, 
> permissions, backups when running hadoop in pseudo-distributed mode on 
> a single laptop which will most likely have only one user account (my 
> current user account) ?
>
> c) Can someone please elaborate on the pros and cons of running hadoop 
> in pseudo-distributed mode on a single machine as the regular user 
> versus creating a dedicated user?
>
> My thoughts on the cons thus far have been:
>
>     i) If hadoop is unable to execute from a 'regular user' and
>     only works from the dedicated hadoop user account, then I
>     will have to edit my hadoop java programs from my
>     'regular user' account where I have my development environment
>     and IDE/text editor set up, copy the .jar files to the
>     dedicated hadoop user account and execute. If any error occurs,
>     I have to go back to the 'regular user' account, edit and
>     then copy the new .jar files and execute again. This moving
>     back and forth between accounts is a definite pain while
>     working in pseudo-distributed mode, and I have experienced
>     this while working with Hadoop 1.x.
>
>     ii) If hadoop is unable to execute from a 'regular user' and
>     only works from the dedicated hadoop user account, then
>     the hadoop operations copyFromLocal and copyToLocal will
>     require a shared folder for both user accounts.
>
> P.S. I also referred to [4] and [5] before asking this question.
>
> References:
>
> [1] 
> http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/SingleCluster.html
> [2] http://dogdogfish.com/big-data/installing-hadoop-2-4-on-ubuntu-14-04/
> [3] 
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
> [4] 
> http://stackoverflow.com/questions/20192140/hadoop-pseudo-distributed-mode-for-multiple-users
> [5] 
> http://stackoverflow.com/questions/23807486/hadoop-development-dedicated-user-in-ubuntu-how-to-access-hadoop-node-running
>

