Subject: Re: Reg: Setting up Hadoop Cluster
From: Geoffry Roberts <threadedblue@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 13 Mar 2014 17:37:46 -0400

Did you not populate the "slaves" file when you did your installation? In
older versions of Hadoop (< 2.0) there was a "master" file where you entered
your name node. Nowadays there can be multiple name nodes; I haven't worked
with them as of yet.
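
If it helps, a "slaves" file is just a list of worker hostnames, one per
line -- something like this (hostnames made up; in 1.x the file lives under
$HADOOP_HOME/conf, in 2.x under etc/hadoop):

  # conf/slaves -- one datanode/worker host per line
  slave1.example.com
  slave2.example.com
  slave3.example.com

The start-dfs.sh / start-mapred.sh scripts ssh to each host listed there to
bring the worker daemons up, which is how the cluster "knows" which box
plays which role.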

I installed Pig, for example, on my name node and ran it from there.
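
Running a script from there is nothing exotic -- something like this
(script name made up):

  pig -x mapreduce myscript.pig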


On Thu, Mar 13, 2014 at 5:22 PM, ados1984@gmail.com <ados1984@gmail.com> wrote:
> Thank you Geoffry,
>
> I have some fundamental questions here.
>
>    1. Once I have installed Hadoop, how can I identify which node is the
>    master and which are the slaves?
>    2. My understanding is that the master node is by default the namenode
>    and the slave nodes are datanodes, correct?
>    3. So once I have installed Hadoop, if I do not know which one is the
>    namenode and which one is the datanode, how can I go in and run my jar
>    from the namenode?
>    4. Also, when we do MapReduce programming, where do we write the
>    program: on the Hadoop server (where we have both the master/namenode
>    and the slaves/datanodes installed), or in our local system using any
>    standard IDE, then package it as a jar and deploy it to the name node?
>    But here again, how can I identify which is the name node and which is
>    the data node?
>    5. OK, assuming I have figured out which one is the data node and
>    which one is the namenode, how will my MapReduce program or Pig or
>    Hive scripts know whether to run on node 1, node 2, or node 3?
>    6. Also, where do we install Pig, Hive, and Flume: on the Hadoop
>    master/slave nodes or somewhere else? And how do we let Pig/Hive know
>    that node 1 is the master/namenode and the other nodes are
>    slaves/datanodes?
>
> I would really appreciate input on these questions, as setting up Hadoop
> is turning out to be quite a complex task from where I currently see it.
>
> Regards, Andy.


> On Thu, Mar 13, 2014 at 5:14 PM, Geoffry Roberts <threadedblue@gmail.com> wrote:
>> Andy,

>> Once you have Hadoop running, you can run your jobs from the CLI of the
>> name node. When I write a MapReduce job, I jar it up, place it in, say,
>> my home directory, and run it from there. I do the same with Pig
>> scripts. I've used neither Hive nor Cascading, but I imagine they would
>> work the same.
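>>
>> For example (jar name, main class, and paths are all made up):
>>
>>   hadoop fs -put input.txt /user/me/input/
>>   hadoop jar myjob.jar com.example.MyJob /user/me/input /user/me/output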

>> Another approach I've tried is WebHDFS. It's for manipulating HDFS via a
>> RESTful interface. It worked well enough for me. I stopped using it when
>> I discovered it didn't support MapFiles, but that's another story.
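>>
>> WebHDFS is plain HTTP, so curl is enough to poke at it -- e.g. (host and
>> paths made up; 50070 was the default name node HTTP port, and
>> dfs.webhdfs.enabled must be set to true):
>>
>>   curl -i "http://namenode.example.com:50070/webhdfs/v1/user/me?op=LISTSTATUS"
>>   curl -i -L "http://namenode.example.com:50070/webhdfs/v1/user/me/part-r-00000?op=OPEN"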


>> On Thu, Mar 13, 2014 at 5:00 PM, ados1984@gmail.com <ados1984@gmail.com> wrote:
>>> Hello Team,
>>>
>>> I have one question regarding putting data into HDFS and running
>>> MapReduce on data present in HDFS.
>>>
>>>    1. HDFS is a file system, so what kinds of clients are available to
>>>    interact with it? Also, where do we need to install those clients?
>>>    2. Regarding Pig, Hive, and MapReduce: where do we install them on
>>>    the Hadoop cluster, from where do we run all the scripts, and how
>>>    does it internally know that it needs to run on node 1, node 2, or
>>>    node 3?
>>>
>>> Any inputs here would be really helpful.
>>>
>>> Thanks, Andy.



>> --
>> There are ways and there are ways,
>>
>> Geoffry Roberts




--
There are ways and there are ways,

Geoffry Roberts