hadoop-mapreduce-user mailing list archives

From Geoffry Roberts <threadedb...@gmail.com>
Subject Re: Reg: Setting up Hadoop Cluster
Date Thu, 13 Mar 2014 21:37:46 GMT
Did you not populate the "slaves" file when you did your installation?  In
older versions of Hadoop (< 2.0), there was a "masters" file where you
entered your name node.  Nowadays there can be multiple name nodes; I haven't
worked with that setup yet.
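
To give a rough idea (the host names below are just placeholders), in a 2.x
install the slaves file usually lives at $HADOOP_HOME/etc/hadoop/slaves and
simply lists one data node host per line, something like:

  datanode1.example.com
  datanode2.example.com
  datanode3.example.com

Whichever machine fs.defaultFS in core-site.xml points at (and where you
start the NameNode daemon) is your name node; the hosts listed in the slaves
file are where start-dfs.sh starts the data node daemons.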

I installed Pig, for example, on my name node and ran it from there.


On Thu, Mar 13, 2014 at 5:22 PM, ados1984@gmail.com <ados1984@gmail.com> wrote:

> Thank you Geoffry,
>
> I have some fundamental question here.
>
>    1. Once I have installed Hadoop, how can I identify which node is the
>    master node and which are the slaves?
>    2. My understanding is that the master node is by default the namenode
>    and the slave nodes are the datanodes, correct?
>    3. If I have installed Hadoop and I do not know which one is the
>    namenode and which one is the datanode, then how can I go in and run my
>    jar from the namenode?
>    4. Also, when we do MapReduce programming, where do we write the
>    program: on the Hadoop server (where both the master/namenode and
>    slave/datanode nodes are installed), or in our local system using any
>    standard IDE, then package it as a jar and deploy it to the name node?
>    But here again, how can I identify which is the name node and which is
>    the data node?
>    5. OK, assuming I have figured out which one is the data node and which
>    one is the namenode, how will my MapReduce program or Pig or Hive
>    scripts know that they need to run on node 1, node 2, or node 3?
>    6. Also, where do we install Pig, Hive, and Flume: on the Hadoop
>    master/slave nodes or somewhere else? And how do we let Pig/Hive know
>    that node 1 is the master/namenode and the other nodes are slaves or
>    data nodes?
>
> I would really appreciate inputs on these questions, as setting up Hadoop
> is turning out to be quite a complex task from where I currently stand.
>
> Regards, Andy.
>
>
> On Thu, Mar 13, 2014 at 5:14 PM, Geoffry Roberts <threadedblue@gmail.com> wrote:
>
>> Andy,
>>
>> Once you have Hadoop running, you can run your jobs from the CLI of the
>> name node. When I write a MapReduce job, I jar it up and place it in,
>> say, my home directory and run it from there.  I do the same with Pig
>> scripts.  I've used neither Hive nor Cascading, but I imagine they would
>> work the same.
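>>
>> For example (the jar, class name, and paths below are just placeholders),
>> a typical run from the name node's shell looks something like:
>>
>>   hadoop jar ~/wordcount.jar com.example.WordCount /input /output
>>   pig -f ~/myscript.pig
>>
>> The job client submits the jar to the cluster for you; you never pick
>> which data node the tasks land on.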
>>
>> Another approach I've tried is WebHDFS.  It's for manipulating HDFS via
>> a RESTful interface.  It worked well enough for me.  I stopped using it
>> when I discovered it didn't support MapFiles, but that's another story.
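>>
>> In case it helps, WebHDFS calls are plain HTTP against the name node (the
>> host and path below are placeholders; 50070 is the usual default HTTP
>> port in 2.x):
>>
>>   curl -i "http://namenode.example.com:50070/webhdfs/v1/user/andy?op=LISTSTATUS"
>>
>> There are similar op= parameters for reading and creating files.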
>>
>>
>> On Thu, Mar 13, 2014 at 5:00 PM, ados1984@gmail.com <ados1984@gmail.com> wrote:
>>
>>> Hello Team,
>>>
>>> I have a question regarding putting data into HDFS and running
>>> MapReduce on data present in HDFS.
>>>
>>>    1. HDFS is a file system, so what kinds of clients are available to
>>>    interact with it? Also, where do we need to install those clients?
>>>    2. Regarding Pig, Hive, and MapReduce: where do we install them on
>>>    the Hadoop cluster, from where do we run all the scripts, and how do
>>>    they internally know that they need to run on node 1, node 2, or node 3?
>>>
>>> Any inputs here would be really helpful.
>>>
>>> Thanks, Andy.
>>>
>>
>>
>>
>> --
>> There are ways and there are ways,
>>
>> Geoffry Roberts
>>
>
>


-- 
There are ways and there are ways,

Geoffry Roberts
