hbase-user mailing list archives

From Yanbo Liang <yanboha...@gmail.com>
Subject Re: Configuring Hadoop, HBase and Hive Cluster
Date Tue, 13 Nov 2012 03:54:42 GMT
I recommend deploying the master daemons of HDFS, MapReduce and HBase on
different servers, which should give better performance.
An example scenario is:

1. Deploy ZooKeeper either on every server or on server1, server2 and
server3, so that the ZooKeeper ensemble has an odd number of members.
2. Deploy the HDFS NameNode, backup NameNode, MapReduce JobTracker, HBase
Master and backup HBase Master separately, one on each of the five servers.
3. Run a DataNode and a TaskTracker on every server.
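
The three steps above can be sketched as a small role map. This is only an
illustration: the host names server1..server5 and the choice of which master
daemon lands on which server are assumptions, not something the steps
prescribe. The steps do not say where RegionServers go, so the sketch leaves
them out.

```python
# Hypothetical host names; the thread only says there are five servers.
SERVERS = ["server1", "server2", "server3", "server4", "server5"]

# One master-type daemon per server (step 2). The order is an assumption.
MASTERS = [
    "NameNode",
    "backup NameNode",
    "JobTracker",
    "HBase Master",
    "backup HBase Master",
]


def build_layout(servers):
    """Return {server: [daemons]} following the three steps above."""
    if len(servers) != len(MASTERS):
        raise ValueError("this sketch assumes exactly five servers")
    layout = {server: [] for server in servers}
    # Step 1: ZooKeeper on the first three servers (an odd-sized ensemble).
    for server in servers[:3]:
        layout[server].append("ZooKeeper")
    # Step 2: spread the master daemons, one per server.
    for server, master in zip(servers, MASTERS):
        layout[server].append(master)
    # Step 3: a DataNode and a TaskTracker on every server.
    for server in servers:
        layout[server] += ["DataNode", "TaskTracker"]
    return layout


if __name__ == "__main__":
    for server, daemons in build_layout(SERVERS).items():
        print(f"{server}: {', '.join(daemons)}")
```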

This layout follows the deployment scenario of the Facebook Messages system:
http://www.slideshare.net/parallellabs/sigmod-realtime-hadooppresentation
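
On the odd-sized ZooKeeper ensemble in step 1: ZooKeeper needs a strict
majority of its members to be up, so adding a fourth node to a three-node
ensemble does not let it survive any more failures. A quick sketch of that
arithmetic (the function names are just for illustration):

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many members can fail while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} node(s): quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

A standalone ZooKeeper node (ensemble size 1) works, but it tolerates no
failures at all.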

2012/11/13 Dalia Sobhy <dalia.mohsobhy@hotmail.com>

> I do advise you to use Cloudera Manager; it's a very simple, open-source
> cluster configuration tool.
>
> A good design is to run ZooKeeper on node1, node2 and one other node.
>
> On 2012-11-13, at 2:04 AM, "Hakan Bogay" <h.bogay@googlemail.com> wrote:
>
> > Hi,
> >
> > I am a newbie to Hadoop, HBase and Hive. I installed Hadoop, HBase and Hive
> > in pseudo-distributed mode and everything works fine. Now I am planning to
> > set up a simple Hadoop cluster (5 nodes) with Hive, HBase and ZooKeeper.
> > I've read several guides and instructions, but I could not
> > find a good explanation for my question. I'm not sure where to run all the
> > daemons. This is my plan:
> >
> > *Node_1* (Master)
> >
> >   - NameNode
> >   - JobTracker
> >   - HBase Master
> >   - ZooKeeper (Standalone node; managed by HBase)
> >
> > *Node_2* (Backup_Master)
> >
> >   - SecondaryNameNode
> >
> > *Node_3* (Slave1)
> >
> >   - DataNode1
> >   - TaskTracker1
> >   - RegionServer1
> >
> > *Node_4* (Slave2)
> >
> >   - DataNode2
> >   - TaskTracker2
> >   - RegionServer2
> >
> > *Node_5* (Slave3)
> >
> >   - DataNode3
> >   - TaskTracker3
> >   - RegionServer3
> >
> > I know that in production it is recommended to run a ZooKeeper ensemble on
> > an odd number of nodes (a separate cluster). But for a simple cluster, is it
> > OK to set up a standalone ZooKeeper node that runs on the master node?
> > Another question concerns Hive: I know that Hive is a Hadoop client.
> > Should I also install Hive on the master node? Does it make sense?
> >
> > Thanks for all tips and comments!
> >
> > Hakan
> >
> > Note: I have just 5 machines to simulate a cluster.
>
