hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1961) HBase EC2 scripts
Date Thu, 19 Nov 2009 23:31:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780328#action_12780328 ]

Andrew Purtell commented on HBASE-1961:

Feedback up on hbase-user@ from Naresh Rapolu:

Your scripts are working fine. We restarted everything and tested, and they are working
fine. A few issues though:
- While starting, launch-hbase-cluster gives the following error:
  error: "fs.epoll.max_user_instance" is an unknown key
  It occurs while starting ZooKeeper.
- We needed MapReduce along with HBase. The note on the JIRA page that you need to
  add only two lines to hbase-ec2-env.sh is insufficient.
  The following changes need to be made:
  1. hbase-ec2-env.sh should write the mapred.job.tracker property into hadoop-site.xml.
     (Also, shouldn't you have core-site.xml and hdfs-site.xml, as it is hadoop-0.20.1?
     In fact, because of this, there are warning messages all over the place when you use
     HDFS through the command line.)
  2. HADOOP_CLASSPATH in hadoop/conf/hadoop-env.sh needs to be changed in the underlying
     AMI to include the HBase and ZooKeeper jars and the conf directory. You can probably
     modify the public AMI and recreate the bundle, as the paths to these are known a priori.
  3. For other users, the following three lines should be added to hbase-ec2-env.sh:
       For master:
       "$HADOOP_HOME"/bin/hadoop-daemon.sh start jobtracker
       "$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker
       For slave:
       "$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker

Incorporate these suggestions. 
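
The last two suggestions could be sketched roughly as follows. This is only an
illustration, not the contents of the attached scripts: the function names and the
jar layout under the HBase home directory are assumptions. Only the hadoop-daemon.sh
invocations come from the feedback itself.

```shell
#!/bin/sh
# Illustrative sketch only; function names and jar locations are assumptions.

# Print a HADOOP_CLASSPATH value covering the HBase conf directory plus any
# HBase and ZooKeeper jars found under the given HBase home directory.
build_hadoop_classpath() {
  hbase_home="$1"
  classpath="$hbase_home/conf"
  for jar in "$hbase_home"/hbase-*.jar "$hbase_home"/lib/zookeeper-*.jar; do
    # An unmatched glob stays literal, so skip anything that does not exist.
    if [ -e "$jar" ]; then
      classpath="$classpath:$jar"
    fi
  done
  printf '%s\n' "$classpath"
}

# Print the MapReduce daemons to start for a node role, per the feedback:
# jobtracker and tasktracker on the master, tasktracker on each slave.
mapreduce_daemons_for() {
  case "$1" in
    master) echo "jobtracker tasktracker" ;;
    slave)  echo "tasktracker" ;;
    *)      echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Start the daemons for the given role using Hadoop's own daemon script.
start_mapreduce() {
  daemons=$(mapreduce_daemons_for "$1") || return 1
  for daemon in $daemons; do
    "$HADOOP_HOME"/bin/hadoop-daemon.sh start "$daemon"
  done
}
```

With something like this in place, hadoop-env.sh could export the output of
build_hadoop_classpath "$HBASE_HOME" as HADOOP_CLASSPATH, and the init script
could call start_mapreduce master or start_mapreduce slave depending on the node.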

bq. error:  "fs.epoll.max_user_instance"  is an unknown key

This is a bit of future-proofing. That's not a known sysctl key until kernel 2.6.27, at which
point oddly low epoll user descriptor limits go into effect. See
http://pero.blogs.aprilmayjune.org/2009/01/22/hadoop-and-linux-kernel-2627-epoll-limits/
for background. At some point there may be a 2.6.27-based AKI. I could redirect the message
to /dev/null, but then some other, more serious potential problem with sysctl would be hidden.
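
One alternative to discarding the message would be to probe for the key before
setting it, so that genuine sysctl failures still surface. A minimal sketch, assuming
the usual mapping of dotted sysctl keys onto paths under /proc/sys. The function
names and the PROC_SYS_ROOT override are hypothetical and not part of the attached
scripts:

```shell
#!/bin/sh
# Illustrative sketch only; names here are assumptions, not the real scripts.

# Succeed if the running kernel exposes the given sysctl key.
# A key such as a.b.c corresponds to the path a/b/c under /proc/sys.
# PROC_SYS_ROOT can be overridden so the check can be exercised without root.
sysctl_key_known() {
  [ -e "${PROC_SYS_ROOT:-/proc/sys}/$(printf '%s' "$1" | tr '.' '/')" ]
}

# Apply a setting only when the key exists; otherwise say so and carry on.
# Errors from sysctl itself are not suppressed.
set_sysctl_if_known() {
  if sysctl_key_known "$1"; then
    sysctl -w "$1=$2"
  else
    echo "skipping $1: not a known key on kernel $(uname -r)" >&2
  fi
}
```

On an older kernel, set_sysctl_if_known prints a short note for the epoll key and
returns successfully, while any real failure reported by sysctl remains visible.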

bq. Also shouldnt you be having  core-site.xml and hdfs-site.xml  as it is  hadoop-0.20.1

Yes. What I did for this initial work is adapt the Hadoop EC2 scripts, which target 0.19.
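
For reference, Hadoop 0.20 splits the old hadoop-site.xml into core-site.xml,
hdfs-site.xml, and mapred-site.xml, and mapred.job.tracker belongs in the last of
these. A rough sketch of how an init script could emit such files follows. The helper
names, the port numbers, and the choice of output directory are assumptions for
illustration, not what the attached scripts do:

```shell
#!/bin/sh
# Illustrative sketch only; helper names and port numbers are assumptions.

# Write a minimal Hadoop site file containing a single property.
write_site_file() {
  file="$1"
  name="$2"
  value="$3"
  cat > "$file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$name</name>
    <value>$value</value>
  </property>
</configuration>
EOF
}

# Emit 0.20-style site files for a cluster whose namenode and jobtracker
# both run on the given master host.
write_mapreduce_conf() {
  conf_dir="$1"
  master_host="$2"
  mkdir -p "$conf_dir"
  write_site_file "$conf_dir/core-site.xml" fs.default.name "hdfs://$master_host:8020"
  write_site_file "$conf_dir/mapred-site.xml" mapred.job.tracker "$master_host:8021"
}
```

A call such as write_mapreduce_conf "$HADOOP_HOME/conf" "$MASTER_HOST" would then
replace the single hadoop-site.xml and should avoid the deprecation warnings.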

> HBase EC2 scripts
> -----------------
>                 Key: HBASE-1961
>                 URL: https://issues.apache.org/jira/browse/HBASE-1961
>             Project: Hadoop HBase
>          Issue Type: New Feature
>         Environment: Amazon AWS EC2
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 0.21.0, 0.20.3
>         Attachments: ec2-contrib.tar.gz
> Attached tarball is a clone of the Hadoop EC2 scripts, modified significantly to start
> up a HBase storage only cluster on top of HDFS backed by instance storage.
> Tested with the HBase 0.20 branch but should work with trunk also. Only the AMI create
> and launch scripts are tested. Will bring up a functioning HBase cluster.
> Do "create-hbase-image c1.xlarge" to create an x86_64 AMI, or "create-hbase-image c1.medium"
> to create an i386 AMI.  Public Hadoop/HBase 0.20.1 AMIs are available:
>     i386: ami-c644a7af
>     x86_64: ami-f244a79b
> launch-hbase-cluster brings up the cluster: First, a small dedicated ZK quorum, specifiable
> in size, default of 3. Then, the DFS namenode (formatting on first boot) and one datanode
> and the HBase master. Then, a specifiable number of slaves, instances running DFS datanodes
> and HBase region servers.  For example:
> {noformat}
>     launch-hbase-cluster testcluster 100 5
> {noformat}
> would bring up a cluster with 100 slaves supported by a 5 node ZK ensemble.
> We must colocate a datanode with the namenode because currently the master won't tolerate
> a brand new DFS with only namenode and no datanodes up yet. See HBASE-1960. By default the
> launch scripts provision ZooKeeper as c1.medium and the HBase master and region servers as
> c1.xlarge. The result is a HBase cluster supported by a ZooKeeper ensemble. ZK ensembles are
> not dynamic, but HBase clusters can be grown by simply starting up more slaves, just like
> hbase-ec2-init-remote.sh can be trivially edited to bring up a jobtracker on the master
> node and task trackers on the slaves.
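
The argument handling implied by the launch-hbase-cluster example in the description
might look like the following. This is a guess at the shape of the interface, not the
actual launch-hbase-cluster source. The function and variable names are hypothetical,
and only the default quorum size of 3 comes from the description.

```shell
#!/bin/sh
# Illustrative sketch only; the real launch-hbase-cluster may differ.

# Parse "<cluster-name> <num-slaves> [<zk-quorum-size>]" into variables,
# falling back to a ZooKeeper quorum of 3 when no size is given.
parse_launch_args() {
  if [ "$#" -lt 2 ]; then
    echo "usage: launch-hbase-cluster <cluster-name> <num-slaves> [<zk-quorum-size>]" >&2
    return 1
  fi
  CLUSTER="$1"
  NUM_SLAVES="$2"
  ZK_QUORUM_SIZE="${3:-3}"
}
```

With these names, parse_launch_args testcluster 100 5 would select 100 slaves and a
5 node ensemble, and omitting the last argument would select the default of 3.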

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
