hbase-dev mailing list archives

From "Nitay Joffe (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum
Date Thu, 02 Jul 2009 06:34:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726349#action_12726349 ]

Nitay Joffe commented on HBASE-1551:

@andrew, see ZOOKEEPER-107 and possibly ZOOKEEPER-29.

@stack/cwensel, HBase's management of ZK is controlled from the bash environment. There's a setting
in hbase-env.sh called HBASE_MANAGES_ZK which toggles whether HBase will start/stop ZK.

@stack/andrew/cwensel, yes, the iterator thing is along the lines of what I was thinking of.
Here's my current thinking:
- Move all of the ZooKeeper config parameters into hbase-*.xml using zookeeper.property.KEY
- Add a special property for the list of quorum servers, say zookeeper.quorum. This option
can default to "localhost".
- If there is a zoo.cfg present in the classpath, its values take precedence over the zookeeper.property.KEY settings.
- When we need to instantiate something to talk to ZooKeeper, we simply create a new HBaseConfiguration
and call some method on it, e.g. toZooKeeperProperties().
This method will iterate through the zookeeper.property.KEY entries and turn each into the
corresponding ZooKeeper configuration line (i.e. KEY=VALUE). It will generate
the server.X properties from the zookeeper.quorum configuration option. As mentioned above,
if there is a zoo.cfg in the classpath, its configuration overwrites this data.
The method returns a Properties object that can be used to construct the appropriate ZooKeeper
config and to start/talk to the ZooKeeper servers.
- For start/stop management of a full ZK quorum cluster, use something like my ZKServerTool
in the patch (modified of course) to do the parsing mentioned above and turn it
into a simple line-by-line list of quorum servers. As I do in this patch, bin/zookeepers.sh
can then simply call bin/hbase o.a.h.h.z.ZKServerTool to get the list of hosts.
If you want something like a conf/zookeepers file, you can simply run ZKServerTool yourself.
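
To make the proposal above concrete, here is a rough sketch of the conversion. This is not the code in the patch. A plain Map stands in for HBaseConfiguration, the class name ZkConfigSketch and the method quorumHosts() are made up for illustration, and the peer/election ports and the server ids starting at 0 are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class ZkConfigSketch {
    static final String PROPERTY_PREFIX = "zookeeper.property.";
    static final String QUORUM_KEY = "zookeeper.quorum";

    /**
     * Builds ZooKeeper properties from HBase-style settings.
     * hbaseConf is a stand-in for HBaseConfiguration (assumption).
     * zooCfg holds the contents of a zoo.cfg found on the classpath, or null.
     */
    static Properties toZooKeeperProperties(Map<String, String> hbaseConf, Properties zooCfg,
                                            int peerPort, int electionPort) {
        Properties zk = new Properties();
        // zookeeper.property.KEY=VALUE becomes KEY=VALUE
        for (Map.Entry<String, String> e : hbaseConf.entrySet()) {
            if (e.getKey().startsWith(PROPERTY_PREFIX)) {
                zk.setProperty(e.getKey().substring(PROPERTY_PREFIX.length()), e.getValue());
            }
        }
        // zookeeper.quorum (comma-separated hosts, default "localhost") becomes server.X entries
        String quorum = hbaseConf.getOrDefault(QUORUM_KEY, "localhost");
        int id = 0;
        for (String host : quorum.split(",")) {
            host = host.trim();
            if (!host.isEmpty()) {
                zk.setProperty("server." + id++, host + ":" + peerPort + ":" + electionPort);
            }
        }
        // A zoo.cfg, if present, overwrites whatever was derived above
        if (zooCfg != null) {
            zk.putAll(zooCfg);
        }
        return zk;
    }

    /** Line-by-line host list, ordered by server id, like what ZKServerTool would print. */
    static List<String> quorumHosts(Properties zk) {
        TreeMap<Integer, String> byId = new TreeMap<>();
        for (String key : zk.stringPropertyNames()) {
            if (key.startsWith("server.")) {
                int id = Integer.parseInt(key.substring("server.".length()));
                String value = zk.getProperty(key);
                int colon = value.indexOf(':');
                byId.put(id, colon < 0 ? value : value.substring(0, colon));
            }
        }
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("zookeeper.property.tickTime", "2000");
        conf.put("zookeeper.property.clientPort", "2181");
        conf.put(QUORUM_KEY, "zk1.example.org,zk2.example.org,zk3.example.org");
        Properties zk = toZooKeeperProperties(conf, null, 2888, 3888);
        // One host per line, suitable for a shell loop in something like bin/zookeepers.sh
        for (String host : quorumHosts(zk)) {
            System.out.println(host);
        }
    }
}
```

The blanket overwrite by zoo.cfg follows the description above literally. If zoo.cfg lists fewer servers than zookeeper.quorum, the result would mix both sets of server.X entries, so a real implementation would probably want to drop the generated ones in that case.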

The benefits from all this are:
- One place for all ZK configuration. No duplicate setting of parameters.
- No more nasty zoo.cfg. Give the user what they're already used to, a single XML config file.
- A new user only needs to edit zookeeper.quorum to get a full cluster.
- Programmable control of what ZK one is talking to.

That's all I can think of for now.


> HBase should manage multiple node ZooKeeper quorum
> --------------------------------------------------
>                 Key: HBASE-1551
>                 URL: https://issues.apache.org/jira/browse/HBASE-1551
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Nitay Joffe
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>         Attachments: hbase-1551.patch, zookeeper-edits.patch, zookeeper-r790255-hbase-1329-hbase-1551.jar
> I thought there was already a JIRA for this, but I cannot seem to find it.
> We need to manage multiple node ZooKeeper quorums (required for fully distributed option)
in HBase to make things easier for users.
> Here's relevant IRC conversation with Ryan and Andrew:
> {code}
> Jun 17 18:14:39 <dj_ryan>	right now we include our client deps in hbase/lib
> Jun 17 18:14:47 <dj_ryan>	so removing zookeeper would be problematic
> Jun 17 18:14:56 <dj_ryan>	but hbase does put up a private zk quorum
> Jun 17 18:15:02 <dj_ryan>	it just doesnt bother with q>1
> Jun 17 18:15:05 <apurtell>	dj_ryan, nitay: agreed, so that's why i wonder about
a private zk quorum managed by hbase
> Jun 17 18:15:12 <apurtell>	q ~= 5
> Jun 17 18:15:22 <dj_ryan>	so maybe we should ship tools to manage it
> Jun 17 18:15:23 <apurtell>	if possible
> Jun 17 18:15:29 <dj_ryan>	i can agree with that
> Jun 17 18:15:39 <nitay>	apurtell, ok, i'd be happy to bump the priority of hbase
managing full cluster and work on that
> Jun 17 18:15:47 *	iand (n=iand@ has joined #hbase
> Jun 17 18:15:48 <apurtell>	nitay: that would be awesome
> Jun 17 18:15:57 <apurtell>	then i can skip discussions with cloudera about including
zk also
> Jun 17 18:16:12 <apurtell>	and we can use some private ports that won't conflict
with a typical zk install
> Jun 17 18:16:15 <nitay>	but i also think that users should be able to point at
existing clusters, so as long as your rpms are compatible, it should be fine
> Jun 17 18:16:23 <nitay>	apurtell, isn't hadoop going to start using ZK
> Jun 17 18:16:31 <apurtell>	nitay: agree, but this is the cloudera-autoconfig-rpm
(and deb) case
> Jun 17 18:16:34 <nitay>	the cloudera dude was working on using it for namenode
whatnot like we do for master
> Jun 17 18:16:35 <dj_ryan>	so there are only 2 things
> Jun 17 18:16:38 <dj_ryan>	- set up myids
> Jun 17 18:16:38 <nitay>	what are they doing for that
> Jun 17 18:16:40 <dj_ryan>	- start zk
> Jun 17 18:16:42 <dj_ryan>	- stop zk
> Jun 17 18:16:50 <dj_ryan>	we dont want to start/stop zk just when we are doing
a cluster bounce
> Jun 17 18:16:51 <nitay>	ye stupid myids
> Jun 17 18:16:52 <dj_ryan>	you start it once
> Jun 17 18:16:54 <dj_ryan>	and be done with ti
> Jun 17 18:16:58 *	iand (n=iand@ has left #hbase ("Leaving.")
> Jun 17 18:17:13 <apurtell>	dj_ryan: yes, start it once. that's what i do. works
fine through many hbase restarts...
> Jun 17 18:17:28 <nitay>	so then we need a separate shell cmd or something to stop
> Jun 17 18:17:35 <nitay>	and start on start-hbase if not already running type thing
> Jun 17 18:17:43 <dj_ryan>	yes
> Jun 17 18:17:58 <nitay>	ok
> Jun 17 18:18:19 <apurtell>	with quorum peers started on nodes in conf/regionservers,
up to ~5 if possible
> Jun 17 18:18:37 <apurtell>	but what about zoo.cfg?
> Jun 17 18:18:51 <nitay>	oh i was thinking of having separate conf/zookeepers
> Jun 17 18:18:58 <apurtell>	nitay: even better
> Jun 17 18:18:59 <nitay>	but we can use first five RS too
> Jun 17 18:19:26 <nitay>	apurtell, yeah so really there wouldnt be a conf/zookeepers,
i would rip out hostnames from zoo.cfg
> Jun 17 18:19:38 <nitay>	or go the other way, generate zoo.cfg from conf/zookeepers
> Jun 17 18:19:42 <nitay>	gotta do one or the other
> Jun 17 18:19:49 <nitay>	dont want to have to edit both
> Jun 17 18:19:54 <apurtell>	nitay: right
> Jun 17 18:20:21 <apurtell>	well...
> Jun 17 18:20:29 <nitay>	zoo.cfg has the right info right now, cause u need things
other than just hostnames, i.e. client and quorum ports
> Jun 17 18:20:31 <apurtell>	we can leave out servers from our default zoo.cfg
> Jun 17 18:20:39 <apurtell>	and consider a conf/zookeepers
> Jun 17 18:20:47 <dj_ryan>	i call it conf/zoos
> Jun 17 18:20:54 <dj_ryan>	in my zookeeper config
> Jun 17 18:20:54 <dj_ryan>	dir
> Jun 17 18:20:57 <nitay>	and then have our parsing of zoo.cfg insert them
> Jun 17 18:21:08 <nitay>	cause right now its all off java Properties anyways
> Jun 17 18:21:12 <apurtell>	and let the zk wrapper parse the files if they exist
and otherwise build the list of quorum peers like it does already
> Jun 17 18:21:34 <apurtell>	so someone could edit either and it would dtrt
> Jun 17 18:21:48 <nitay>	apurtell, yeah, makes sense
> Jun 17 18:21:58 <nitay>	we can discuss getting rid of zoo.cfg completely
> Jun 17 18:22:12 <nitay>	put it all in XML and just create a Properties for ZK off
the right props
> Jun 17 18:22:14 <apurtell>	for my purposes, i just need some files available for
a post install script to lay down a static hbase cluster config based on what it discovers
about the hadoop installation
> Jun 17 18:23:56 <apurtell>	then i need to hook sysvinit and use chkconfig to enable/disable
services on the cluster nodes according to their roles defined by hadoop/conf/masters and
> Jun 17 18:24:13 <apurtell>	so we put the hmaster on the namenode
> Jun 17 18:24:17 <apurtell>	and the region servers on the datanodes
> Jun 17 18:24:35 <apurtell>	hadoop/conf/slaves i mean
> Jun 17 18:24:44 <apurtell>	and pick N hosts out of slaves to host the zk quorum
> Jun 17 18:24:50 <apurtell>	make sense?
> Jun 17 18:25:33 <nitay>	yes i think so, and u'll be auto generating the hbase configs
for what servers run what then?
> Jun 17 18:25:50 <apurtell>	nitay: yes
> Jun 17 18:25:51 <nitay>	which is why a simple line by line conf/zookeepers type
file is clean and easy
> Jun 17 18:25:57 <apurtell>	nitay: agree
> Jun 17 18:25:59 <apurtell>	so i think my initial question has been answered, hbase
will manage a private zk ensemble
> Jun 17 18:26:07 <apurtell>	... somehow
> Jun 17 18:26:10 <nitay>	right :)
> Jun 17 18:26:15 <apurtell>	ok, thanks
> {code}

