hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: region doesn't split after 32+ GB
Date Wed, 29 Sep 2010 19:22:58 GMT
Matt,

Since you are already using ZooKeeper, could you conceivably keep a hosts file somewhere in
ZooKeeper, use an update strategy similar to the one used for implementing locks to ensure
that a new slave fetches and updates the latest version "atomically", and use Twitcher to
trigger updates on each host?
   http://github.com/twitter/twitcher
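
To make the "atomic" update idea concrete, here is a minimal sketch of the read, modify, conditional-write loop. It is not real ZooKeeper client code: VersionedNode is a hypothetical in-memory stand-in for a single znode, and add_host and BadVersion are illustrative names. It also uses a version check (in the spirit of ZooKeeper's conditional setData with an expected version) rather than the lock recipe mentioned above, so treat it as one possible approach under those assumptions.

```python
import threading


class BadVersion(Exception):
    """Raised when the caller's expected version is stale."""


class VersionedNode:
    """Hypothetical in-memory stand-in for one znode holding the hosts data."""

    def __init__(self, data=""):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        # Return the current contents together with their version number.
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        # Conditional write: succeed only if nobody wrote since we read.
        with self._lock:
            if expected_version != self._version:
                raise BadVersion(expected_version)
            self._data = data
            self._version += 1
            return self._version


def add_host(node, ip, hostname, max_retries=50):
    """Append 'ip hostname' to the shared hosts data, retrying on conflict."""
    entry = f"{ip} {hostname}"
    for _ in range(max_retries):
        data, version = node.get()
        lines = data.splitlines()
        if entry in lines:
            return version  # already registered, nothing to write
        lines.append(entry)
        try:
            return node.set("\n".join(lines) + "\n", version)
        except BadVersion:
            continue  # another slave won the race; re-read and retry
    raise RuntimeError("gave up after repeated version conflicts")
```

Because every writer re-reads after a conflict, several slaves launched at the same time each end up appending to the latest copy instead of overwriting one another. Pushing the resulting data out to each host's local hosts file would be the job of whatever watches the node (Twitcher, in the suggestion above).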

Best regards,

    - Andy


> From: Matt Corgan <mcorgan@hotpads.com>
> Subject: Re: region doesn't split after 32+ GB
> To: "user" <user@hbase.apache.org>
> Date: Wednesday, September 29, 2010, 11:30 AM
> Thanks for your help again, Stack... Sorry, I don't have the logs. I will
> do a better job of saving them. By the way, this time the insert job
> maintained about 22k rows/sec all night without any pauses, and even
> though it was sequential insertion, it did a nice job of rotating the
> active region around the cluster.
>
> As for the hostnames, there are no problems in .89, and nothing is onerous
> by any means... we are just trying to come to some level of familiarity
> before putting any real data into HBase.
>
> EC2/RightScale make it very easy to add/remove regionservers to the
> cluster with the click of a button, which is the reason that the hosts
> file can change more often than you'd want to modify it manually. We're
> going to go the route of having each newly added regionserver append its
> name to the hosts file of every other server in our EC2 account (~30
> servers). The only downsides I see there are that it doesn't scale very
> elegantly, and that it gets complicated if you want to launch multiple
> regionservers or new clients at the same time.
>
> For the sake of brainstorming, maybe it's possible to have the master
> always broadcast IP addresses and have all communication done via IP.
> This may be more robust anyway. Then the first time a new regionserver or
> client gets an unfamiliar IP address, it can try to figure out the
> hostname (the same way the master currently does this) and cache it
> somewhere. The hostname could be added alongside the IP address or
> replace it in the logs for convenience.
> 
> Thanks again,
> Matt
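
The brainstormed idea in the quoted message (communicate by IP, resolve an unfamiliar IP to a hostname once, cache it, and show the name alongside the IP in logs) could look roughly like the sketch below. HostnameCache and its methods are hypothetical names, not anything in HBase, and the reverse lookup via Python's socket.gethostbyaddr is only an assumption about how resolution might be done; the master's actual lookup is not shown in this thread.

```python
import socket


class HostnameCache:
    """Resolve each IP at most once and remember the result (hypothetical helper)."""

    def __init__(self, lookup=socket.gethostbyaddr):
        # lookup must behave like socket.gethostbyaddr:
        # return (hostname, aliases, addresses) or raise OSError.
        self._lookup = lookup
        self._names = {}

    def name_for(self, ip):
        if ip not in self._names:
            try:
                hostname = self._lookup(ip)[0]
            except OSError:
                # Resolution failed: fall back to the bare IP so that
                # communication by IP keeps working without a hostname.
                hostname = ip
            self._names[ip] = hostname
        return self._names[ip]

    def label(self, ip):
        """Format an address for logs as 'ip (hostname)' when a name is known."""
        name = self.name_for(ip)
        return ip if name == ip else f"{ip} ({name})"
```

Note that this sketch also caches failed lookups, so a host whose name only becomes resolvable later would keep showing up as a bare IP until the cache is cleared; whether that trade-off is acceptable depends on how often names change.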




      

