hbase-user mailing list archives

From Ruben Quintero <rfq_...@yahoo.com>
Subject Re: HBase & MapReduce & Zookeeper
Date Thu, 28 Jul 2011 20:16:06 GMT
This problem has come up a few times. There are leaked connections in the TIF
(TableInputFormat).

A quick and (very) dirty solution is to call
HConnectionManager.deleteAllConnections(boolean) at the end of your MapReduce
jobs, or periodically. If you have no other tables, pools, etc. open, then no
problem. If you do, they'll start throwing IOExceptions, but you can
re-instantiate them with a new config and then continue as usual. (You do have
to change the config, or it'll simply grab the closed, cached one from the HCM.)
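
To make the trade-off concrete, here is a toy model in plain Java. It is not HBase code: ToyConnections, Handle, and runJob are hypothetical names, and deleteAll only plays the role that deleteAllConnections plays above. It sketches the pattern of cleaning up in a finally block, and shows why any handle that was still open elsewhere starts throwing IOExceptions afterwards.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a process-wide connection cache. NOT HBase code.
class ToyConnections {
    // Hypothetical handle, standing in for an open table or pool.
    static final class Handle {
        private boolean closed = false;

        String get(String row) throws IOException {
            if (closed) {
                throw new IOException("connection closed");
            }
            return "value-for-" + row;
        }
    }

    private static final List<Handle> OPEN = new ArrayList<>();

    static Handle open() {
        Handle h = new Handle();
        OPEN.add(h);
        return h;
    }

    // Plays the role of deleteAllConnections: closes every handle,
    // including ones that other code is still holding.
    static void deleteAll() {
        for (Handle h : OPEN) {
            h.closed = true;
        }
        OPEN.clear();
    }

    static int openCount() {
        return OPEN.size();
    }

    // Runs a "job" and always cleans up, even if the job throws.
    static void runJob(Runnable job) {
        try {
            job.run();
        } finally {
            deleteAll();
        }
    }
}
```

In this model, a handle opened before runJob fails on its next get, and calling open() again gives a working one. That mirrors the "re-instantiate and continue" step described above.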

Another way: The leak comes from inside of TableInputFormat.setConf, where the 
Configuration gets cloned (so then its hash in the HCM is lost):
    setHTable(new HTable(new Configuration(conf), tableName)); 

This is done to prevent changes to a config from affecting the job and 
vice-versa. If you're 100% sure the config won't be modified, you could subclass 
TIF to not make this copy.
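
A toy model of why the copy leaks, in plain Java. This is a sketch, not HBase code: it assumes the connection cache is keyed by the identity of the config object (which is what "its hash in the HCM is lost" suggests), and ToyConnectionCache, Conf, setConfWithCopy, and setConfWithoutCopy are hypothetical names.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Properties;

// Toy stand-in for a connection cache keyed by config identity. NOT HBase code.
class ToyConnectionCache {
    // Hypothetical config type; the copy constructor mirrors new Configuration(conf).
    static final class Conf {
        final Properties props = new Properties();

        Conf() {}

        Conf(Conf other) {
            props.putAll(other.props);
        }
    }

    // Assumption: one cached "connection" per distinct config instance.
    private final Map<Conf, Object> cache = new IdentityHashMap<>();

    Object getConnection(Conf conf) {
        return cache.computeIfAbsent(conf, c -> new Object());
    }

    int size() {
        return cache.size();
    }

    // Models the setConf line quoted above: every call looks up a fresh copy,
    // so every call adds a cache entry that nobody can find again.
    void setConfWithCopy(Conf jobConf) {
        getConnection(new Conf(jobConf));
    }

    // Models the subclass idea: reuse the caller's config, so the entry is shared.
    void setConfWithoutCopy(Conf jobConf) {
        getConnection(jobConf);
    }
}
```

Calling setConfWithCopy 30 times with the same config leaves 30 entries in this model, while setConfWithoutCopy leaves one. The price of the second variant is the one noted above: later changes to the shared config are no longer isolated from the job.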

For me, I didn't have extra tables hanging around, so I just blast them with 
deleteAllConnections. :)

- Ruben

From: Jeff Whiting <jeffw@qualtrics.com>
To: user@hbase.apache.org
Sent: Thu, July 28, 2011 12:10:16 PM
Subject: Re: HBase & MapReduce & Zookeeper

A 10-connection maximum is too low. It has been recommended on the list to go
up to as many as 2000 connections. This doesn't fix your problem, but it is
something you should probably have in your


On 7/28/2011 10:00 AM, Stack wrote:
> Try getting the ZooKeeperWatcher from the connection on your way out
> and explicitly shutdown the zk connection (see TestZooKeeper unit test
> for example).
> St.Ack
> On Thu, Jul 28, 2011 at 6:01 AM, Andre Reiter <a.reiter@web.de> wrote:
>> this issue is still not resolved...
>> unfortunately, calling HConnectionManager.deleteConnection(conf, true);
>> after the MR job is finished does not close the connection to the zookeeper
>> we have 3 zookeeper nodes
>> by default there is a limit of 10 connections allowed from a single client
>> so after running 30 MR jobs scheduled by our application, we have 30
>> unclosed connections; trying to start a new MR job results in a failure, the
>> connection to the zookeeper ensemble is dropped...
>> the workaround of restarting the whole application after 30 MR jobs is not
>> very elegant... :-(

Jeff Whiting
Qualtrics Senior Software Engineer