hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: Heads up, HTablePool will be deprecated in 0.94, 0.95/0.96, and removed in 0.98
Date Mon, 05 Aug 2013 02:56:26 GMT
Let's do a little quiz:

HTable t1 = new HTable(conf);
t1.close();

// 1. Will the next line create a new HConnection behind the scenes (along with re-creating
// all the caches)?
// (If so, it will be expensive, if not, when is the first HConnection actually released?)
HTable t2 = new HTable(conf);

// 2. how about this one?
HTable t2b = new HTable(new Configuration(conf));

// 3. or now?
conf.setInt(HConstants.HBASE_CLIENT_PAUSE, 2000);
HTable t3 = new HTable(conf);

// 4. and now?
conf.setInt(HConstants.HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE_KEY, 1024000);
HTable t4 = new HTable(conf);

// 5. how many connections are opened now?
t4.close();

This stuff is convoluted and needlessly complicated. And this is not because the code is bad,
but because the abstraction is simply inadequate.
A client wants to connect to a cluster and then do some action on that cluster (via HTable
as a convenience).
If the cluster connection is implicit it leads to all of the above considerations.

(#1: yes, #2: no, #3: yes, #4: no, #5: I don't really know, I'd have to run it to see)
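
One way to picture why the answers come out this way is a toy model of key-based connection caching with reference counting. This is only a sketch, not HBase code: the class names, the choice of which settings belong to the cache key, and the "close on last release" rule are all assumptions made for illustration. It reproduces the answers to #1 through #4; the count it reports for #5 holds for this toy model only and says nothing definite about the real client.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of key-cached, reference-counted connections. Not HBase code. */
class ImplicitConnectionSketch {

    // Assumption: only these settings take part in the cache key.
    static final String[] KEY_PROPERTIES = {"hbase.zookeeper.quorum", "hbase.client.pause"};

    /** Stand-in for a cluster connection; it only counts the tables sharing it. */
    static final class Connection {
        int refCount;
    }

    /** Stand-in for the manager that hands out connections behind the scenes. */
    static final class Manager {
        final Map<Map<String, String>, Connection> cache = new HashMap<>();
        int created; // connections built so far

        Table newTable(Map<String, String> conf) {
            // Snapshot the key-relevant settings; every other setting is ignored.
            Map<String, String> key = new TreeMap<>();
            for (String property : KEY_PROPERTIES) {
                key.put(property, String.valueOf(conf.get(property)));
            }
            Connection connection = cache.get(key);
            if (connection == null) {
                connection = new Connection();
                cache.put(key, connection);
                created++;
            }
            connection.refCount++;
            return new Table(this, key, connection);
        }
    }

    /** Stand-in for a table; closing the last user of a connection drops it. */
    static final class Table {
        final Manager manager;
        final Map<String, String> key;
        final Connection connection;

        Table(Manager manager, Map<String, String> key, Connection connection) {
            this.manager = manager;
            this.key = key;
            this.connection = connection;
        }

        void close() {
            if (--connection.refCount == 0) {
                manager.cache.remove(key);
            }
        }
    }

    /** Replays the quiz and reports, per step, whether a new connection was built. */
    static String replayQuiz() {
        Manager manager = new Manager();
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.zookeeper.quorum", "localhost");
        StringBuilder answers = new StringBuilder();

        manager.newTable(conf).close(); // t1 is opened and closed again

        int before = manager.created;
        manager.newTable(conf); // 1: same conf, but the first connection is gone
        answers.append("1:").append(manager.created > before ? "yes" : "no");

        before = manager.created;
        manager.newTable(new HashMap<>(conf)); // 2: a copy with equal settings
        answers.append(" 2:").append(manager.created > before ? "yes" : "no");

        conf.put("hbase.client.pause", "2000"); // part of the assumed key
        before = manager.created;
        manager.newTable(conf); // 3
        answers.append(" 3:").append(manager.created > before ? "yes" : "no");

        conf.put("hbase.client.scanner.max.result.size", "1024000"); // not in the key
        before = manager.created;
        Table t4 = manager.newTable(conf); // 4
        answers.append(" 4:").append(manager.created > before ? "yes" : "no");

        t4.close(); // 5: the third table still holds the same connection
        answers.append(" open:").append(manager.cache.size());
        return answers.toString();
    }

    public static void main(String[] args) {
        System.out.println(replayQuiz()); // prints "1:yes 2:no 3:yes 4:no open:2"
    }
}
```

The point of the sketch is that the caller never sees the key: whether two tables share a connection depends on which settings happen to be part of it and on who closed what beforehand.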

-- Lars





________________________________
 From: Ted Yu <yuzhihong@gmail.com>
To: lars hofhansl <larsh@apache.org> 
Cc: "dev@hbase.apache.org" <dev@hbase.apache.org> 
Sent: Sunday, August 4, 2013 7:39 PM
Subject: Re: Heads up, HTablePool will be deprecated in 0.94, 0.95/0.96, and removed in 0.98
 


In the Connections "managing" HTables case, don't we need to figure out when an HConnection
should be released?


On Sun, Aug 4, 2013 at 7:23 PM, lars hofhansl <larsh@apache.org> wrote:

>Just look at the HConnectionKey part, and the hoops we go through to detect whether
>HConnections are the same or not, when to cache them, and when/how to release them.
>In fact almost all HConnectionManager does is manage HConnections on behalf of HTable,
>when it should be the other way around.
>
>Typically, when things get hard to explain (check out the comments in HConnectionManager)
>there is either an abstraction missing, or the abstraction is not right.
>The reverse (Connections "managing" HTables) has none of this.
>
>
>-- Lars
>
>
>_______________________________
>From: Ted Yu <yuzhihong@gmail.com>
>To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
>Sent: Sunday, August 4, 2013 4:27 PM
>
>Subject: Re: Heads up, HTablePool will be deprecated in 0.94, 0.95/0.96, and removed in 0.98
>
>
>
>bq. no funny business with unique Configurations
>
>Mind telling us what is funny about this part?
>
>
>On Sat, Aug 3, 2013 at 10:41 PM, lars hofhansl <larsh@apache.org> wrote:
>
>>Correct. The HConnection is naturally shared between the HTables.
>>There is no longer any need to worry about this (no funny business with unique Configurations;
>>in fact most of the code in HConnectionManager can be removed in trunk).
>>
>>It is also correct that the code now has to hold on to the created HConnection, rather
>>than asking HConnectionManager for it.
>>
>>-- Lars
>>
>>
>>
>>________________________________
>> From: Nick Dimiduk <ndimiduk@gmail.com>
>>To: dev@hbase.apache.org
>>Sent: Saturday, August 3, 2013 8:56 PM
>>
>>Subject: Re: Heads up, HTablePool will be deprecated in 0.94, 0.95/0.96, and removed in 0.98
>>
>>
>>On Sat, Aug 3, 2013 at 8:52 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>>> Does this mean that user code wouldn't be able to depend
>>> on HConnectionManager for connection sharing ?
>>>
>>
>>My read of the above is that the HConnection instance, shared across
>>consumers, is the shared connection. Am I reading that correctly?
>>
>>On Sat, Aug 3, 2013 at 7:20 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>
>>> > Ah, I found the JIRA - HBASE-9117.
>>> >
>>> > Cheers
>>> >
>>> >
>>> > On Fri, Aug 2, 2013 at 10:54 PM, lars hofhansl <larsh@apache.org> wrote:
>>> >
>>> >> Yeah, I filed a separate ticket for the API removal in trunk.
>>> >>
>>> >>
>>> >>
>>> >> ________________________________
>>> >>  From: Ted Yu <yuzhihong@gmail.com>
>>> >> To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
>>> >> Sent: Friday, August 2, 2013 10:31 PM
>>> >> Subject: Re: Heads up, HTablePool will be deprecated in 0.94, 0.95/0.96,
>>> >> and removed in 0.98
>>> >>
>>> >>
>>> >> bq. HConnectionManager.getConnection() will be removed.
>>> >>
>>> >> I don't see the above change in 6580-trunk.txt
>>> >> Would the above be done in next patch or in another JIRA ?
>>> >>
>>> >> Cheers
>>> >>
>>> >> On Fri, Aug 2, 2013 at 9:29 PM, lars hofhansl <larsh@apache.org> wrote:
>>> >>
>>> >> > See https://issues.apache.org/jira/browse/HBASE-6580
>>> >> >
>>> >> > Here's the proposed new API:
>>> >> > * HConnectionManager:
>>> >> >     public static HConnection createConnection(Configuration conf)
>>> >> >     public static HConnection createConnection(Configuration conf, ExecutorService pool)
>>> >> >
>>> >> > * HConnection:
>>> >> >     public HTableInterface getTable(byte[] tableName) throws IOException
>>> >> >     public HTableInterface getTable(byte[] tableName, ExecutorService pool) throws IOException
>>> >> >     public HTableInterface getTable(String tableName) throws IOException
>>> >> >
>>> >> > By default HConnectionImplementation will create an ExecutorService when
>>> >> > needed. The ExecutorService can optionally be passed in.
>>> >> > HTableInterfaces are retrieved from the HConnection. By default the
>>> >> > HConnection's ExecutorService is used, but optionally that can be
>>> >> > overridden for each HTable.
>>> >> >
>>> >> > In 0.98/trunk:
>>> >> >
>>> >> > 1. HTablePool will be removed. It is no longer needed.
>>> >> > 2. All constructors in HTable will be removed or changed to be protected.
>>> >> > All code should use HTableInterface only.
>>> >> > 3. HConnectionManager.getConnection() will be removed.
>>> >> > 4. All HConnection caching (deleteConnection, etc.) will be removed, as
>>> >> > it is no longer needed.
>>> >> >
>>> >> >
>>> >> > The new flow of setting up a client would look like this:
>>> >> >
>>> >> > ----- Snip -----
>>> >> > // connection to the cluster
>>> >> > HConnection conn = HConnectionManager.createConnection(conf);
>>> >> > ...
>>> >> > // When the cluster connection is established get an HTableInterface for
>>> >> > // each operation or thread.
>>> >> > // HConnection.getTable(...) is lightweight. The table is really just a
>>> >> > // convenient place to call table methods and for a temporary batch cache.
>>> >> > // It is in fact less overhead than HTablePool had when retrieving a
>>> >> > // cached HTable.
>>> >> > // As before, the HTableInterface returned is not thread safe.
>>> >> > // It's fine to get 1000's of these.
>>> >> > // Don't cache them longer than the lifetime of the HConnection.
>>> >> > HTableInterface table = conn.getTable("MyTable");
>>> >> > ...
>>> >> > // just flushes outstanding commits, no further cleanup needed, can be
>>> >> > // omitted.
>>> >> > // HConnection holds no references to the returned HTable objects,
>>> >> > // they can be GC'd as soon as they leave scope.
>>> >> > table.close();
>>> >> > ...
>>> >> > conn.close(); // done with the cluster, release resources
>>> >> > ----- Snip -----
>>> >> >
>>> >> > The HConnection will maintain and share its own ThreadPool for all batch
>>> >> > operations executed by the HTables.
>>> >> > This can be overridden per HConnection and/or per individual HTable object.
>>> >> >
>>> >> > I will commit the new API to all branches early next week.
>>> >> >
>>> >> > Questions? Comments? Concerns? Praise?
>>> >> >
>>> >> > -- Lars
>>> >>
>>> >
>>> >
>>>
>
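
The ownership model proposed in this thread (the caller creates and holds the connection, tables are lightweight handles that borrow its thread pool) can be sketched without any HBase dependency. The following is a rough stand-in, not the HBase API: `ClusterConnection` and `TableHandle` are hypothetical names, and the rule that a caller-supplied pool is left running when the connection closes is an assumption of this sketch rather than something the thread states.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy model of explicit connection ownership. Not HBase code. */
class ExplicitConnectionSketch {

    /** Stand-in for the cluster connection: the caller creates, holds, and closes it. */
    static final class ClusterConnection implements AutoCloseable {
        private ExecutorService pool; // created lazily unless supplied by the caller
        private boolean ownsPool;
        private boolean closed;

        ClusterConnection() {
        }

        ClusterConnection(ExecutorService suppliedPool) {
            this.pool = suppliedPool;
        }

        private synchronized ExecutorService pool() {
            if (pool == null) {
                pool = Executors.newCachedThreadPool();
                ownsPool = true;
            }
            return pool;
        }

        /** Returns a lightweight handle that shares the connection's pool. */
        TableHandle getTable(String name) {
            return new TableHandle(name, pool());
        }

        /** Returns a handle that uses its own pool instead of the shared one. */
        TableHandle getTable(String name, ExecutorService override) {
            return new TableHandle(name, override);
        }

        synchronized boolean isClosed() {
            return closed;
        }

        @Override
        public synchronized void close() {
            // Assumption: only a pool this connection created itself is shut down.
            if (ownsPool) {
                pool.shutdown();
            }
            closed = true;
        }
    }

    /** Stand-in for a table handle; the connection keeps no reference to it. */
    static final class TableHandle implements AutoCloseable {
        final String name;
        final ExecutorService pool;

        TableHandle(String name, ExecutorService pool) {
            this.name = name;
            this.pool = pool;
        }

        @Override
        public void close() {
            // A real table would flush buffered writes here; the connection is untouched.
        }
    }

    public static void main(String[] args) {
        try (ClusterConnection conn = new ClusterConnection()) {
            try (TableHandle a = conn.getTable("MyTable");
                 TableHandle b = conn.getTable("MyTable")) {
                System.out.println("tables share one pool: " + (a.pool == b.pool));
            }
            System.out.println("connection open after tables closed: " + !conn.isClosed());
        }
    }
}
```

Because the connection's lifetime is spelled out by the caller, there is nothing to infer from configuration values and no reference counting to reason about.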