hbase-user mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: HBase connection pool
Date Wed, 25 Feb 2015 19:01:29 GMT
Okay, looks like you're using an implicitly managed connection. It should be
fine to share a single config instance across all threads. The advantage of
HTablePool over this approach is that the number of HTables would be
managed independently from the number of Threads. This may or may not be a
concern for you, based on your memory requirements, &c. In your case,
you're not specifying an ExecutorService per HTable, so the HTable
instances will be relatively lightweight. Each table will manage its own
write buffer, which can be shared by multiple threads when autoFlush is
disabled and HTablePool is used. This may or may not be desirable,
depending on your use-case.

For what it's worth, HTablePool is marked deprecated in 1.0 and will likely be
removed in 2.0. To "future proof" this code, I would move to a single
shared HConnection for the whole application, and a thread-local HTable
created from/with that connection.
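
To make that layout concrete, here is a minimal sketch of the threading
pattern only. It deliberately does not call the HBase client: Connection and
TableHandle are hypothetical stand-ins for HConnection and HTable, so the
snippet runs on a plain JDK (Java 8 or later for ThreadLocal.withInitial). In
real code the shared object would come from the HBase client (for example
HConnectionManager.createConnection(conf) in 0.96/0.98) and the per-thread
handle from that connection's getTable(...); check the exact signatures
against the javadoc for your release.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedConnectionSketch {

    // Hypothetical stand-in for HConnection: created once, shared by every thread.
    static final class Connection {
        final AtomicInteger tablesCreated = new AtomicInteger();

        // Hypothetical stand-in for obtaining a table from the connection.
        TableHandle getTable(String name) {
            tablesCreated.incrementAndGet();
            return new TableHandle(this, name);
        }
    }

    // Hypothetical stand-in for HTable: cheap to create, used by one thread only.
    static final class TableHandle {
        final Connection connection;
        final String name;

        TableHandle(Connection connection, String name) {
            this.connection = connection;
            this.name = name;
        }
    }

    // One connection for the whole application.
    static final Connection SHARED = new Connection();

    // One table handle per thread, each created lazily from the shared connection.
    static final ThreadLocal<TableHandle> TABLE =
            ThreadLocal.withInitial(() -> SHARED.getTable("HBaseSerialWritesPOC"));

    public static void main(String[] args) throws InterruptedException {
        TableHandle[] seen = new TableHandle[2];
        Thread a = new Thread(() -> seen[0] = TABLE.get());
        Thread b = new Thread(() -> seen[1] = TABLE.get());
        a.start();
        b.start();
        a.join();
        b.join();

        // Each thread got its own handle, but both point at the same connection.
        System.out.println("distinct handles: " + (seen[0] != seen[1]));
        System.out.println("same connection:  " + (seen[0].connection == seen[1].connection));
    }
}
```

The handles are never shared across threads, which sidesteps the HTable
thread-safety issue, while the expensive part (the connection) exists once.
With a real connection you would also close each table when its thread
finishes and close the connection on application shutdown.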

-n

On Wed, Feb 25, 2015 at 10:53 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemilita@bloomberg.net> wrote:

> Hi Nick,
>
> I am using HBase version 0.96, I sent the link from version 0.94 because I
> haven't found the java API docs for 0.96, sorry about that.
> I have created the HTable directly from the config object, as follows:
>
> this.tlConfig = new ThreadLocal<Configuration>() {
>
>     @Override
>     protected Configuration initialValue() {
>         return HBaseConfiguration.create();
>     }
> };
> this.tlTable = new ThreadLocal<HTable>() {
>     @Override
>     protected HTable initialValue() {
>         try {
>             return new HTable(tlConfig.get(), "HBaseSerialWritesPOC");
>         } catch (IOException e) {
>             throw new RuntimeException(e);
>         }
>     }
> };
>
> I am not sure if the Configuration object should be 1 per thread as well,
> maybe I could share this one?
>
> So, just to clarify, would I get any advantage using HTablePool object
> instead of ThreadLocal<HTable> as I did?
>
> -Marcelo
>
> From: ndimiduk@gmail.com
> Subject: Re: HBase connection pool
>
> Hi Marcelo,
>
> First thing, to be clear, you're working with a 0.94 release? The reason I
> ask is we've been doing some work in this area to improve things, so
> semantics may be slightly different between 0.94, 0.98, and 1.0.
>
> How are you managing the HConnection object (or are you)? How are you
> creating your HTable instances? These will determine how the connection is
> obtained and used in relation to HTables.
>
> In general, multiple HTable instances connected to tables in the same
> cluster should be sharing the same HConnection instance. This is handled
> explicitly when you manage your own HConnection and HTables (i.e.,
> HConnection conn = ... ; HTable t = new HTable(TABLE_NAME, conn); ). It's
> handled implicitly when you construct via Configuration objects (HTable t =
> new HTable(conf, TABLE_NAME); ). This implicit option is going away in
> future versions.
>
> HTable is not safe for concurrent access because of how the write path is
> implemented (at least; there may be other portions that I'm not as familiar
> with). You should be perfectly fine to have an HTable per thread in a
> ThreadLocal.
>
> -n
>
> On Wed, Feb 25, 2015 at 9:41 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
> mvallemilita@bloomberg.net> wrote:
>
>> In the HBase API, does 1 HTable object mean 1 connection to each region
>> server (just for 1 table)?
>>
>> The docs say (
>> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html
>> ):
>> "This class is not thread safe for reads nor write."
>>
>> I got confused, as I saw there is a HTablePool class, but it's only for a
>> table as well, can't connections be reused for more than 1 table?
>>
>> In my java application, I used ThreadLocal variables
>> (ThreadLocal<HTable>) to create an HTable variable per thread. If I do
>> several operations on each thread, I should still use the same connection,
>> right?
>>
>> []s
>
>
>
>
