hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: resource usage of ResultScanner's Iterator<Result>
Date Fri, 26 Oct 2012 19:59:53 GMT
On Thu, Oct 25, 2012 at 1:24 AM, Oliver Meyn (GBIF) <omeyn@gbif.org> wrote:
> Hi all,
> I'm on cdh3u3 (hbase 0.90.4) and I need to provide a bunch of row keys based
> on a column value (e.g. give me all keys where column "dataset" = 1234).
> That's straightforward using a scan and filter.  The trick is that I want to
> return an Iterator over my key type (Integer) rather than expose HBase
> internals (i.e. Result), so I need some kind of Decorator that wraps the
> Iterator<Result>.  For every call to next() I'd then call the underlying
> iterator's next() and extract my Integer key from the Result.  That all works
> fine, but what I'm wondering is what resources the Iterator<Result> is
> holding, and how I can release those from my decorator.
> In my current implementation the decorator's constructor looks like:
> public OccurrenceKeyIterator(HTablePool tablePool, String occurrenceTableName, Scan scan)
> and the constructor builds the ResultScanner and subsequent iterator.  In my
> hasNext() method I can check the underlying iterator and if it says false I
> can shutdown my scanner and return the table to the TablePool. But what if the
> end-user never reaches the end of the Iterator, or just dereferences it? Am I
> at risk of leaking tables, connections or anything else?  Any tips on what I
> should do?

If the close is not called, this is what will be missed on the HTable instance:

    if (cleanupPoolOnClose) {
      ...
    }
    if (cleanupConnectionOnClose) {
      if (this.connection != null) {
        ...
      }
    }
    this.closed = true;

In your case, the flushing of commits is of no import.

The pool above is an executor service inside of HTable used for doing
batch calls.  Again, you don't really use it, but it should probably get
cleaned up.

The connection close is good because though all HTables share a
Connection, the above close updates reference counters so we know when
we can let go of the connection.
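
One way to make sure the close always happens is to have the decorator
itself be closeable, so the end-user can release it early instead of
having to run the iterator to the end. Below is a minimal sketch of that
idea, assuming Java 8 or later. It is deliberately HBase-free so it runs
on its own: ClosingKeyIterator, its type parameters and the "resource"
argument are all hypothetical names. In the setup described above, the
delegate would be the ResultScanner's iterator, and the resource's
close() would close the scanner and hand the table back to the pool.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

/**
 * Hypothetical decorator: exposes only extracted keys and releases the
 * underlying resource exactly once, on exhaustion or on an explicit close().
 */
final class ClosingKeyIterator<R, K> implements Iterator<K>, Closeable {
    private final Iterator<R> delegate;          // e.g. the scanner's iterator
    private final Function<R, K> keyExtractor;   // e.g. Result -> Integer key
    private final Closeable resource;            // assumption: closes scanner, returns table
    private boolean closed = false;

    ClosingKeyIterator(Iterator<R> delegate, Function<R, K> keyExtractor, Closeable resource) {
        this.delegate = delegate;
        this.keyExtractor = keyExtractor;
        this.resource = resource;
    }

    @Override
    public boolean hasNext() {
        if (closed) {
            return false;
        }
        if (delegate.hasNext()) {
            return true;
        }
        // Exhausted: release the resource without waiting for the caller.
        try {
            close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return false;
    }

    @Override
    public K next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return keyExtractor.apply(delegate.next());
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // idempotent: a second close is a no-op
        }
        closed = true;
        resource.close();
    }
}
```

Because it implements Closeable, callers can use it in a
try-with-resources block and stop iterating whenever they like. It does
nothing for a caller who simply drops the reference without closing.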

Keep a list of what you've given out and, if one goes unused for N
minutes, close it yourself in the background?
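
A rough sketch of that bookkeeping, again HBase-free and with
hypothetical names (IdleResourceReaper, touch, forget, reapIdle): each
handed-out resource is recorded with the time it was last used, and a
periodic sweep closes whatever has sat idle longer than the limit. The
clock is passed in as an argument only to keep the sketch easy to
exercise.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry that closes handed-out resources left idle too long. */
final class IdleResourceReaper {
    private final Map<Closeable, Long> lastUsedMillis = new ConcurrentHashMap<>();
    private final long maxIdleMillis;

    IdleResourceReaper(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Record that a resource was handed out or used at the given time. */
    void touch(Closeable resource, long nowMillis) {
        lastUsedMillis.put(resource, nowMillis);
    }

    /** Stop tracking a resource that its owner already closed normally. */
    void forget(Closeable resource) {
        lastUsedMillis.remove(resource);
    }

    /** Close and drop every resource idle for longer than the limit; returns the count. */
    int reapIdle(long nowMillis) {
        int reaped = 0;
        Iterator<Map.Entry<Closeable, Long>> it = lastUsedMillis.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Closeable, Long> entry = it.next();
            if (nowMillis - entry.getValue() > maxIdleMillis) {
                it.remove();
                try {
                    entry.getKey().close();
                } catch (IOException ignored) {
                    // Best effort: the resource is abandoned either way.
                }
                reaped++;
            }
        }
        return reaped;
    }
}
```

In use, the iterator would call touch() from next() and forget() from
its own close(), and a single background thread (for example a
ScheduledExecutorService with scheduleAtFixedRate) would call
reapIdle(System.currentTimeMillis()) every minute or so. Since the sweep
runs on another thread, the resource's close() needs to be safe to call
while the owner may still be holding it.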

Good on you Oliver (when you fellas going to upgrade?)

