hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3750) HTablePool.putTable() should call tableFactory.releaseHTableInterface() for discarded table
Date Sun, 10 Apr 2011 04:12:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018021#comment-13018021
] 

jiraposter@reviews.apache.org commented on HBASE-3750:
------------------------------------------------------



bq.  On 2011-04-09 22:05:45, Gary Helmling wrote:
bq.  > What is the motivation for calling flushCommits() automatically here?  Are we trying
to save client developers from writing buggy code?
bq.  > 
bq.  > The downside I see is that it's easy to envision a case where I use HTablePool for
batch loading, but don't want to actually flush after every cycle of: get table, perform operation,
return table.  This change would prevent me from doing that and force me either write my own
pool or somehow work around it.
bq.  > 
bq.  > In this case it both fails the obviousness test for me and limits what I can easily
do as a developer.  What is the upside?  Is it sufficient to balance out the limitations?
bq.  
bq.  Ted Yu wrote:
bq.      This addition is certainly defensive.
bq.      Consider what could happen before this patch, putTable() didn't guarantee that the
table instance would be put back into the queue (because of size limit of the queue). The
user would risk losing data.
bq.      Since putTable(tableA) followed by getTable() call doesn't guarantee that tableA
would be returned, I wonder how the user planned to finally flush all the buffered data to
the underlying table. Consider, that getTable() would always return an HTableInterface instance,
he/she couldn't just call getTable() repeatedly and flush the instance's (buffered) data.
bq.  
bq.  Gary Helmling wrote:
bq.      I think the only necessary change here is your addition of the call to HTableFactory.releaseHTableInterface()
when the table instance is being discarded.  This calls HTable.close(), which of course calls
flushCommits(), so there is no risk of data loss.
bq.      
bq.      flushCommits() is also called on each pooled table instance in HTablePool.closeTablePool(),
again by way of HTable.close().
bq.      
bq.      Both of these seem appropriate.
bq.      
bq.      The only part in question, I think, is the additional call to flushCommits() on line
123.  In the case of a discarded table, this is redundant.  In the case of a table re-added
to the pool, this limits flexibility and blocks some legitimate usage (putting back a table
with a partially filled buffer).  Of course this is debatable.  There is no clear definition
of which should take priority, telling HTable to not auto flush or putting a table back in
the pool.  But I would rather err on the side of flexibility and adaptability to different
needs when the client already has the tools to flush when needed and we're already providing
flushing when the table is discarded or the pool is closed.
bq.  
bq.  Ted Yu wrote:
bq.      I modified title of the JIRA to narrow the scope.
bq.      I am Okay with removing flushCommits() call since nobody complains about loss of
data so far.

Ted, with the addition of the call to releaseHTableInterface() there is no data loss issue
here.  At least none beyond what is already possible by misusing HTable by calling setAutoFlush(false)
and then never calling flushCommits() or close().


bq.  On 2011-04-09 22:05:45, Gary Helmling wrote:
bq.  > /src/main/java/org/apache/hadoop/hbase/client/HTablePool.java, line 126
bq.  > <https://reviews.apache.org/r/573/diff/1/?file=15585#file15585line126>
bq.  >
bq.  >     This seems dangerous and unexpected from a client code point of view.  I wouldn't
expect returning the table to a pool to throw a RuntimeException that could potentially cause
my client application to exit.
bq.  
bq.  Ted Yu wrote:
bq.      Disclaimer: I didn't invent this piece of code. I got it from HTableFactory.releaseHTableInterface()
bq.      If we think that Lars' suggestion is good, we should accept this piece of code.
bq.  
bq.  Gary Helmling wrote:
bq.      I think the issue is more that this is now being called in putTable() which doesn't
declare itself to throw any exceptions.
bq.      
bq.      The solution seems pretty simple.  Make putTable() and releaseHTableInterface() throw
IOException and throw it directly instead of wrapping in RuntimeException.
bq.  
bq.  Ted Yu wrote:
bq.      I agree with this suggestion, assuming this fix goes to both trunk and 0.90.3
bq.      There're 3 methods which now throw IOException: closeTablePool(), putTable() and
releaseHTableInterface()
bq.      
bq.      Once this proposal gets more votes, I will upload a new patch.

Hmm, I didn't realize this was targeted at 0.90 as well.  Changing the API in 0.90 will not
be acceptable.  Let's leave the IOException handling in HTablePool.releaseHTableInterface()
as is for this patch and open a new JIRA to change the API for trunk to throw IOException.

Yes, for the new JIRA, all three of those methods would need to be changed to throw IOException.


- Gary


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/573/#review409
-----------------------------------------------------------


On 2011-04-09 19:48:31, Ted Yu wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/573/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-09 19:48:31)
bq.  
bq.  
bq.  Review request for hbase and Lars George.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Currently HTablePool.putTable() doesn't call table.flushCommits()
bq.  If AutoFlush is disabled for table instance, we should call table.flushCommits().
bq.  
bq.  When HTable instance is discarded in putTable(), we should call tableFactory.releaseHTableInterface().
bq.  
bq.  
bq.  This addresses bug HBASE-3750.
bq.      https://issues.apache.org/jira/browse/HBASE-3750
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    /src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 1090500 
bq.  
bq.  Diff: https://reviews.apache.org/r/573/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  TestHTablePool passes.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Ted
bq.  
bq.



> HTablePool.putTable() should call tableFactory.releaseHTableInterface() for discarded
table
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3750
>                 URL: https://issues.apache.org/jira/browse/HBASE-3750
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.90.1
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 3750.txt
>
>
> Currently HTablePool.putTable() doesn't call table.flushCommits()
> When HTable instance is discarded in putTable(), we should call tableFactory.releaseHTableInterface().

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message