hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17361) Make HTable thread safe
Date Tue, 17 Jan 2017 10:03:26 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825762#comment-15825762
] 

Yu Li commented on HBASE-17361:
-------------------------------

Thanks for the reminder [~Apache9]

For HTable, I think we still have some interface compatibility issues, such as:
1. The {{setWriteBufferSize}} interface, which has existed from the very beginning
2. The {{setOperationTimeout}} interface, which has existed for more than 6 years (since HBASE-2937)
3. The {{setRpcTimeout}} interface, which was introduced by HBASE-15645 and went into all 1.0+
branches (1.0.4, 1.1.5, 1.2.2, 1.3.0, 1.4.0)
4. The {{setReadRpcTimeout}}/{{setWriteRpcTimeout}} interfaces, which were introduced by HBASE-15866
and also went into branch-1 (1.4.0)

And to make HTable thread safe, we need to move all the above methods into a table
builder similar to {{AsyncTableBuilder}}, right? I guess the change should go only into 2.0, and
we need to add the incompatible flag in the release note?
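
To make the builder idea concrete, here is a minimal sketch of what moving the setters off the table and onto a builder could look like. All class and method names below ({{ImmutableTableSketch}}, its {{Builder}}, the getters) and the default values are hypothetical placeholders for illustration only; this is not the actual {{AsyncTableBuilder}} or HTable API.

```java
// Hypothetical sketch, NOT the HBase client API: the settings that HTable
// exposes through setters today are fixed at build time instead, so the
// resulting object has no mutable configuration to race on.
final class ImmutableTableSketch {
    private final String tableName;
    private final long writeBufferSize;
    private final int operationTimeoutMs;
    private final int readRpcTimeoutMs;
    private final int writeRpcTimeoutMs;

    private ImmutableTableSketch(Builder b) {
        this.tableName = b.tableName;
        this.writeBufferSize = b.writeBufferSize;
        this.operationTimeoutMs = b.operationTimeoutMs;
        this.readRpcTimeoutMs = b.readRpcTimeoutMs;
        this.writeRpcTimeoutMs = b.writeRpcTimeoutMs;
    }

    static Builder newBuilder(String tableName) {
        return new Builder(tableName);
    }

    String getTableName() { return tableName; }
    long getWriteBufferSize() { return writeBufferSize; }
    int getOperationTimeoutMs() { return operationTimeoutMs; }
    int getReadRpcTimeoutMs() { return readRpcTimeoutMs; }
    int getWriteRpcTimeoutMs() { return writeRpcTimeoutMs; }

    // The builder is the only mutable object; it is meant to be used by a
    // single thread and discarded after build().
    static final class Builder {
        private final String tableName;
        // Placeholder defaults chosen for this sketch only.
        private long writeBufferSize = 1024L * 1024L;
        private int operationTimeoutMs = 60_000;
        private int readRpcTimeoutMs = 10_000;
        private int writeRpcTimeoutMs = 10_000;

        private Builder(String tableName) {
            this.tableName = tableName;
        }

        Builder setWriteBufferSize(long bytes) { this.writeBufferSize = bytes; return this; }
        Builder setOperationTimeout(int millis) { this.operationTimeoutMs = millis; return this; }
        Builder setReadRpcTimeout(int millis) { this.readRpcTimeoutMs = millis; return this; }
        Builder setWriteRpcTimeout(int millis) { this.writeRpcTimeoutMs = millis; return this; }

        ImmutableTableSketch build() {
            return new ImmutableTableSketch(this);
        }
    }
}
```

With this shape, a caller would write something like {{ImmutableTableSketch.newBuilder("t1").setOperationTimeout(5000).build()}}, and the per-table setters listed above would no longer exist on the table itself, which is why the change would be incompatible.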

Please let me know your thoughts, and I'll prepare the patch as soon as we get a consensus.
Thanks.

> Make HTable thread safe
> -----------------------
>
>                 Key: HBASE-17361
>                 URL: https://issues.apache.org/jira/browse/HBASE-17361
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Yu Li
>            Assignee: Yu Li
>            Priority: Critical
>         Attachments: HBASE-17361.patch, HBASE-17361.patch
>
>
> Currently HTable is marked as NOT thread safe, and this JIRA aims to improve this so that it makes better use of the thread-safe BufferedMutator.
> Some findings/work done:
> If we try to put to the same HTable instance in parallel, there will be a problem, since {{HTable#getBufferedMutator}} currently looks like this:
> {code}
>   BufferedMutator getBufferedMutator() throws IOException {
>     if (mutator == null) {
>       this.mutator = (BufferedMutatorImpl) connection.getBufferedMutator(
>           new BufferedMutatorParams(tableName)
>               .pool(pool)
>               .writeBufferSize(connConfiguration.getWriteBufferSize())
>               .maxKeyValueSize(connConfiguration.getMaxKeyValueSize())
>       );
>     }
>     mutator.setRpcTimeout(writeRpcTimeout);
>     mutator.setOperationTimeout(operationTimeout);
>     return mutator;
>   }
> {code}
> And {{HTable#flushCommits}}:
> {code}
>   void flushCommits() throws IOException {
>     if (mutator == null) {
>       // nothing to flush if there's no mutator; don't bother creating one.
>       return;
>     }
>     getBufferedMutator().flush();
>   }
> {code}
> And {{HTable#put}}:
> {code}
>   public void put(final Put put) throws IOException {
>     getBufferedMutator().mutate(put);
>     flushCommits();
>   }
> {code}
> If we launch multiple threads to put in parallel, the sequence below might happen because {{HTable#getBufferedMutator}} is not thread safe:
> {noformat}
> 1. ThreadA runs to getBufferedMutator and finds mutator==null
> 2. ThreadB runs to getBufferedMutator and finds mutator==null
> 3. ThreadA initializes mutator to instanceA, then calls mutator#mutate,
> adding one put (putA) into writeAsyncBuffer
> 4. ThreadB initializes mutator to instanceB
> 5. ThreadA runs to flushCommits; mutator is now instanceB, so it calls
> instanceB's flush method and putA is lost
> {noformat}
> After fixing this, we find considerable contention on {{BufferedMutatorImpl#flush}}, so more effort is required to make HTable thread safe while keeping good performance.
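
The lost-put sequence above can be avoided in a few ways; the following is one possible sketch, not the attached patch. It uses a stand-in {{FakeMutator}} class (hypothetical, not {{BufferedMutatorImpl}}) and assumes two changes: the lazy initialization is guarded with double-checked locking on a volatile field, and {{put}} keeps a local reference so that {{mutate}} and {{flush}} always hit the same instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an HTable-like class; names are illustrative only.
final class LazyMutatorSketch {
    // Stand-in for a buffered mutator: buffers puts and counts them on flush.
    static final class FakeMutator {
        private final List<String> buffer = new ArrayList<>();
        private final AtomicInteger flushed;

        FakeMutator(AtomicInteger flushed) { this.flushed = flushed; }

        synchronized void mutate(String put) { buffer.add(put); }

        synchronized void flush() {
            flushed.addAndGet(buffer.size());
            buffer.clear();
        }
    }

    private final AtomicInteger flushedCount = new AtomicInteger();
    private final AtomicInteger mutatorsCreated = new AtomicInteger();
    private volatile FakeMutator mutator;

    // Double-checked locking: only one thread ever creates the mutator,
    // and every other thread sees that same instance.
    FakeMutator getMutator() {
        FakeMutator m = mutator;
        if (m == null) {
            synchronized (this) {
                m = mutator;
                if (m == null) {
                    m = new FakeMutator(flushedCount);
                    mutatorsCreated.incrementAndGet();
                    mutator = m;
                }
            }
        }
        return m;
    }

    // mutate and flush go through the same local reference, so a put can
    // never be buffered in one instance and flushed from another.
    void put(String put) {
        FakeMutator m = getMutator();
        m.mutate(put);
        m.flush();
    }

    int flushedCount() { return flushedCount.get(); }
    int mutatorsCreated() { return mutatorsCreated.get(); }

    // Runs `threads` workers that each issue `putsPerThread` puts in parallel.
    static LazyMutatorSketch runDemo(int threads, int putsPerThread) {
        LazyMutatorSketch table = new LazyMutatorSketch();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all workers at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < putsPerThread; j++) {
                    table.put("put-" + id + "-" + j);
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return table;
    }
}
```

Calling {{LazyMutatorSketch.runDemo(8, 1000)}} should leave exactly one mutator created and every put accounted for. Note that every {{flush}} in this sketch serializes on the single shared mutator's lock, which mirrors the contention point mentioned above and is why correctness alone does not settle the performance question.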



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
