hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17018) Spooling BufferedMutator
Date Sat, 17 Dec 2016 10:20:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756773#comment-15756773 ]

Ted Yu commented on HBASE-17018:

For SpoolingBufferedMutatorImpl:
    this.maxKeyValueSize =
        params.getMaxKeyValueSize() != BufferedMutatorParams.UNSET
            ? params.getMaxKeyValueSize() : Integer.MAX_VALUE;
We already have the hbase.client.keyvalue.maxsize config key.
The value read from the configuration should be used as the fallback in place of Integer.MAX_VALUE.
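A minimal sketch of the suggested resolution order, with a plain Map standing in for the Hadoop Configuration object (in the real code this would be a conf.getInt("hbase.client.keyvalue.maxsize", ...) lookup); the helper name resolveMaxKeyValueSize is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class MaxKeyValueSizeSketch {
    static final int UNSET = -1; // stand-in for BufferedMutatorParams.UNSET

    // Resolve maxKeyValueSize: explicit param first, then the config key,
    // and only then Integer.MAX_VALUE as a last resort.
    static int resolveMaxKeyValueSize(int paramValue, Map<String, Integer> conf) {
        if (paramValue != UNSET) {
            return paramValue;
        }
        return conf.getOrDefault("hbase.client.keyvalue.maxsize", Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        Map<String, Integer> conf = new HashMap<>();
        conf.put("hbase.client.keyvalue.maxsize", 10 * 1024 * 1024);
        System.out.println(resolveMaxKeyValueSize(UNSET, conf)); // config wins when param unset
        System.out.println(resolveMaxKeyValueSize(1024, conf));  // explicit param wins
    }
}
```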
    if ((currentBufferSize.get() > writeBufferSize)
        && (previousBufferSize < writeBufferSize)) {
Should the comparison on the first line above be >= instead of > :
    if ((currentBufferSize.get() >= writeBufferSize)
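The difference only matters when the buffer lands exactly on writeBufferSize; a self-contained sketch of the boundary case (shouldFlush and the inclusive flag are hypothetical names, not the patch's code):

```java
import java.util.concurrent.atomic.AtomicLong;

public class BufferThresholdSketch {
    // Hypothetical trigger check mirroring the patch's shape: fire once
    // when the buffer crosses the writeBufferSize threshold.
    static boolean shouldFlush(AtomicLong currentBufferSize, long previousBufferSize,
                               long writeBufferSize, boolean inclusive) {
        long current = currentBufferSize.get();
        return (inclusive ? current >= writeBufferSize : current > writeBufferSize)
            && previousBufferSize < writeBufferSize;
    }

    public static void main(String[] args) {
        AtomicLong current = new AtomicLong(1024); // buffer landed exactly on the limit
        long previous = 512;
        long limit = 1024;
        System.out.println(shouldFlush(current, previous, limit, false)); // '>' misses the boundary
        System.out.println(shouldFlush(current, previous, limit, true));  // '>=' fires
    }
}
```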
For close(), why is the call to flush(true) outside of the try block?
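A sketch of the shape being suggested: with flush(true) inside the try, the executor shutdown in the finally block still runs even when the final flush throws. The flush and shutdownExecutors bodies here are stubs, not the patch's code:

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseSketch implements Closeable {
    boolean executorsShutDown = false;

    // Hypothetical final flush; in the real patch this may throw on spool errors.
    void flush(boolean synchronous) throws IOException { }

    // Stub for shutting down processES / any other executors.
    void shutdownExecutors() { executorsShutDown = true; }

    @Override
    public void close() throws IOException {
        try {
            flush(true);        // a failure here no longer skips the cleanup
        } finally {
            shutdownExecutors();
        }
    }

    public static void main(String[] args) throws IOException {
        CloseSketch c = new CloseSketch();
        c.close();
        System.out.println(c.executorsShutDown); // true
    }
}
```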
      if (!processES.awaitTermination(operationTimeout, TimeUnit.MILLISECONDS)) {
Should we record the current timestamp at the beginning of the method and pass only the
remaining portion of operationTimeout to each awaitTermination() call?
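A sketch of the deadline approach with plain java.util.concurrent, so that several executors together stay within one operationTimeout budget instead of each getting the full value (shutdownWithin is a hypothetical name):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RemainingTimeoutSketch {
    // Record a deadline up front and give each awaitTermination only the
    // time remaining against the shared operationTimeout budget.
    static boolean shutdownWithin(long operationTimeoutMs, ExecutorService... pools)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(operationTimeoutMs);
        boolean allDone = true;
        for (ExecutorService pool : pools) {
            pool.shutdown();
            long remaining = deadline - System.nanoTime();
            allDone &= pool.awaitTermination(Math.max(remaining, 0), TimeUnit.NANOSECONDS);
        }
        return allDone;
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService a = Executors.newSingleThreadExecutor();
        ExecutorService b = Executors.newSingleThreadExecutor();
        System.out.println(shutdownWithin(1000, a, b)); // true: idle pools stop in time
    }
}
```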
    } catch (ExecutionException ee) {
      LOG.error("ExecutionException while waiting for shutdown.", ee);
Should an IOException be thrown to the caller in case of ExecutionException?
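One way to surface the failure rather than only logging it, sketched with a plain Future; getChecked is a hypothetical helper, not part of the patch:

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class WrapExecutionExceptionSketch {
    // Unwrap the task failure and rethrow it as an IOException so that
    // close()/flush() callers can react instead of silently continuing.
    static <T> T getChecked(Future<T> future) throws IOException, InterruptedException {
        try {
            return future.get();
        } catch (ExecutionException ee) {
            throw new IOException("Task failed while waiting for shutdown", ee.getCause());
        }
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new RuntimeException("spool write failed"));
        try {
            getChecked(failed);
        } catch (IOException e) {
            System.out.println(e.getCause().getMessage()); // prints "spool write failed"
        }
    }
}
```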
    } catch (ExecutionException e) {
      // TODO Auto-generated catch block
Convert the auto-generated catch block above to a LOG.error() call.

Both setRpcTimeout() and setOperationTimeout() call wrapped.setOperationTimeout(timeout).
Do we need both methods, or should setRpcTimeout() forward to wrapped.setRpcTimeout(timeout)?
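If both setters are kept, each should forward to its counterpart on the wrapped mutator; forwarding both to setOperationTimeout silently drops the RPC timeout. A self-contained sketch with a hypothetical TimeoutAware interface standing in for the wrapped mutator:

```java
public class TimeoutDelegationSketch {
    // Hypothetical minimal shape of the wrapped mutator's timeout setters.
    interface TimeoutAware {
        void setRpcTimeout(int ms);
        void setOperationTimeout(int ms);
    }

    // Each setter forwards to its counterpart on the wrapped instance.
    static class ForwardingMutator implements TimeoutAware {
        final TimeoutAware wrapped;
        ForwardingMutator(TimeoutAware wrapped) { this.wrapped = wrapped; }
        @Override public void setRpcTimeout(int ms) { wrapped.setRpcTimeout(ms); }
        @Override public void setOperationTimeout(int ms) { wrapped.setOperationTimeout(ms); }
    }

    public static void main(String[] args) {
        int[] seen = new int[2]; // [rpcTimeout, operationTimeout]
        ForwardingMutator m = new ForwardingMutator(new TimeoutAware() {
            @Override public void setRpcTimeout(int ms) { seen[0] = ms; }
            @Override public void setOperationTimeout(int ms) { seen[1] = ms; }
        });
        m.setRpcTimeout(100);
        m.setOperationTimeout(200);
        System.out.println(seen[0] + " " + seen[1]); // 100 200
    }
}
```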

Please put patch on review board - it is getting bigger.


> Spooling BufferedMutator
> ------------------------
>                 Key: HBASE-17018
>                 URL: https://issues.apache.org/jira/browse/HBASE-17018
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Joep Rottinghuis
>         Attachments: HBASE-17018.master.001.patch, HBASE-17018.master.002.patch, HBASE-17018.master.003.patch,
> HBASE-17018SpoolingBufferedMutatorDesign-v1.pdf, YARN-4061 HBase requirements for fault tolerant
> For Yarn Timeline Service v2 we use HBase as a backing store.
> A big concern we would like to address is what to do if HBase is (temporarily) down,
> for example in case of an HBase upgrade.
> Most of the high-volume writes will be on a best-effort basis, but occasionally
> we do a flush. Mainly during application lifecycle events, clients will call a flush on the
> timeline service API. In order to handle the volume of writes we use a BufferedMutator. When
> flush gets called on our API, we in turn call flush on the BufferedMutator.
> We would like our interface to HBase to be able to spool the mutations to a filesystem
> in case of HBase errors. If we use the Hadoop filesystem interface, this can then be HDFS,
> gcs, s3, or any other distributed storage. The mutations can then later be re-played, for
> example through a MapReduce job.

This message was sent by Atlassian JIRA
