hbase-issues mailing list archives

From "Niels Basjes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19486) Automatically flush BufferedMutator after a timeout
Date Wed, 20 Dec 2017 21:39:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16299135#comment-16299135 ]

Niels Basjes commented on HBASE-19486:

[~stack] OK, I understand that the name {{autoflush}} should not be used.
The current name of the setting is {{writeBufferMaxLingerMs}}.
The only thing I currently named {{autoflush}} is the timer. I realize that this is inconsistent
with the rest of this change (renaming that to {{writeBufferMaxLingerTimer}} is easy).

But before I do this: what naming would you prefer for this feature?
Something like {{write buffer periodic flush interval}} perhaps?
Or is just getting rid of {{autoflush}} what you want?

> Automatically flush BufferedMutator after a timeout 
> ----------------------------------------------------
>                 Key: HBASE-19486
>                 URL: https://issues.apache.org/jira/browse/HBASE-19486
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>         Attachments: HBASE-19486-20171212-2117.patch, HBASE-19486-20171218-1229.patch,
> HBASE-19486-20171218-1300.patch, HBASE-19486-20171219-0933.patch, HBASE-19486-20171219-1026.patch,
> HBASE-19486-20171219-1122-trigger-qa-run.patch, HBASE-19486-20171220-1612-trigger-qa-run.patch,
> I'm working on several projects where we are doing stream / event type processing instead
> of batch type processing. We mostly use Apache Flink and Apache Beam for these projects.
> When we ingest a continuous stream of events and feed that into HBase via a BufferedMutator
> this all works fine. The buffer fills up at a predictable rate and we can make sure it flushes
> several times per second into HBase by tuning the buffer size.
> We also have situations where the event rate is unpredictable. Sometimes because the
> source is in reality a batch job that puts records into Kafka, sometimes because it is the
> "predictable in production" application in our testing environment (where only the dev triggers
> a handful of events).
> For these kinds of use cases we need a way to 'force' the BufferedMutator to automatically
> flush any records in the buffer even if the buffer is not full.
> I'll put up a pull request with a proposed implementation for review against the master
> (i.e. 3.0.0).
> When approved I would like to backport this to the 1.x and 2.x versions of the client
> in the same (as close as possible) way.
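
For readers who want to see the idea in the description above in code, here is a minimal sketch of a write buffer that flushes either when it is full or when its oldest record has lingered too long. It uses only the Java standard library and is not the code from the attached patches or the HBase client API. The class name {{LingerBuffer}} and the parameters {{maxSize}}, {{maxLingerMs}}, {{tickMs}} and {{sink}} are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical stand-in for a buffered writer; NOT the HBase BufferedMutator. */
class LingerBuffer<T> implements AutoCloseable {
    private final int maxSize;               // flush when this many records are buffered
    private final long maxLingerMs;          // flush when the oldest record is this old
    private final Consumer<List<T>> sink;    // receives each flushed batch
    private final List<T> buffer = new ArrayList<>();
    private final ScheduledExecutorService timer;
    private long oldestEntryMs;              // arrival time of the first buffered record

    LingerBuffer(int maxSize, long maxLingerMs, long tickMs, Consumer<List<T>> sink) {
        this.maxSize = maxSize;
        this.maxLingerMs = maxLingerMs;
        this.sink = sink;
        // Daemon thread so an unclosed buffer does not keep the JVM alive.
        this.timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "linger-flush-timer");
            t.setDaemon(true);
            return t;
        });
        // Note: if the sink throws, the executor cancels later ticks.
        timer.scheduleAtFixedRate(this::flushIfLingering, tickMs, tickMs, TimeUnit.MILLISECONDS);
    }

    private static long nowMs() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
    }

    public synchronized void add(T record) {
        if (buffer.isEmpty()) {
            oldestEntryMs = nowMs();         // start the linger clock
        }
        buffer.add(record);
        if (buffer.size() >= maxSize) {
            flush();                         // size-triggered flush
        }
    }

    // Called by the timer: flush only if the oldest record has waited long enough.
    private synchronized void flushIfLingering() {
        if (!buffer.isEmpty() && nowMs() - oldestEntryMs >= maxLingerMs) {
            flush();                         // time-triggered flush
        }
    }

    public synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = new ArrayList<>(buffer);
        buffer.clear();
        sink.accept(batch);
    }

    @Override
    public synchronized void close() {
        timer.shutdownNow();
        flush();                             // do not lose records still in the buffer
    }

    public static void main(String[] args) throws InterruptedException {
        try (LingerBuffer<String> buf =
                 new LingerBuffer<>(3, 200, 50, batch -> System.out.println("flushed " + batch))) {
            buf.add("a");
            buf.add("b");
            buf.add("c");                    // buffer is full: flushed immediately
            buf.add("d");                    // a lone record: flushed by the timer
            Thread.sleep(500);
        }
    }
}
```

In this sketch the timer tick is separate from the linger limit, so a record can wait up to roughly {{maxLingerMs}} plus one tick before it is flushed. Whether the real feature should expose both values or only one is a design choice left open here.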

This message was sent by Atlassian JIRA
