hbase-issues mailing list archives

From "Joep Rottinghuis (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-17018) Spooling BufferedMutator
Date Wed, 28 Dec 2016 02:38:58 GMT

     [ https://issues.apache.org/jira/browse/HBASE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joep Rottinghuis updated HBASE-17018:
    Attachment: HBASE-17018.master.005.patch

[~enis] are you suggesting we don't do a double-write, but instead write WALs to HDFS only, and then
have a separate set of "readers" replay the WALs from HDFS to HBase?

In that case we'd be writing tons of little WAL files to the source cluster's HDFS (not just
the one backing HBase) in all cases, not just when HBase is down. As Sangjin pointed
out, that would introduce a delay before the writes are available, or else we'd have to keep
track of high and low watermarks, rotate WALs frequently, or something else. I'm wondering
if we are just shifting the complexity around.
The nice thing with the current approach is that under normal circumstances, the data written
to HBase is available in near-real time (only some writes are buffered, but we're talking about
flushing once a minute).
HBase writes its WALs to its own HDFS, which will be on a separately tuned cluster.

In any case, let me discuss that approach with other devs working on timeline service and
see what they think.

In the meantime I'm attaching a new patch (version 5). It incorporates [~sjlee0]'s suggestion
to ensure that accounting for flushCount and enqueueing are done in one synchronized block, so that
we avoid out-of-order items in the outbound queue. This logic is now moved to the coordinator. I've
also added a simple exception handler to the coordinator and a unit test for it.
I'm not sure how much fancier we need to get with the exception handler.
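To illustrate the ordering concern, here is a minimal sketch (not the actual patch code; the class and method names are hypothetical): if the counter increment and the enqueue were guarded by separate locks, two threads could interleave between them and items would land in the outbound queue out of sequence-number order. Doing both under one synchronized block rules that out.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical coordinator sketch: the flush counter and the outbound queue
// are updated under a single lock so that queue order always matches the
// sequence numbers handed out by flushCount.
class SpoolCoordinator {
    private long flushCount = 0;
    private final Queue<String> outbound = new ArrayDeque<>();

    // Atomically assign a sequence number and enqueue. Splitting this into
    // two synchronized sections would reintroduce the out-of-order race.
    synchronized long enqueue(String mutation) {
        long seq = flushCount++;
        outbound.add(seq + ":" + mutation);
        return seq;
    }

    synchronized String poll() {
        return outbound.poll();
    }
}
```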

> Spooling BufferedMutator
> ------------------------
>                 Key: HBASE-17018
>                 URL: https://issues.apache.org/jira/browse/HBASE-17018
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Joep Rottinghuis
>         Attachments: HBASE-17018.master.001.patch, HBASE-17018.master.002.patch, HBASE-17018.master.003.patch,
HBASE-17018.master.004.patch, HBASE-17018.master.005.patch, HBASE-17018SpoolingBufferedMutatorDesign-v1.pdf,
YARN-4061 HBase requirements for fault tolerant writer.pdf
> For Yarn Timeline Service v2 we use HBase as a backing store.
> A big concern we would like to address is what to do if HBase is (temporarily) down,
for example during an HBase upgrade.
> Most of the high-volume writes will be on a best-effort basis, but occasionally
we do a flush, mainly during application lifecycle events, when clients call flush on the
timeline service API. In order to handle the volume of writes we use a BufferedMutator. When
flush gets called on our API, we in turn call flush on the BufferedMutator.
> We would like our interface to HBase to be able to spool the mutations to a filesystem
in case of HBase errors. If we use the Hadoop filesystem interface, this can then be HDFS,
gcs, s3, or any other distributed storage. The mutations can then later be replayed, for
example through a MapReduce job.
> https://reviews.apache.org/r/54882/
> For design of SpoolingBufferedMutatorImpl see https://docs.google.com/document/d/1GTSk1Hd887gGJduUr8ZJ2m-VKrIXDUv9K3dr4u2YGls/edit?usp=sharing
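The spool-on-failure idea in the description can be sketched roughly as follows. This is an illustrative simplification, not the SpoolingBufferedMutatorImpl from the patch: a plain `java.nio.file.Path` stands in for the Hadoop FileSystem interface, and the `Store` interface stands in for the real BufferedMutator, so the same shape would apply to HDFS, gcs, or s3.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: buffer mutations, try to flush them to the backing
// store, and if the store is down, spool them to a file for later replay
// (e.g. by a MapReduce job, as the issue description suggests).
class SpoolingWriter {
    interface Store { void flush(List<String> mutations) throws IOException; }

    private final List<String> buffer = new ArrayList<>();
    private final Store store;
    private final Path spoolFile;

    SpoolingWriter(Store store, Path spoolFile) {
        this.store = store;
        this.spoolFile = spoolFile;
    }

    void mutate(String mutation) { buffer.add(mutation); }

    // Invoked when the client calls flush on the timeline service API.
    void flush() {
        try {
            store.flush(buffer);             // normal path: write to HBase
        } catch (IOException e) {
            try {                            // HBase down: spool for replay
                Files.write(spoolFile, buffer,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException io) {
                throw new UncheckedIOException(io);
            }
        }
        buffer.clear();
    }
}
```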

This message was sent by Atlassian JIRA
