hbase-issues mailing list archives

From "Anastasia Braginsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18748) Cache pre-warming upon replication
Date Sun, 03 Sep 2017 12:48:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151792#comment-16151792 ]

Anastasia Braginsky commented on HBASE-18748:
---------------------------------------------

As explained in the description, we would like to add a feature to HBase replication.
Failover from the primary cluster to the secondary should have no effect on read latency.
Currently, read latency spikes upon failover because the cache on the secondary is cold.
Simply redirecting reads to the secondary prior to failover (with the user application
duplicating them) resolves this issue. However, making the secondary process all the reads
is a waste of resources. Therefore, the suggestion is to redirect only the "relevant" reads.
In other words, the suggested solution is to selectively replay read requests at the backup,
namely those reads that caused cache-ins at the primary.

We intend to use WAL replication as the transport protocol (hopefully as a black box) and,
of course, to add custom replay callbacks. That is, we would add a new "read" type of WAL
entry, which would be rare because it is written only upon a cache-in. These read WAL entries
would be replicated to the secondary cluster. Of course, the cache blocks on the primary and
the secondary may diverge, but this is a good heuristic.
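
To make the idea concrete, here is a minimal toy sketch in Python. It is not HBase code and
uses no HBase API: ReadHint, BlockCache, Primary and Secondary are hypothetical names, a
plain list stands in for the replicated WAL stream, and a small LRU map stands in for the
block cache. It only illustrates the proposed flow: the primary logs a read entry when a
read causes a cache-in, and the secondary replays those entries to warm its own cache.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadHint:
    """Hypothetical 'read' log entry: names a block that was cached-in."""
    block_id: str


class BlockCache:
    """Tiny LRU cache standing in for a region server's block cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, block_id):
        """Return True on a hit; on a miss, cache the block and return False."""
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return True
        self.blocks[block_id] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        return False


class Primary:
    def __init__(self, capacity):
        self.cache = BlockCache(capacity)
        self.log = []  # assumption: stands in for the replicated WAL stream

    def read(self, block_id):
        hit = self.cache.get(block_id)
        if not hit:
            # Only reads that caused a cache-in produce a log entry.
            self.log.append(ReadHint(block_id))
        return hit


class Secondary:
    def __init__(self, capacity):
        self.cache = BlockCache(capacity)

    def replay(self, entries):
        """Hypothetical replay callback: touch each hinted block to warm the cache."""
        for entry in entries:
            if isinstance(entry, ReadHint):
                self.cache.get(entry.block_id)


primary = Primary(capacity=3)
secondary = Secondary(capacity=3)

for block in ["a", "b", "a", "a", "c", "b", "a"]:
    primary.read(block)

secondary.replay(primary.log)
```

In this run, seven reads on the primary produce only three read entries (for blocks a, b
and c). After replay, the secondary holds the same three blocks as the primary, but in a
different recency order, which is a small instance of the divergence mentioned above.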

What do you think about this suggestion? [~stack] and everybody, we would like to hear from
you! Maybe this is already implemented and we are not aware of it?

> Cache pre-warming upon replication
> ----------------------------------
>
>                 Key: HBASE-18748
>                 URL: https://issues.apache.org/jira/browse/HBASE-18748
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Anastasia Braginsky
>
> HBase's cluster replication is a very important and widely used feature. Let's assume the
primary cluster is replicated to a secondary (backup) cluster using the WAL of the primary
cluster to propagate the changes. Let's also assume the secondary cluster is the failover
target and should become primary when needed.
> We suggest improving the way HBase cluster failover works today. Namely, upon failover,
the backup RS's cache is cold. Warming it up to the right working set takes many minutes.
The suggested solution is to selectively replay read requests at the backup, namely those
reads that caused cache-ins at the primary. We intend to use WAL replication as the transport
protocol (hopefully as a black box) and, of course, to add custom replay callbacks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
