hbase-issues mailing list archives

From "Zach York (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters
Date Fri, 01 Sep 2017 00:00:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149812#comment-16149812 ]

Zach York commented on HBASE-18477:

Thanks [~ajayjadhav] for taking this up while I was out!

[~ashish singhi] I would disagree that this feature would only be usable by cloud users. You
could also have multiple clusters in an on-prem setup for whatever reason (one HDFS cluster,
multiple HBase clusters for the sake of Read-Replica). Yes, the latency of interacting with the
HDFS cluster would likely be higher than if the data were colocated with the region servers,
but that is not just a cloud problem :)

Thanks everyone for pointing out additional future work and feature suggestions!

> Umbrella JIRA for HBase Read Replica clusters
> ---------------------------------------------
>                 Key: HBASE-18477
>                 URL: https://issues.apache.org/jira/browse/HBASE-18477
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Zach York
>            Assignee: Zach York
>         Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase Read-Replica Clusters
Scope doc.pdf, HBase Read-Replica Clusters Scope doc_v2.docx
> Recently, changes (such as HBASE-17437) have made it possible for HBase to run with a root
directory external to the cluster (such as in Amazon S3). This means that the data is stored
outside of the cluster and remains accessible after the cluster has been terminated. One use
case that is often asked about is pointing multiple clusters to one root directory (sharing
the data) to have read resiliency in the case of a cluster failure.
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a read-replica
HBase cluster that is pointed at the same root directory.
> This requires making the Read-Replica cluster Read-Only (no metadata operation or data
> Separating the hbase:meta table for each cluster (otherwise HBase gets confused by
multiple clusters trying to update the meta table with their IP addresses)
> Adding refresh functionality for the meta table to ensure new metadata is picked up on
the read replica cluster.
> Adding refresh functionality for HFiles for a given table to ensure new data is picked
up on the read replica cluster.
> This can be used with any existing cluster that is backed by an external filesystem.
> Please note that this feature is still quite manual (with the potential for automation
> More information on this particular feature can be found here: https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
