hbase-issues mailing list archives

From "Jerry He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13153) Bulk Loaded HFile Replication
Date Mon, 23 Nov 2015 02:08:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021370#comment-15021370 ]

Jerry He commented on HBASE-13153:
----------------------------------

A general comment.  Disclaimer: I have not read the code closely, but I have read the
doc and roughly followed the comments.

On the source cluster:
   The source replication handler --> sends WAL entries, including bulkload entries, to
the peer --> blocks synchronously waiting for the response.
On the peer cluster:
   The peer region server RPC handler --> sees a bulkload WAL entry --> invokes the bulkload
client RPC to another region server --> blocks synchronously.
   The other region server's RPC handler --> holds the region write lock --> synchronously
transfers the files to be bulk loaded into the region from the remote cluster.
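
To make the chain above concrete, here is a toy model (not HBase code) of how long each
handler stays occupied when every hop waits synchronously on the next one. The hop names
and all timings are hypothetical, chosen only for illustration; the 20 s figure assumes
a 1 GB HFile copied at roughly 50 MB/s.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    name: str
    own_work_s: float  # seconds this hop spends on its own work (hypothetical)


def blocked_seconds(chain):
    """Return how long each hop's handler is held, given that every hop
    blocks until all hops downstream of it have finished."""
    held = {}
    downstream = 0.0
    # Walk from the last hop back to the first, accumulating wait time.
    for hop in reversed(chain):
        downstream += hop.own_work_s
        held[hop.name] = downstream
    return held


# Hypothetical chain mirroring the three handlers described above.
chain = [
    Hop("source_handler", 0.01),   # ships WAL entries, waits for the peer
    Hop("peer_handler", 0.01),     # sees the bulkload entry, calls bulkload RPC
    Hop("loading_handler", 20.0),  # holds the region write lock, copies the file
]

held = blocked_seconds(chain)
for name, seconds in held.items():
    print(f"{name}: held for {seconds:.2f} s")
```

In this model the file copy at the last hop dominates, so all three handlers are held
for about as long as the transfer takes.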

Multiple handlers on the peer cluster can potentially be blocked. Multiple regions can be
blocked from reading as well.  In the normal replication case, the granularity is a few WAL
entries.  With bulk load, the granularity of failure is the entire file.
This is probably going to be ok when network latency is low.  But what happens when the
network latency is less than ideal?  What about the active-active case?
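
A rough back-of-the-envelope sketch of this concern, using Little's law (average busy
handlers = arrival rate x hold time). Everything here is an assumption for illustration:
the pool size of 30, the file size, the arrival rate, and the two network profiles are
made-up numbers, and the model ignores retries and lock contention.

```python
def hold_time_s(file_mb, bandwidth_mb_s, rtt_s):
    """Seconds one handler is held for a single synchronous file transfer."""
    return rtt_s + file_mb / bandwidth_mb_s


def busy_handlers(arrivals_per_s, hold_s):
    """Little's law: average number of handlers occupied at any moment."""
    return arrivals_per_s * hold_s


POOL = 30             # assumed RPC handler pool size on a peer region server
FILE_MB = 256.0       # hypothetical HFile size
ARRIVALS_PER_S = 2.0  # hypothetical rate of replicated bulkload entries

# Hypothetical network profiles: (label, bandwidth in MB/s, round trip in s).
profiles = [("low latency", 100.0, 0.001), ("less ideal", 10.0, 0.1)]

results = {}
for label, bandwidth, rtt in profiles:
    busy = busy_handlers(ARRIVALS_PER_S, hold_time_s(FILE_MB, bandwidth, rtt))
    results[label] = busy
    status = "saturated" if busy >= POOL else "ok"
    print(f"{label}: about {busy:.1f} handlers busy of {POOL} ({status})")
```

Under these made-up numbers the pool has headroom on the fast link, while on the slower
link the demand exceeds the pool, so other requests would queue behind the transfers.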

> Bulk Loaded HFile Replication
> -----------------------------
>
>                 Key: HBASE-13153
>                 URL: https://issues.apache.org/jira/browse/HBASE-13153
>             Project: HBase
>          Issue Type: New Feature
>          Components: Replication
>            Reporter: sunhaitao
>            Assignee: Ashish Singhi
>             Fix For: 2.0.0
>
>         Attachments: HBASE-13153-v1.patch, HBASE-13153-v10.patch, HBASE-13153-v11.patch,
HBASE-13153-v12.patch, HBASE-13153-v13.patch, HBASE-13153-v14.patch, HBASE-13153-v15.patch,
HBASE-13153-v16.patch, HBASE-13153-v17.patch, HBASE-13153-v18.patch, HBASE-13153-v2.patch,
HBASE-13153-v3.patch, HBASE-13153-v4.patch, HBASE-13153-v5.patch, HBASE-13153-v6.patch, HBASE-13153-v7.patch,
HBASE-13153-v8.patch, HBASE-13153-v9.patch, HBASE-13153.patch, HBase Bulk Load Replication-v1-1.pdf,
HBase Bulk Load Replication-v2.pdf, HBase Bulk Load Replication-v3.pdf, HBase Bulk Load Replication.pdf,
HDFS_HA_Solution.PNG
>
>
> Currently we plan to use the HBase Replication feature to deal with disaster tolerance scenarios. But
we have encountered an issue: we use bulkload very frequently, and because bulkload bypasses the write
path and does not generate WAL entries, the data will not be replicated to the backup cluster. It is
inappropriate to bulkload twice, on both the active cluster and the backup cluster. So I suggest
modifying the bulkload feature to enable bulkload to both the active cluster and the backup cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
