hbase-issues mailing list archives

From "Feng Honghua (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7280) TableNotFoundException thrown in peer cluster will incur endless retry for shipEdits, which in turn block following normal replication
Date Thu, 06 Dec 2012 01:50:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511021#comment-13511021 ]

Feng Honghua commented on HBASE-7280:
-------------------------------------

I can understand the intent of the current design. A master cluster may have multiple tables
with REPLICATION_SCOPE=1, but not every peer cluster wants to replicate all of them, and the
current design makes it impossible to replicate only selected table(s). In our scenario, I expect
the peer cluster (sink) to skip edits whose table doesn't exist in the peer cluster and to apply
only the edits whose table(s) do exist there (the ones we really want to replicate). I made a
minor change in ReplicationSink.java that simply skips edits for table(s) that don't exist in the
peer cluster, and the behavior is what we want. Though this change doesn't save the network
bandwidth wasted on shipping those edits, at least it doesn't block normal replication.
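
The sink-side behavior described above can be sketched roughly as follows. This is only an illustration and not the actual ReplicationSink.java change: the SinkFilterSketch and Edit classes and the applicableEdits method are hypothetical stand-ins, and the set of peer tables is assumed to be known up front.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SinkFilterSketch {
    // Hypothetical stand-in for one replicated edit; not an HBase class.
    static final class Edit {
        final String table;
        final String payload;

        Edit(String table, String payload) {
            this.table = table;
            this.payload = payload;
        }
    }

    // Keep only the edits whose target table exists on the peer (sink) cluster.
    // Edits for unknown tables are dropped instead of failing the whole batch.
    static List<Edit> applicableEdits(List<Edit> shipped, Set<String> peerTables) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : shipped) {
            if (peerTables.contains(e.table)) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Edit> shipped = Arrays.asList(
            new Edit("t1", "a"), new Edit("t2", "b"), new Edit("t1", "c"));
        Set<String> peerTables = new HashSet<>(Arrays.asList("t1"));
        System.out.println(applicableEdits(shipped, peerTables).size());
    }
}
```

Note that every edit is still shipped over the network before being filtered, which matches the bandwidth caveat above.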
The current per-cluster granularity of replication seems too coarse for many real-world
scenarios. In my opinion, it would be more reasonable to allow a table or column-family list
to be configured for a peer when the peer is added.
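
A per-peer table list of the kind suggested here might look roughly like the sketch below. Everything in it is an assumption made for illustration: the PeerTableFilterSketch class, addPeer and shouldShip are hypothetical names and not part of HBase. The check is placed on the source side, so edits for unwanted tables would not need to be shipped at all.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class PeerTableFilterSketch {
    // Hypothetical per-peer configuration: peer id -> tables that peer wants.
    private final Map<String, Set<String>> tablesByPeer = new HashMap<>();

    // Register a peer together with the list of tables it wants replicated.
    void addPeer(String peerId, String... tables) {
        tablesByPeer.put(peerId, new HashSet<>(Arrays.asList(tables)));
    }

    // Source-side check: ship an edit only if the peer registered its table.
    boolean shouldShip(String peerId, String table) {
        Set<String> wanted = tablesByPeer.get(peerId);
        return wanted != null && wanted.contains(table);
    }

    public static void main(String[] args) {
        PeerTableFilterSketch config = new PeerTableFilterSketch();
        config.addPeer("peer1", "t1");
        System.out.println(config.shouldShip("peer1", "t1"));
        System.out.println(config.shouldShip("peer1", "t2"));
    }
}
```

The same map could be keyed on table plus column family to get the finer granularity mentioned above.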
                
> TableNotFoundException thrown in peer cluster will incur endless retry for shipEdits, which in turn block following normal replication
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-7280
>                 URL: https://issues.apache.org/jira/browse/HBASE-7280
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.2
>            Reporter: Feng Honghua
>             Fix For: 0.94.4
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> In cluster replication, suppose the master cluster has 2 tables with column families
> declared with replication scope = 1, and we add a peer cluster that has only 1 table with
> the same name as in the master cluster. The ReplicationSource (a thread in the master cluster)
> for this peer ships edits (logs) for both tables to the peer. The peer fails to apply the
> edits because of a TableNotFoundException, and this exception is also returned to the
> original shipper (the ReplicationSource in the master cluster). The shipper then falls into
> an endless retry of shipping the failed edits, without proceeding to read the remaining
> (newer) log files and ship the following edits (possibly the normal, expected edits for the
> registered table). The symptom is that the TableNotFoundException causes endless retries
> and blocks normal table replication.
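
The blocking behavior described in the issue can be modeled with a small simulation. This is a simplified sketch and not the real ReplicationSource code: the BlockedShipperSketch class and ship method are hypothetical, each batch is reduced to the name of its table, and the retry loop is capped at maxAttempts (an assumption) so that the example terminates, whereas the issue describes a retry that never ends.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class BlockedShipperSketch {
    // Tries to ship queued batches in order and returns how many were shipped.
    // A batch is represented only by the name of the table its edits belong to.
    static int ship(Deque<String> batches, Set<String> peerTables, int maxAttempts) {
        int shipped = 0;
        for (int attempt = 0; attempt < maxAttempts && !batches.isEmpty(); attempt++) {
            String table = batches.peekFirst();
            if (peerTables.contains(table)) {
                batches.pollFirst(); // applied on the peer, move to the next batch
                shipped++;
            }
            // Otherwise the peer reports a missing table; the head batch stays
            // queued and is retried, so the batches behind it never get a turn.
        }
        return shipped;
    }

    public static void main(String[] args) {
        Deque<String> batches = new ArrayDeque<>(Arrays.asList("t1", "t2", "t1"));
        Set<String> peerTables = new HashSet<>(Arrays.asList("t1"));
        int shipped = ship(batches, peerTables, 100);
        System.out.println(shipped + " shipped, " + batches.size() + " still queued");
    }
}
```

In this model the second batch targets a table the peer lacks, so the third batch, although valid, is never shipped regardless of how many attempts are allowed.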

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
