hbase-issues mailing list archives

From "Feng Honghua (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7280) TableNotFoundException thrown in peer cluster will incur endless retry for shipEdits, which in turn block following normal replication
Date Thu, 06 Dec 2012 03:58:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511076#comment-13511076 ]

Feng Honghua commented on HBASE-7280:
-------------------------------------

Yes, that's what I hope for from finer-grained cluster replication. In such a design, by
default (without any table/CF configuration) a peer receives all the edits from the master
cluster. In a real-world scenario we may have a master cluster and a backup cluster that needs
to replicate a whole copy of the master cluster and so receives all edits, while at the same
time there may be some experimental/downstream clusters that need only a certain table, or
even only some CFs of a table, from the master cluster. By making the table/CF set configurable
per peer, we can enable such scenarios.

ReplicationSource needs to parse the peer's table/CF configuration on creation, and filter
the edits while reading the HLog files to determine which edits need to be shipped to the
corresponding peer. It looks like no further change is needed on the peer side
(ReplicationSink), right?
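
A minimal sketch of the source-side filtering described above. Everything here is an
assumption for illustration: the config format "t1;t2:cf1,cf2", the class PeerEditFilter,
and the Edit stand-in are hypothetical and are not HBase APIs. It only shows the idea of
parsing the peer's table/CF configuration once and then testing each edit against it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: none of these types are HBase classes.
class PeerEditFilter {

    /** Hypothetical stand-in for one log edit: its table and column family. */
    static final class Edit {
        final String table;
        final String family;
        Edit(String table, String family) { this.table = table; this.family = family; }
    }

    // table -> allowed column families; an empty set means "every CF of the table".
    // An empty map means "no configuration": the peer receives all edits.
    private final Map<String, Set<String>> tableCfs;

    private PeerEditFilter(Map<String, Set<String>> tableCfs) { this.tableCfs = tableCfs; }

    /** Parses an assumed config format such as "t1;t2:cf1,cf2" once, at creation. */
    static PeerEditFilter parse(String config) {
        Map<String, Set<String>> map = new HashMap<>();
        if (config != null) {
            for (String entry : config.split(";")) {
                String[] parts = entry.trim().split(":", 2);
                String table = parts[0].trim();
                if (table.isEmpty()) continue; // skip blank entries
                Set<String> cfs = new HashSet<>();
                if (parts.length == 2) {
                    for (String cf : parts[1].split(",")) {
                        if (!cf.trim().isEmpty()) cfs.add(cf.trim());
                    }
                }
                map.put(table, cfs);
            }
        }
        return new PeerEditFilter(map);
    }

    /** True if this edit should be shipped to the peer. */
    boolean shouldShip(Edit e) {
        if (tableCfs.isEmpty()) return true;      // default: ship everything
        Set<String> cfs = tableCfs.get(e.table);
        if (cfs == null) return false;            // table not configured for this peer
        return cfs.isEmpty() || cfs.contains(e.family);
    }

    /** Keeps only the edits the peer is configured to receive, preserving order. */
    List<Edit> filter(List<Edit> edits) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : edits) {
            if (shouldShip(e)) kept.add(e);
        }
        return kept;
    }

    public static void main(String[] args) {
        PeerEditFilter f = parse("t1;t2:cf1");
        List<Edit> kept = f.filter(Arrays.asList(
                new Edit("t1", "a"), new Edit("t2", "cf2"), new Edit("t3", "cf1")));
        System.out.println(kept.size() + " edit(s) kept");
    }
}
```

With an empty configuration the filter passes every edit, which matches the default
behaviour proposed for a full backup peer.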

Yes, my current change in ReplicationSink doesn't avoid shipping the unnecessary edits to
peers, but it's enough to unblock us. A wiser treatment would be in ReplicationSource, where
we can filter out unnecessary edits before shipping to the peer cluster by checking, for each
edit, whether the table exists at the peer cluster.
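
A rough sketch of that source-side check, under stated assumptions: the existsAtPeer
predicate stands in for whatever remote lookup would ask the peer cluster whether a table
exists, and PeerTableFilter and Edit are hypothetical names, not HBase classes. Results are
cached per table so the peer is not queried for every edit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative only: none of these types are HBase classes.
class PeerTableFilter {

    /** Hypothetical stand-in for one log edit. */
    static final class Edit {
        final String table;
        final String row;
        Edit(String table, String row) { this.table = table; this.row = row; }
    }

    // Assumed remote lookup: "does this table exist at the peer cluster?"
    private final Predicate<String> existsAtPeer;
    // Per-table cache of lookup results, so each table is checked at most once.
    private final Map<String, Boolean> cache = new HashMap<>();

    PeerTableFilter(Predicate<String> existsAtPeer) { this.existsAtPeer = existsAtPeer; }

    boolean tableExists(String table) {
        Boolean known = cache.get(table);
        if (known == null) {
            known = existsAtPeer.test(table);
            cache.put(table, known);
        }
        return known;
    }

    /** Drops edits whose table is absent at the peer, preserving order. */
    List<Edit> filter(List<Edit> edits) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : edits) {
            if (tableExists(e.table)) kept.add(e);
        }
        return kept;
    }
}
```

One caveat of this sketch: the cache never expires, so a table created at the peer later
would keep being skipped. A real implementation would need some way to refresh it.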
                
> TableNotFoundException thrown in peer cluster will incur endless retry for shipEdits, which in turn block following normal replication
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-7280
>                 URL: https://issues.apache.org/jira/browse/HBASE-7280
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.2
>            Reporter: Feng Honghua
>             Fix For: 0.94.4
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> In cluster replication, if the master cluster has 2 tables whose column families are
> declared with replication scope = 1, and a peer cluster is added that has only 1 table with
> the same name as in the master cluster, then the ReplicationSource (a thread in the master
> cluster) for this peer ships edits (logs) for both tables to the peer. The peer fails to
> apply the edits due to TableNotFoundException, and this exception is also returned to the
> original shipper (the ReplicationSource in the master cluster). The shipper then falls into
> an endless retry of shipping the failed edits, without proceeding to read the remaining
> (newer) log files and ship the following edits (which may be the normal, expected edits for
> the registered table). The symptom is that the TableNotFoundException incurs endless retry
> and blocks normal table replication.
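
The failure mode described in the issue can be illustrated with a small simulation. This is
a sketch under assumptions, not the actual ReplicationSource logic: RetrySimulation,
applyAtPeer, and shipAll are hypothetical names, batches are reduced to lists of table
names, and maxAttempts exists only so the simulation terminates (the behaviour reported in
the issue is an unbounded retry).

```java
import java.util.List;
import java.util.Set;

// Illustrative only: a toy model of in-order shipping with retry-on-failure.
class RetrySimulation {

    /** Stand-in for the exception the peer reports for an unknown table. */
    static class TableNotFound extends RuntimeException {
        TableNotFound(String table) { super(table); }
    }

    /** Stand-in for the peer: applies a batch, failing on the first unknown table. */
    static void applyAtPeer(List<String> batchTables, Set<String> peerTables) {
        for (String t : batchTables) {
            if (!peerTables.contains(t)) throw new TableNotFound(t);
        }
    }

    /**
     * Ships batches strictly in order and retries a failed batch without moving on.
     * maxAttempts caps the total number of ship attempts so the simulation ends.
     * Returns how many batches were shipped successfully.
     */
    static int shipAll(List<List<String>> batches, Set<String> peerTables, int maxAttempts) {
        int shipped = 0;
        int attempts = 0;
        while (shipped < batches.size() && attempts < maxAttempts) {
            attempts++;
            try {
                applyAtPeer(batches.get(shipped), peerTables);
                shipped++; // advance only after a successful apply
            } catch (TableNotFound e) {
                // The same batch is retried; every later batch waits behind it.
            }
        }
        return shipped;
    }
}
```

In this model, one batch for a table that is missing at the peer prevents every later batch
from shipping, including batches for tables the peer does have.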

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
