hbase-issues mailing list archives

From "Feng Honghua (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7280) TableNotFoundException thrown in peer cluster will incur endless retry for shipEdits, which in turn block following normal replication
Date Mon, 10 Dec 2012 07:41:21 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527752#comment-13527752 ]

Feng Honghua commented on HBASE-7280:
-------------------------------------

Thanks Jean-Daniel

But even if REPLICATION_SCOPE is fully implemented, I don't think it's as flexible as adding a
per-peer table/CF configuration. Let me know if I misunderstand how REPLICATION_SCOPE is used
as routing information: edits in the master cluster are shipped to all peer clusters whose
peer_ids are less than or equal to the REPLICATION_SCOPE. But what if a newly added peer wants
to replicate a table/CF with REPLICATION_SCOPE=A and another table/CF with REPLICATION_SCOPE=E,
but doesn't want the tables/CFs with REPLICATION_SCOPE=B/C/D (A>B>C>D>E here)? Interpreting
REPLICATION_SCOPE as a bit array and treating each bit as a peer_id has a similar problem. (At
the least, we would need to change REPLICATION_SCOPE whenever the original value can't satisfy
a later-added peer's replication requirement.)
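
To make the A/E example concrete, here is a minimal sketch of the threshold rule as I understand
it above. The rule `peerId <= scope`, the class and method names, and the integer values standing
in for A..E are all assumptions for illustration, not actual HBase code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of threshold routing: ship an edit to a peer
// when the peer's id is <= the table/CF's REPLICATION_SCOPE.
class ScopeThresholdRouting {

    static boolean ships(int peerId, int scope) {
        return peerId <= scope;
    }

    // Assumed stand-ins for scopes A > B > C > D > E.
    static Map<String, Integer> exampleScopes() {
        Map<String, Integer> scopes = new LinkedHashMap<>();
        scopes.put("tA", 5);
        scopes.put("tB", 4);
        scopes.put("tC", 3);
        scopes.put("tD", 2);
        scopes.put("tE", 1);
        return scopes;
    }

    // Tables whose edits a peer with the given id would receive.
    static List<String> received(int peerId, Map<String, Integer> scopes) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : scopes.entrySet()) {
            if (ships(peerId, e.getValue())) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> scopes = exampleScopes();
        for (int peerId = 1; peerId <= 5; peerId++) {
            System.out.println("peer " + peerId + " receives " + received(peerId, scopes));
        }
    }
}
```

Under this rule any peer id small enough to receive tE also receives tA through tD, so no single
peer id yields exactly {tA, tE}.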

REPLICATION_SCOPE doesn't come to the rescue here because in many cases the master cluster,
when it creates tables/CFs, doesn't know exactly which peer clusters will want to replicate
which tables/CFs from it. In contrast, each peer cluster knows exactly which tables/CFs to
replicate from the master cluster when it adds itself as a peer. By introducing a table/CF list
configuration when adding a peer, we don't have to figure out in advance which (or how many)
peers can replicate a table/CF when creating it in the master cluster, and we don't need to
change the REPLICATION_SCOPE later on. ReplicationSourceManager just listens on the peer ZK
nodes and adds a new ReplicationSource for the new peer with the configured table/CF list; that
ReplicationSource reads, filters, and ships edits of the configured tables/CFs to its peer.
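
A rough sketch of the per-peer filtering step proposed above. `PeerTableCfFilter`, `Edit`, and
the convention that an empty CF set means "all CFs of that table" are hypothetical choices for
illustration, not existing HBase classes or behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical per-peer filter: only edits for configured tables/CFs are shipped.
class PeerTableCfFilter {

    // Simplified stand-in for a WAL edit: just the table and column family names.
    static final class Edit {
        final String table;
        final String family;

        Edit(String table, String family) {
            this.table = table;
            this.family = family;
        }
    }

    // table name -> configured CFs; an empty set is assumed to mean "all CFs".
    private final Map<String, Set<String>> tableCfs;

    PeerTableCfFilter(Map<String, Set<String>> tableCfs) {
        this.tableCfs = tableCfs;
    }

    boolean shouldShip(String table, String family) {
        Set<String> cfs = tableCfs.get(table);
        if (cfs == null) {
            return false; // table not configured for this peer: never shipped
        }
        return cfs.isEmpty() || cfs.contains(family);
    }

    // Keep only the edits this peer asked for, preserving their order.
    List<Edit> filter(List<Edit> edits) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : edits) {
            if (shouldShip(e.table, e.family)) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

With such a filter, edits for a table the peer never configured are dropped on the master side,
so the peer would not see them and could not answer with TableNotFoundException for them.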

ReplicationSource also needs to listen on its peer ZK node for table/CF configuration changes,
which in turn determine which edits are shipped to the peer from then on.
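
One possible shape for reacting to such a change, as a sketch only. The callback name
`onConfigChanged`, the class name, and the string format `"t1:cf1,cf2; t2"` are assumptions made
for this example; the ZK watcher that would invoke the callback is left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder for a peer's table/CF list that can be swapped at runtime.
class PeerTableCfConfig {

    // Starts empty, so nothing is shipped until a configuration arrives.
    private final AtomicReference<Map<String, Set<String>>> current =
        new AtomicReference<>(Collections.<String, Set<String>>emptyMap());

    // Parses the assumed format "table[:cf,cf...]; table..." into table -> CFs.
    // A table listed without CFs maps to an empty set, taken to mean "all CFs".
    static Map<String, Set<String>> parse(String spec) {
        Map<String, Set<String>> result = new HashMap<>();
        for (String entry : spec.split(";")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split(":", 2);
            Set<String> cfs = new HashSet<>();
            if (parts.length == 2) {
                for (String cf : parts[1].split(",")) {
                    if (!cf.trim().isEmpty()) {
                        cfs.add(cf.trim());
                    }
                }
            }
            result.put(parts[0].trim(), cfs);
        }
        return result;
    }

    // Would be called from the (omitted) watcher when the peer's ZK node changes.
    void onConfigChanged(String newSpec) {
        current.set(Collections.unmodifiableMap(parse(newSpec)));
    }

    // Consulted by the shipping loop; sees whichever configuration is newest.
    boolean shouldShip(String table, String family) {
        Set<String> cfs = current.get().get(table);
        return cfs != null && (cfs.isEmpty() || cfs.contains(family));
    }
}
```

Swapping a whole immutable map through an AtomicReference means the shipping thread never
observes a half-updated configuration; edits read after the swap follow the new list.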

Any opinion?
                
> TableNotFoundException thrown in peer cluster will incur endless retry for shipEdits, which in turn block following normal replication
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-7280
>                 URL: https://issues.apache.org/jira/browse/HBASE-7280
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.2
>            Reporter: Feng Honghua
>             Fix For: 0.94.4
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> In cluster replication, suppose the master cluster has 2 tables whose column families are
> declared with replication scope = 1, and we add a peer cluster that has only 1 of those
> tables (with the same name as in the master cluster). The ReplicationSource (a thread in the
> master cluster) for this peer ships edits (logs) for both tables to the peer. The peer fails
> to apply the edits due to TableNotFoundException, and this exception is also returned to the
> original shipper (the ReplicationSource in the master cluster). The shipper then falls into
> an endless retry of shipping the failed edits, without proceeding to read the remaining
> (newer) log files or to ship the following edits (possibly the normal, expected edits for the
> registered table). The symptom is that the TableNotFoundException incurs endless retry and
> blocks normal table replication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
