hbase-issues mailing list archives

From "Liu Shaohui (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12865) WALs may be deleted before they are replicated to peers
Date Mon, 19 Jan 2015 10:07:35 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282333#comment-14282333 ]

Liu Shaohui commented on HBASE-12865:

I don't think that would be correct. In your scenario only the .../rs/r1 and .../rs/r3 znodes
would change. cversion is not updated up the chain, i.e. the .../rs cversion does not change when
a node is added to or removed from rs/r1 or rs/r2; only r1's and r2's cversions would change.
I wrote a test and the result is as you said. Thanks for pointing out my mistake.
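That matches ZooKeeper's documented semantics: a znode's cversion counts changes to its direct children only and does not propagate to ancestors. A minimal in-memory sketch of that behavior (pure Java, no real ZooKeeper; the class and method names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of znode cversion semantics: cversion counts changes to a
// znode's *direct* children only; it is never bumped up the ancestor chain.
class ZnodeModel {
    static class Znode {
        final Map<String, Znode> children = new HashMap<>();
        int cversion = 0; // bumped when a direct child is added
    }

    final Znode root = new Znode();

    // Create a node at a slash-separated path, creating parents as needed.
    // Only the direct parent of each newly created node gets its cversion bumped.
    void create(String path) {
        Znode cur = root;
        for (String part : path.substring(1).split("/")) {
            if (!cur.children.containsKey(part)) {
                cur.children.put(part, new Znode());
                cur.cversion++; // direct parent only
            }
            cur = cur.children.get(part);
        }
    }

    int cversion(String path) {
        Znode cur = root;
        if (!path.equals("/")) {
            for (String part : path.substring(1).split("/")) {
                cur = cur.children.get(part);
            }
        }
        return cur.cversion;
    }
}
```

Creating a WAL node under /rs/r1 bumps r1's cversion but leaves /rs untouched, which is why watching the cversion of .../rs alone cannot detect queue changes below it.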

We would need to keep re-verifying the cversions of all .../rs/rsXXX znodes. For a large
cluster, re-verifying the cversions of all .../rs/rsXXX znodes costs a lot, and it will put
more pressure on ZK.

Another method is to update the ../rs node while taking over the replication queue of a dead
server, and to re-verify the cversion of ../rs at the end of the scan. We can add this update
operation to the current atomic multi operation that copies the replication queue.
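This proposal can be sketched in plain Java with an in-memory stand-in for ZooKeeper (all names here are illustrative, not the actual HBase or ZooKeeper API): the takeover "multi" also bumps a version on .../rs, and the cleaner rescans if that version changed during its scan.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the proposed protocol: every queue takeover atomically bumps a
// version on the /hbase/replication/rs node; the cleaner re-reads that
// version after its scan and retries if a takeover raced with the scan.
class ReplicationScanSketch {
    final Map<String, Set<String>> queues = new HashMap<>(); // rsName -> WALs
    final AtomicInteger rsVersion = new AtomicInteger();     // stand-in for the ../rs update

    // Atomic "multi": move the dead server's queue to the new owner
    // AND bump the version on ../rs in the same operation.
    synchronized void takeOverQueue(String deadRs, String newRs) {
        Set<String> wals = queues.remove(deadRs);
        if (wals != null) {
            queues.computeIfAbsent(newRs + "-" + deadRs, k -> new HashSet<>()).addAll(wals);
        }
        rsVersion.incrementAndGet(); // the extra update proposed above
    }

    // Cleaner side: scan all queues; if the ../rs version changed while
    // scanning, a failover may have moved a queue past us, so rescan.
    Set<String> walsInUse() {
        while (true) {
            int before = rsVersion.get();
            Set<String> seen = new HashSet<>();
            synchronized (this) {
                for (Set<String> wals : queues.values()) seen.addAll(wals);
            }
            if (rsVersion.get() == before) return seen; // no concurrent takeover
        }
    }
}
```

With real znodes the takeover would be one ZooKeeper multi (the existing create/delete ops for the queue plus a setData on ../rs), so a queue can never move without the cleaner noticing.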

> WALs may be deleted before they are replicated to peers
> -------------------------------------------------------
>                 Key: HBASE-12865
>                 URL: https://issues.apache.org/jira/browse/HBASE-12865
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Liu Shaohui
> By design, the ReplicationLogCleaner guarantees that WALs still in a replication queue
cannot be deleted by the HMaster. The ReplicationLogCleaner gets the WAL set from ZooKeeper
by scanning the replication znode, but it may get an incomplete WAL set during replication
failover because the scan operation is not atomic.
> For example: There are three region servers: rs1, rs2, rs3, and peer id 10.  The layout
of replication zookeeper nodes is:
> {code}
> /hbase/replication/rs/rs1/10/wals
>                      /rs2/10/wals
>                      /rs3/10/wals
> {code}
> - t1: the ReplicationLogCleaner finished scanning the replication queue of rs1 and starts
to scan the queue of rs2.
> - t2: region server rs3 is down, and rs1 takes over rs3's replication queue. The new layout is:
> {code}
> /hbase/replication/rs/rs1/10/wals
>                      /rs1/10-rs3/wals
>                      /rs2/10/wals
>                      /rs3
> {code}
> - t3: the ReplicationLogCleaner finished scanning the queue of rs2 and starts to scan
the node of rs3. But the queue has been moved to "replication/rs1/10-rs3/WALS".
> So the ReplicationLogCleaner will miss the WALs of rs3 in peer 10, and the HMaster may
delete these WALs before they are replicated to peer clusters.
> We encountered this problem in our cluster, and I think it's a serious bug for replication.
> Suggestions to fix this bug are welcome. thx~
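The t1-t3 race above can be reproduced with a small in-memory simulation (plain Java; the names and the way the takeover is triggered mid-scan are illustrative only):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Reproduces the reported race with an in-memory stand-in for the replication
// znodes: the cleaner scans queues one server at a time, and a failover moves
// rs3's queue under the already-scanned rs1 in the middle of the scan.
class CleanerRaceDemo {
    static Set<String> racyScan() {
        // /hbase/replication/rs/{rs1,rs2,rs3}/10/wals
        Map<String, Set<String>> rs = new LinkedHashMap<>();
        rs.put("rs1", new HashSet<>(Arrays.asList("rs1.wal")));
        rs.put("rs2", new HashSet<>(Arrays.asList("rs2.wal")));
        rs.put("rs3", new HashSet<>(Arrays.asList("rs3.wal")));

        Set<String> seen = new HashSet<>();
        List<String> order = new ArrayList<>(rs.keySet()); // snapshot: rs1, rs2, rs3
        for (String server : order) {
            // t2: after rs1 was scanned, rs3 dies and rs1 takes over its queue
            // (the queue reappears under a node the cleaner has already passed).
            if (server.equals("rs2")) {
                Set<String> moved = rs.get("rs3");
                rs.put("rs3", new HashSet<>());   // rs3's own queue is now empty
                rs.put("rs1-from-rs3", moved);    // moved under rs1 (the 10-rs3 queue)
            }
            seen.addAll(rs.getOrDefault(server, Collections.emptySet()));
        }
        return seen; // rs3.wal is missing: its queue moved to an already-scanned node
    }
}
```

The scan ends believing only rs1.wal and rs2.wal are in use, so the cleaner would let the HMaster delete rs3.wal even though it still awaits replication.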

This message was sent by Atlassian JIRA
