hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted
Date Fri, 01 Jul 2016 05:24:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358416#comment-15358416 ]

Duo Zhang commented on HBASE-16135:
-----------------------------------

I cannot reproduce it locally. The strange log output is:

{noformat}
2016-07-01 04:09:57,901 DEBUG [main] replication.ReplicationQueueInfo(112): Found dead servers:[hostname1.example.org,1234,1]
2016-07-01 04:09:57,910 INFO  [main] replication.TableBasedReplicationQueuesImpl(250): hostname.example.org,1234,1 has deleted abandoned queue 2-hostname1.example.org,1234,1 from hostname1.example.org,1234,1
{noformat}

It should be:
{noformat}
2016-07-01 13:19:01,981 DEBUG [main] replication.ReplicationQueueInfo(112): Found dead servers:[hostname1.example.org,1234,1]
2016-07-01 13:19:01,983 INFO  [main] replication.TableBasedReplicationQueuesImpl(246): dummyserver1.example.org,1234,1 has claimed queue 1-hostname1.example.org,1234,1 from hostname1.example.org,1234,1
{noformat}

Let me dig more.

> PeerClusterZnode under rs of removed peer may never be deleted
> --------------------------------------------------------------
>
>                 Key: HBASE-16135
>                 URL: https://issues.apache.org/jira/browse/HBASE-16135
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
>         Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, HBASE-16135-branch-1.2.patch,
HBASE-16135-branch-1.patch, HBASE-16135-v1.patch, HBASE-16135-v2.patch, HBASE-16135.patch
>
>
> One of our clusters ran out of space recently, and we found that the .oldlogs directory
had almost the same size as the data directory.
> Eventually we found the cause: we removed a peer about 3 months ago, but there
are still some replication queue znodes under some rs nodes. This prevents the deletion of
.oldlogs.
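
To illustrate the condition described above, here is a minimal sketch (not HBase code) that flags queue IDs whose peer no longer exists. It assumes, based only on the queue names in the log output ("1-hostname1.example.org,1234,1"), that a queue ID starts with the peer ID followed by "-" and that the peer ID itself contains no "-". The function names peer_id_of and orphaned_queues and the sample list of active peers are hypothetical.

```python
def peer_id_of(queue_id: str) -> str:
    """Return the peer ID prefix of a queue ID.

    Assumption: the peer ID contains no '-', so everything before the
    first '-' is the peer ID. A queue ID with no '-' is just a peer ID.
    """
    return queue_id.split("-", 1)[0]


def orphaned_queues(queue_ids, active_peers):
    """Return the queue IDs whose peer is not in the set of active peers."""
    active = set(active_peers)
    return [q for q in queue_ids if peer_id_of(q) not in active]


# Sample data taken from the queue names in the log output; which peers
# are still configured is an assumption made for this example.
queues = [
    "1-hostname1.example.org,1234,1",
    "2-hostname1.example.org,1234,1",
]
print(orphaned_queues(queues, active_peers=["1"]))
```

In this sketch, any queue reported by orphaned_queues belongs to a peer that is no longer configured, which is the kind of leftover queue the description says keeps old logs from being cleaned up.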



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
