hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11654) WAL Splitting dirs are not deleted after replay.
Date Sat, 02 Aug 2014 13:32:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083559#comment-14083559 ]

Ted Yu commented on HBASE-11654:
--------------------------------

The SplitLogManager constructor that MasterFileSystem calls is the following:
{code}
  public SplitLogManager(ZooKeeperWatcher zkw, final Configuration conf,
      Stoppable stopper, MasterServices master, ServerName serverName) 
      throws InterruptedIOException, KeeperException {
    this(zkw, conf, stopper, master, serverName, new TaskFinisher() {
      @Override
      public Status finish(ServerName workerName, String logfile) {
{code}
A TaskFinisher is instantiated.

Can you elaborate on why the above is not enough?

> WAL Splitting dirs are not deleted after replay.
> ------------------------------------------------
>
>                 Key: HBASE-11654
>                 URL: https://issues.apache.org/jira/browse/HBASE-11654
>             Project: HBase
>          Issue Type: Bug
>          Components: master, wal
>    Affects Versions: 0.98.4
>            Reporter: Victor Xu
>         Attachments: HBASE-11654.patch
>
>
> I built a small cluster (20 nodes, several hundred regions) with hbase-0.98.4, and today I
> found some splitting directories in /hbase/WALs/. This is very strange, because those
> logs should have been replayed and deleted. Even though the ZK nodes of the dead RS had been
> deleted, these splitting directories can still cause serious trouble on cluster restart:
> every time I restart my cluster, it re-splits and replays all of the splitting directories,
> which costs a huge amount of time. I can't imagine what would happen on a cluster with hundreds
> of nodes and tens of thousands of regions.
> Found 56 items
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:21 /hbase/WALs/hdpdev1.cm6.tbsite.net,60020,1406714828440-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev1.cm6.tbsite.net,60020,1406716991836-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:16 /hbase/WALs/hdpdev1.cm6.tbsite.net,60020,1406778815585
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:26 /hbase/WALs/hdpdev10.cm6.tbsite.net,60020,1406526862752-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev10.cm6.tbsite.net,60020,1406716933471-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:02 /hbase/WALs/hdpdev10.cm6.tbsite.net,60020,1406778815536
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:26 /hbase/WALs/hdpdev11.cm6.tbsite.net,60020,1406526862802-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev11.cm6.tbsite.net,60020,1406716992986-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:14 /hbase/WALs/hdpdev11.cm6.tbsite.net,60020,1406778815552
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:15 /hbase/WALs/hdpdev12.cm6.tbsite.net,60020,1406526862752-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev12.cm6.tbsite.net,60020,1406716992874-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:20 /hbase/WALs/hdpdev12.cm6.tbsite.net,60020,1406778816074
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:27 /hbase/WALs/hdpdev13.cm6.tbsite.net,60020,1406526862832-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev13.cm6.tbsite.net,60020,1406716992753-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:15 /hbase/WALs/hdpdev13.cm6.tbsite.net,60020,1406857929773
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:26 /hbase/WALs/hdpdev14.cm6.tbsite.net,60020,1406526862736-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev14.cm6.tbsite.net,60020,1406716992923-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:25 /hbase/WALs/hdpdev14.cm6.tbsite.net,60020,1406778815595
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:19 /hbase/WALs/hdpdev15.cm6.tbsite.net,60020,1406526862821-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev15.cm6.tbsite.net,60020,1406716993082-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:15 /hbase/WALs/hdpdev15.cm6.tbsite.net,60020,1406778815578
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:21 /hbase/WALs/hdpdev16.cm6.tbsite.net,60020,1406526862816-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev16.cm6.tbsite.net,60020,1406716992787-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:15 /hbase/WALs/hdpdev16.cm6.tbsite.net,60020,1406778816006
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev17.cm6.tbsite.net,60020,1406716992814-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:26 /hbase/WALs/hdpdev17.cm6.tbsite.net,60020,1406778815579
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev18.cm6.tbsite.net,60020,1406716993051-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:24 /hbase/WALs/hdpdev18.cm6.tbsite.net,60020,1406778815587
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:23 /hbase/WALs/hdpdev19.cm6.tbsite.net,60020,1406526862720-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev19.cm6.tbsite.net,60020,1406716992736-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:22 /hbase/WALs/hdpdev19.cm6.tbsite.net,60020,1406865567732
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:21 /hbase/WALs/hdpdev2.cm6.tbsite.net,60020,1406778815846
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:27 /hbase/WALs/hdpdev20.cm6.tbsite.net,60020,1406714346484-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev20.cm6.tbsite.net,60020,1406716991741-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:27 /hbase/WALs/hdpdev20.cm6.tbsite.net,60020,1406778815555
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:25 /hbase/WALs/hdpdev3.cm6.tbsite.net,60020,1406714830504-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev3.cm6.tbsite.net,60020,1406716992137-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:24 /hbase/WALs/hdpdev3.cm6.tbsite.net,60020,1406778815585
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:21 /hbase/WALs/hdpdev4.cm6.tbsite.net,60020,1406714829881-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev4.cm6.tbsite.net,60020,1406716992118-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:13 /hbase/WALs/hdpdev4.cm6.tbsite.net,60020,1406864942962
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:10 /hbase/WALs/hdpdev5.cm6.tbsite.net,60020,1406526862790-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:22 /hbase/WALs/hdpdev5.cm6.tbsite.net,60020,1406715762598-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev5.cm6.tbsite.net,60020,1406716991309-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-31 15:39 /hbase/WALs/hdpdev5.cm6.tbsite.net,60020,1406778815529-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:13 /hbase/WALs/hdpdev5.cm6.tbsite.net,60020,1406941782379
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev6.cm6.tbsite.net,60020,1406716992903-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:19 /hbase/WALs/hdpdev6.cm6.tbsite.net,60020,1406778815530
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:24 /hbase/WALs/hdpdev7.cm6.tbsite.net,60020,1406526862796-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev7.cm6.tbsite.net,60020,1406716993002-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:24 /hbase/WALs/hdpdev7.cm6.tbsite.net,60020,1406778815785
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev8.cm6.tbsite.net,60020,1406716991377-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:21 /hbase/WALs/hdpdev8.cm6.tbsite.net,60020,1406778815557
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:28 /hbase/WALs/hdpdev9.cm6.tbsite.net,60020,1406716099285-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:43 /hbase/WALs/hdpdev9.cm6.tbsite.net,60020,1406716991336-splitting
> drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:10 /hbase/WALs/hdpdev9.cm6.tbsite.net,60020,1406778815554
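
For anyone triaging a listing like the one above, here is a minimal sketch of how the leftover directories could be picked out of saved listing output. It is not part of HBase or of the attached patch: the class name SplittingDirFinder and the method findSplittingDirs are hypothetical, and the code assumes only that each listing line ends with the path and that leftover directories carry the "-splitting" suffix seen in the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not an HBase class: scans saved directory-listing
// lines and keeps the paths that end with the "-splitting" suffix.
public class SplittingDirFinder {

    // Suffix observed on the leftover directories in the report above.
    static final String SPLITTING_SUFFIX = "-splitting";

    static List<String> findSplittingDirs(List<String> listingLines) {
        List<String> result = new ArrayList<>();
        for (String line : listingLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Assumption: the path is the last whitespace-separated field.
            String[] fields = trimmed.split("\\s+");
            String path = fields[fields.length - 1];
            if (path.endsWith(SPLITTING_SUFFIX)) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two sample lines copied from the listing in the report.
        List<String> sample = Arrays.asList(
            "Found 56 items",
            "drwxr-xr-x   - hadoop hadoop          0 2014-07-30 18:21 /hbase/WALs/hdpdev1.cm6.tbsite.net,60020,1406714828440-splitting",
            "drwxr-xr-x   - hadoop hadoop          0 2014-08-02 19:16 /hbase/WALs/hdpdev1.cm6.tbsite.net,60020,1406778815585");
        for (String path : findSplittingDirs(sample)) {
            System.out.println(path);
        }
    }
}
```

The sketch only reports candidates; whether a given directory is safe to remove still depends on whether its logs were actually replayed.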



--
This message was sent by Atlassian JIRA
(v6.2#6252)
