hbase-issues mailing list archives

From "Ted Yu (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4853) HBASE-4789 does overzealous pruning of seqids
Date Wed, 23 Nov 2011 22:36:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156356#comment-13156356 ]

Ted Yu commented on HBASE-4853:
-------------------------------

With patch v5, I got the following:
{code}
testGlobalMemStore(org.apache.hadoop.hbase.TestGlobalMemStoreSize)  Time elapsed: 11.516 sec <<< FAILURE!
java.lang.AssertionError: Server=10.246.204.31,62993,1322086547613, i=0 expected:<0> but was:<608>
{code}
Here is tail of test output:
{code}
2011-11-23 14:15:55,955 INFO  [main] regionserver.Store(631): Added hdfs://localhost:62971/user/zhihyu/.META./1028785192/info/6d51d01d9498464eb025ca045e696ce4, entries=47, sequenceid=36, filesize=8.4k
2011-11-23 14:15:55,956 INFO  [main] regionserver.HRegion(1396): Finished memstore flush of ~17.2k/17608 for region .META.,,1.1028785192 in 44ms, sequenceid=36, compaction requested=false
2011-11-23 14:15:55,956 INFO  [main] hbase.TestGlobalMemStoreSize(99): Flush .META.,,1.1028785192 on 10.246.204.31,62993,1322086547613, false, size=608
2011-11-23 14:15:55,957 INFO  [main] hbase.TestGlobalMemStoreSize(99): Flush TestGlobalMemStoreSize,,1322086555196.e2b7276e785c7f6213a5bdd08a54cf8e. on 10.246.204.31,62993,1322086547613, false, size=608
2011-11-23 14:15:55,957 INFO  [main] hbase.TestGlobalMemStoreSize(99): Flush TestGlobalMemStoreSize,c,P\xE3+,1322086555201.2c847584e6af6e64f3bae631bd722934. on 10.246.204.31,62993,1322086547613, false, size=608
2011-11-23 14:15:55,957 INFO  [main] hbase.TestGlobalMemStoreSize(99): Flush TestGlobalMemStoreSize,q\x83\xCC\xF1{,1322086555217.f5079469f9fa696de61b9db6364cd6e7. on 10.246.204.31,62993,1322086547613, false, size=608
2011-11-23 14:15:55,957 INFO  [main] hbase.TestGlobalMemStoreSize(101): Post flush on 10.246.204.31,62993,1322086547613
{code}
Note that the output never mentions flush completion for the regions of the TestGlobalMemStoreSize table; only the .META. flush is reported as finished.

I think we should add a log statement before the following assertion so that we know how long
we spent waiting in the while loop that precedes it:
{code}
      assertEquals("Server=" + server.getServerName() + ", i=" + i++, 0,
        server.getRegionServerAccounting().getGlobalMemstoreSize());
{code}
We should increase the wait time beyond 3 seconds.
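
Something along these lines could work. This is only a rough sketch, not the actual test code: the class name FlushWaitSketch, the helper waitForZero, and the 10 second timeout are made up for illustration, and the LongSupplier stands in for server.getRegionServerAccounting().getGlobalMemstoreSize(). It polls until the size reaches 0 or the timeout expires, then reports how long the wait took before asserting.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

// Hypothetical helper, not part of HBase: polls a size supplier and
// reports how long the wait took so a failing assertion is easier to diagnose.
final class FlushWaitSketch {

    /** Polls until size reports 0 or timeoutMs elapses; returns elapsed milliseconds. */
    static long waitForZero(LongSupplier size, long timeoutMs, long pollMs) {
        final long start = System.nanoTime();
        final long timeoutNanos = timeoutMs * 1_000_000L;
        while (size.getAsLong() != 0 && System.nanoTime() - start < timeoutNanos) {
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Stand-in for the real memstore size: reports 608 for three polls, then 0.
        AtomicInteger pendingPolls = new AtomicInteger(3);
        LongSupplier fakeSize = () -> pendingPolls.getAndDecrement() > 0 ? 608L : 0L;

        // Assumed timeout of 10 seconds, i.e. more generous than the current 3.
        long elapsedMs = waitForZero(fakeSize, 10_000L, 50L);

        // Log the wait before asserting, as proposed above.
        System.out.println("Waited " + elapsedMs + "ms, size=" + fakeSize.getAsLong());
        if (fakeSize.getAsLong() != 0) {
            throw new AssertionError("memstore size still non-zero after " + elapsedMs + "ms");
        }
    }
}
```

In the real test the log line would go through the test's own logger and include the server name, so a failure shows whether the loop gave up at the timeout or exited early.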
                
> HBASE-4789 does overzealous pruning of seqids
> ---------------------------------------------
>
>                 Key: HBASE-4853
>                 URL: https://issues.apache.org/jira/browse/HBASE-4853
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>         Attachments: 4853--no-prefix.txt, 4853-trunk.txt, 4853-v4.txt, 4853-v5.txt, 4853-v6.txt, 4853.txt
>
>
> Working w/ J-D on failing replication test turned up hole in seqids made by the patch over in hbase-4789.  With this patch in place we see lots of instances of the suspicious: 'Last sequenceid written is empty. Deleting all old hlogs'
> At a minimum, these lines need removing:
> {code}
> diff --git a/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java b/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
> index 623edbe..a0bbe01 100644
> --- a/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
> +++ b/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
> @@ -1359,11 +1359,6 @@ public class HLog implements Syncable {
>        // Cleaning up of lastSeqWritten is in the finally clause because we
>        // don't want to confuse getOldestOutstandingSeqNum()
>        this.lastSeqWritten.remove(getSnapshotName(encodedRegionName));
> -      Long l = this.lastSeqWritten.remove(encodedRegionName);
> -      if (l != null) {
> -        LOG.warn("Why is there a raw encodedRegionName in lastSeqWritten? name=" +
> -          Bytes.toString(encodedRegionName) + ", seqid=" + l);
> -       }
>        this.cacheFlushLock.unlock();
>      }
>    }
> {code}
> ... but above is no good w/o figuring why WALs are not being rotated off.


        
