hbase-issues mailing list archives

From "Jeongdae Kim (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15871) Memstore flush doesn't finish because of backwardseek() in memstore scanner.
Date Fri, 20 May 2016 13:10:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293329#comment-15293329 ]

Jeongdae Kim commented on HBASE-15871:

I think there is no need to create a memstore scanner in step 4), because all cells with
a sequence id less than or equal to the scanner's read point have already been flushed to a
new HFile, so there are no cells left to search in the memstore. Adding some extra code to
StoreScanner.selectScannersFrom(), as in the following patch, could fix this issue. Is there
anything I am missing, or any suggestions?

--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
@@ -403,12 +403,20 @@ public class StoreScanner extends NonReversedNonLazyKeyValueScanner
     long expiredTimestampCutoff = minVersions == 0 ? oldestUnexpiredTS :
+    long maxSequenceId = store.getMaxSequenceId();
     // include only those scan files which pass all filters
     for (KeyValueScanner kvs : allScanners) {
       boolean isFile = kvs.isFileScanner();
       if ((!isFile && filesOnly) || (isFile && memOnly)) {
+      // exclude memstore scanners if all cells in the memstore with a sequence id
+      // less than or equal to the read point have already been flushed.
+      if (!isFile && readPt <= maxSequenceId) {
+        continue;
+      }
       if (kvs.shouldUseScanner(scan, columns, expiredTimestampCutoff)) {
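
The rule that the patch adds can be tried in isolation. The following is only a minimal sketch under stated assumptions, not HBase code: MemstoreScannerFilterSketch and ScannerInfo are hypothetical names, and the plain long arguments stand in for the scanner's read point (readPt) and for the value that the patch reads from store.getMaxSequenceId().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MemstoreScannerFilterSketch {

    /** Hypothetical stand-in for a KeyValueScanner; keeps only what the rule needs. */
    public static final class ScannerInfo {
        final String name;
        final boolean fileScanner;

        public ScannerInfo(String name, boolean fileScanner) {
            this.name = name;
            this.fileScanner = fileScanner;
        }
    }

    /**
     * Mirrors the condition proposed in the patch: a memstore scanner is skipped
     * when the read point is not above the highest flushed sequence id, because
     * every cell this scanner may see is then already in a store file.
     */
    public static boolean skipMemstoreScanner(boolean isFile, long readPt, long maxFlushedSeqId) {
        return !isFile && readPt <= maxFlushedSeqId;
    }

    /** Returns the names of the scanners that survive the proposed check. */
    public static List<String> select(List<ScannerInfo> all, long readPt, long maxFlushedSeqId) {
        List<String> selected = new ArrayList<>();
        for (ScannerInfo s : all) {
            if (skipMemstoreScanner(s.fileScanner, readPt, maxFlushedSeqId)) {
                continue; // assumed: nothing visible to this scanner remains in the memstore
            }
            selected.add(s.name);
        }
        return selected;
    }

    public static void main(String[] args) {
        List<ScannerInfo> all = Arrays.asList(
            new ScannerInfo("hfile-1", true),
            new ScannerInfo("memstore", false));
        // Read point 5, flushed up to 10: the memstore scanner is dropped.
        System.out.println(select(all, 5L, 10L));
        // Read point 15, flushed up to 10: the memstore scanner is kept.
        System.out.println(select(all, 15L, 10L));
    }
}
```

Whether the real store's maximum sequence id has exactly this meaning at the time selectScannersFrom() runs is the open question asked above; the sketch only shows the intended behavior of the comparison.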

> Memstore flush doesn't finish because of backwardseek() in memstore scanner.
> ----------------------------------------------------------------------------
>                 Key: HBASE-15871
>                 URL: https://issues.apache.org/jira/browse/HBASE-15871
>             Project: HBase
>          Issue Type: Bug
>          Components: Scanners
>    Affects Versions: 1.1.2
>            Reporter: Jeongdae Kim
>         Attachments: memstore_backwardSeek().PNG
> Sometimes in our production HBase cluster, it takes a long time (more than 30 minutes)
to finish a memstore flush.
> The reason is that the memstore flusher thread calls StoreScanner.updateReaders() and
waits to acquire a lock that the store scanner holds in StoreScanner.next(), while
backwardseek() in the memstore scanner runs for a long time.
> I think that this condition can occur during a reverse scan through the following steps.
> 1) create a reversed store scanner by requesting a reverse scan.
> 2) flush the memstore in the same HStore.
> 3) put a lot of cells into the memstore, until the memstore is almost full.
> 4) call next() on the reverse scanner. This re-creates all scanners in this store, because
all scanners were already closed by 2)'s flush(), and calls backwardseek() with the store's
lastTop on all new scanners.
> 5) in this state, the memstore is almost full because of 3), and all cells in the memstore
have a sequenceID greater than this scanner's readPoint because of 2)'s flush(). This condition
causes a search through all cells in the memstore, and seekToPreviousRow() repeatedly searches
cells that were already searched if a row has one column. (This is described in more detail in
the attached file.)
> 6) flush the memstore again in the same HStore. This flush waits until steps 4) and 5) have
finished before it can update the store files in the same HStore after flushing.
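
The blocking described in steps 4) to 6) can be reproduced in miniature. The following is only a minimal sketch under stated assumptions, not HBase code: FlushBlockedByScanSketch is a hypothetical class, a plain ReentrantLock stands in for the lock that the store scanner holds in next(), and a sleep stands in for the long backwardseek() over the memstore.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class FlushBlockedByScanSketch {

    /** Runs one "scan" and one "flush" thread and returns the order in which they finished. */
    public static List<String> run(long scanMillis) {
        ReentrantLock scannerLock = new ReentrantLock();
        CountDownLatch scanHoldsLock = new CountDownLatch(1);
        List<String> events = Collections.synchronizedList(new ArrayList<String>());

        // Stand-in for StoreScanner.next(): holds the lock for the whole seek.
        Thread scan = new Thread(() -> {
            scannerLock.lock();
            try {
                scanHoldsLock.countDown();
                Thread.sleep(scanMillis); // stands in for the long backwardseek()
                events.add("scan finished");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                scannerLock.unlock();
            }
        });

        // Stand-in for the flusher calling updateReaders(): needs the same lock.
        Thread flusher = new Thread(() -> {
            try {
                scanHoldsLock.await(); // start only once the scan already holds the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            scannerLock.lock(); // blocks until the scan releases the lock
            try {
                events.add("flush updated readers");
            } finally {
                scannerLock.unlock();
            }
        });

        scan.start();
        flusher.start();
        try {
            scan.join();
            flusher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(200L));
    }
}
```

In this sketch the flush can never complete before the scan, however long the scan takes, which is the same shape as the 30-minute flush described above.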
> I searched the HBase JIRA and found a similar issue (HBASE-14497), but HBASE-14497's fix
can't solve this issue, because that fix only changed a recursive call into a loop.

This message was sent by Atlassian JIRA
