hbase-issues mailing list archives

From "Phabricator (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-5010) Filter HFiles based on TTL
Date Thu, 29 Dec 2011 03:18:33 GMT

     [ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-5010:
-------------------------------

    Attachment: D909.5.patch

mbautin updated the revision "[jira] [HBASE-5010] [89-fb] Filter HFiles based on TTL".
Reviewers: Kannan, Liyin, JIRA

  Actually, this is where I addressed Kannan's comment about compactions:

  https://reviews.facebook.net/D909?vs=2679&id=3015&whitespace=ignore-all

  (see the new line 154 in StoreScanner.java:

      scanners = selectScannersFrom(scanners);

  )

  I spent quite a bit of time making sure that the unit test actually exercises the optimization
during compactions. It now invokes compactions in two different ways, for a total of 6
parameterized instances of the test, and it still runs quickly. All of our compaction codepaths
go through StoreScanner constructors, so the optimization should apply to all of them.

  All unit tests pass.

REVISION DETAIL
  https://reviews.facebook.net/D909

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
  src/main/java/org/apache/hadoop/hbase/regionserver/NonLazyKeyValueScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/TimeRangeTracker.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/main/java/org/apache/hadoop/hbase/util/Threads.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java

                
> Filter HFiles based on TTL
> --------------------------
>
>                 Key: HBASE-5010
>                 URL: https://issues.apache.org/jira/browse/HBASE-5010
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Mikhail Bautin
>            Assignee: Zhihong Yu
>         Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch,
>                      D909.3.patch, D909.4.patch, D909.5.patch
>
>
> In ScanWildcardColumnTracker we have
> {code:java}
>  
>   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
>   ...
>   private boolean isExpired(long timestamp) {
>     return timestamp < oldestStamp;
>   }
> {code}
> but this time range filtering does not participate in HFile selection. In one real case
> this caused next() calls to time out because all KVs in a table had expired, but next() had
> to iterate over the whole table to find that out. We should be able to filter out those HFiles
> right away. I think a reasonable approach is to add a "default timerange filter" to every
> scan for a CF with a finite TTL and utilize the existing filtering in StoreFile.Reader.passesTimerangeFilter.
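
The idea in the description above can be sketched roughly as follows. This is only an illustration, not the code in the patch: TtlFileFilterSketch, FileInfo, and selectUnexpired are hypothetical names, and the sketch assumes each file records the newest timestamp it contains. A file whose newest timestamp is already older than (now - ttl) can only hold expired KVs, so it can be skipped before any scanner is opened on it.

```java
import java.util.ArrayList;
import java.util.List;

public class TtlFileFilterSketch {

    /** Hypothetical stand-in for per-file metadata; not an HBase class. */
    static final class FileInfo {
        final String name;
        final long maxTimestamp; // newest KV timestamp stored in the file, in ms

        FileInfo(String name, long maxTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /**
     * Returns the files that may still contain unexpired KVs.
     * Mirrors the isExpired() check quoted above: a timestamp is expired
     * when it is strictly less than oldestStamp = now - ttl.
     */
    static List<FileInfo> selectUnexpired(List<FileInfo> files, long nowMs, long ttlMs) {
        // Assumption for this sketch: Long.MAX_VALUE means "no TTL configured".
        if (ttlMs == Long.MAX_VALUE) {
            return new ArrayList<>(files);
        }
        long oldestStamp = nowMs - ttlMs;
        List<FileInfo> kept = new ArrayList<>();
        for (FileInfo f : files) {
            // If even the newest KV is expired, the whole file is expired.
            if (f.maxTimestamp >= oldestStamp) {
                kept.add(f);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<FileInfo> files = new ArrayList<>();
        files.add(new FileInfo("old", 8_999L));
        files.add(new FileInfo("edge", 9_000L));
        files.add(new FileInfo("new", 9_500L));

        // now = 10000 ms, ttl = 1000 ms, so oldestStamp = 9000 ms.
        for (FileInfo f : selectUnexpired(files, 10_000L, 1_000L)) {
            System.out.println(f.name);
        }
    }
}
```

With a TTL of 1000 ms and a current time of 10000 ms, only the file whose newest timestamp is below 9000 is dropped; files with at least one possibly live KV are kept and still go through the normal per-KV expiry check.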

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
