hbase-dev mailing list archives

From "Ryan Rawson" <ryano...@gmail.com>
Subject Re: Review Request: Timestamp based optimization for selecting the StoreFiles to be used in a Scan
Date Wed, 07 Jul 2010 20:58:43 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/257/#review314
-----------------------------------------------------------



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java
<http://review.hbase.org/r/257/#comment1377>

    We use the One True Brace Style; please move this and all others up one line :-)



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java
<http://review.hbase.org/r/257/#comment1378>

    spacing is like so:
    "if (this.heap != null) {"
    
    thanks!



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java
<http://review.hbase.org/r/257/#comment1379>

    I think this information should be maintained in MemStore, not inside this data structure.
    We might get rid of this data structure type and change to another one day. This makes it
    too hard to do that.
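
A minimal sketch of that separation, under stated assumptions: `MemStoreSketch`, `Cell`, and `mayContain` are hypothetical names invented for illustration, not the HBase classes or the code under review, and the query range is assumed to be half-open. The owner keeps the min/max timestamps and the backing set stays an ordinary ordered set.

```java
import java.util.Comparator;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical, simplified stand-in for a memstore; not the HBase class. */
final class MemStoreSketch {
    /** Hypothetical cell: just a row key and a timestamp. */
    static final class Cell {
        final String row;
        final long timestamp;

        Cell(String row, long timestamp) {
            this.row = row;
            this.timestamp = timestamp;
        }
    }

    // The backing set knows nothing about time ranges, so it could be
    // replaced by another ordered structure without touching the tracking.
    private final NavigableSet<Cell> cells = new ConcurrentSkipListSet<>(
        Comparator.comparing((Cell c) -> c.row).thenComparingLong(c -> c.timestamp));

    // Range bookkeeping lives in the owner. min > max means "nothing added yet".
    private long minTimestamp = Long.MAX_VALUE;
    private long maxTimestamp = Long.MIN_VALUE;

    /** Add a cell and widen the tracked range to cover its timestamp. */
    synchronized void add(String row, long timestamp) {
        cells.add(new Cell(row, timestamp));
        minTimestamp = Math.min(minTimestamp, timestamp);
        maxTimestamp = Math.max(maxTimestamp, timestamp);
    }

    /** True if the tracked range intersects the assumed half-open range [queryMin, queryMax). */
    synchronized boolean mayContain(long queryMin, long queryMax) {
        return minTimestamp <= maxTimestamp
            && minTimestamp < queryMax
            && maxTimestamp >= queryMin;
    }

    synchronized int size() {
        return cells.size();
    }
}
```

With this shape, swapping the set implementation only changes the `cells` field; `add` and `mayContain` are unaffected.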


- Ryan


On 2010-07-07 13:53:44, Pranav Khaitan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.hbase.org/r/257/
> -----------------------------------------------------------
> 
> (Updated 2010-07-07 13:53:44)
> 
> 
> Review request for hbase, Nicolas, Jonathan Gray, Karthik Ranganathan, and Kannan Muthukkaruppan.
> 
> 
> Summary
> -------
> 
> Every memstore and store file will have a minimum and maximum timestamp associated with
> it. If the range of timestamps we are searching for doesn't overlap with the range for a particular
> file, we can skip searching it and save time.
> 
> This should significantly improve the performance of timestamp range queries. It is particularly
> useful when most of the reads are for recent entries and the older files can be safely skipped.
> 
> Addresses HBASE-2265 JIRA. 
> 
> This diff also fixes some minor bugs; for example, KeyValueHeap used to throw an uncaught
> exception when the size of the scanner set was zero.
> 
> Internal review done by Jonathan and Kannan.
> 
> 
> This addresses bug HBASE-2265.
>     http://issues.apache.org/jira/browse/HBASE-2265
> 
> 
> Diffs
> -----
> 
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java 959782
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 959782
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 959782
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 959782
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 959782
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 959782
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 959782
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/TimeRangeTracker.java PRE-CREATION
>   trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 960082
>   trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 959782
>   trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 959782
> 
> Diff: http://review.hbase.org/r/257/diff
> 
> 
> Testing
> -------
> 
> All existing JUnit tests run successfully. More JUnit tests for MemStore, StoreFile and
> Store were added to test correctness with multiple timestamps.
> 
> I conducted a test to measure the extra time required to keep track of min and max timestamps
> while writing KeyValues. The comparison was done by entering 1 million KeyValues into the memstore
> ten times with and without timestamp tracking and then taking the average time for each of
> them. WAL was disabled and no flushing was done during this test to minimize overheads. The
> average time taken for entering 1M KeyValues into the memstore without keeping track of timestamps
> was 13.44 seconds, while the average time when keeping track of timestamps was 13.45 seconds.
> This shows that no significant overhead has been added by keeping track of timestamps.
> 
> 
> Thanks,
> 
> Pranav
> 
>
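
The skip described in the Summary above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the diff: `TimeRangeSketch`, `include`, and `overlaps` are hypothetical names, and the query range is assumed to be half-open, [queryMin, queryMax).

```java
/** Hypothetical per-file (or per-memstore) min/max timestamp record. */
final class TimeRangeSketch {
    // min > max encodes "no timestamps seen yet".
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Widen the range to cover one more KeyValue timestamp (called on write). */
    void include(long timestamp) {
        if (timestamp < min) {
            min = timestamp;
        }
        if (timestamp > max) {
            max = timestamp;
        }
    }

    /**
     * True if the closed range [min, max] intersects the assumed half-open
     * query range [queryMin, queryMax). A file whose range does not overlap
     * can be left out of the scan entirely.
     */
    boolean overlaps(long queryMin, long queryMax) {
        return min <= max && min < queryMax && max >= queryMin;
    }
}
```

A scan for recent entries would call `overlaps` once per memstore or store file and open scanners only on those that return true, which is where the saving for older files comes from. The write-side cost is the two comparisons in `include`.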

