hbase-dev mailing list archives

From "Jonathan Gray" <jg...@apache.org>
Subject Re: Review Request: Review compaction heuristic and move compaction code out so standalone and independently testable
Date Sat, 23 Oct 2010 07:07:14 GMT


> On 2010-10-22 23:57:25, stack wrote:
> > This is great.  Don't we already have a standalone compaction issue?  Should kill it w/ this issue?  Some broad questions below.  Do you find your new selector better than old?

Have not run the numbers against old one yet.  That's next.


> On 2010-10-22 23:57:25, stack wrote:
> > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 738
> > <http://review.cloudera.org/r/1078/diff/2/?file=15710#file15710line738>
> >
> >     Fall back on a default?

It does pull the default if the config is not overridden.  Do you think we should still fall
back?  We don't fall back to GZIP when LZO is configured but not available.


> On 2010-10-22 23:57:25, stack wrote:
> > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 723
> > <http://review.cloudera.org/r/1078/diff/2/?file=15710#file15710line723>
> >
> >     I suppose compaction is single-threaded.  Later when multi-threaded will have to do better here.

Will load it on construction of Store, and will make it so it can be pulled from the HCD
(HColumnDescriptor) as well.


> On 2010-10-22 23:57:25, stack wrote:
> > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 689
> > <http://review.cloudera.org/r/1078/diff/2/?file=15710#file15710line689>
> >
> >     Would be 'nicer' if selector returned List of StoreFiles rather than indices?

That makes it much harder to test.

Returning indices is not ideal, but it makes it so much easier to play with new algos: all we
care about (right now at least) is a list of the sizes, ordered by age.
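
To illustrate why a selector that only sees sizes is easy to test, here is a minimal sketch. It is not the code under review: the names SizeBasedSelector and SmallFilesSelector, the maxSize and minFiles parameters, and the "pick every file under a size cap" rule are all assumptions made up for this example. The only point is that the input is a plain array of sizes ordered by age and the output is a list of indices, so no StoreFile objects are needed.

```java
import java.util.ArrayList;
import java.util.List;

public class SelectorSketch {

    /** Hypothetical interface: the selector sees only file sizes, ordered oldest to newest. */
    interface SizeBasedSelector {
        /** Returns the indices (into sizes) of the files to compact, in ascending order. */
        List<Integer> select(long[] sizes);
    }

    /**
     * Illustrative rule only, not the algorithm from the patch: pick every file
     * smaller than maxSize, but only if at least minFiles files qualify.
     */
    static class SmallFilesSelector implements SizeBasedSelector {
        private final long maxSize;
        private final int minFiles;

        SmallFilesSelector(long maxSize, int minFiles) {
            this.maxSize = maxSize;
            this.minFiles = minFiles;
        }

        @Override
        public List<Integer> select(long[] sizes) {
            List<Integer> picked = new ArrayList<>();
            for (int i = 0; i < sizes.length; i++) {
                if (sizes[i] < maxSize) {
                    picked.add(i);
                }
            }
            // Too few candidates: select nothing rather than run a tiny compaction.
            if (picked.size() < minFiles) {
                picked.clear();
            }
            return picked;
        }
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Sizes ordered by age, oldest first. Note the picked files need not be adjacent.
        long[] sizes = {2900 * mb, 82 * mb, 1300 * mb, 90 * mb, 64 * mb};
        SizeBasedSelector selector = new SmallFilesSelector(256 * mb, 3);
        System.out.println(selector.select(sizes)); // prints [1, 3, 4]
    }
}
```

With this shape a unit test is just an array literal in and a list of indices out, and mapping the indices back to the actual files is left to the caller.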


> On 2010-10-22 23:57:25, stack wrote:
> > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionSelectorHBase89.java, line 1
> > <http://review.cloudera.org/r/1078/diff/2/?file=15708#file15708line1>
> >
> >     compaction stuff deserves its own subpackage now?

yes


> On 2010-10-22 23:57:25, stack wrote:
> > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionSelector.java, line 29
> > <http://review.cloudera.org/r/1078/diff/2/?file=15707#file15707line29>
> >
> >     What is an index range?
> >     
> >     Presumption is that we compact adjacent files?  Might want to say that in javadoc?
> >

It's just used as part of the HBase89 implementation.  We don't actually always compact adjacent
files in the new algorithm.  I forget whether we're still relying on that behavior now or not.


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1078/#review1634
-----------------------------------------------------------


On 2010-10-22 23:39:07, Jonathan Gray wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/1078/
> -----------------------------------------------------------
> 
> (Updated 2010-10-22 23:39:07)
> 
> 
> Review request for hbase, stack, Nicolas, Karthik Ranganathan, and Kannan Muthukkaruppan.
> 
> 
> Summary
> -------
> 
> Pulls compaction file selection code into a new interface and makes it configurable.  It is currently globally configurable, but it should be easy to make it a per-family setting.
> 
> Also makes the algorithm standalone and testable.
> 
> Includes a new compaction algorithm based on a new config param 'compactionForce'.  See javadoc in compaction classes for explanation.
> 
> Big test included for new algorithm.
> 
> Also the TestCompact class includes a neat new way for us to compare compaction algorithms.  You specify a bunch of input parameters and then it runs a simulation and generates statistics.  The output looks like:
> 
> 
> -----
> Ran test
> -----
> numPuts=1000000
> putSizeRange=1.0KB to 10.0KB
> numPutsPerGet=10
> flushSizeRange=64.0MB to 256.0MB
> max=10, threshold=3, force=6, factor=0.5
> -----
> 
> -----
> Final Result
> -----
> files=82.2MB, 2.9GB, 898.3MB, 1.3GB
> memstoreSize=100.8MB
> totalSize=5.1GB
> totalThroughput=18.2GB
> averageFilesPerGet=3.25622
> 
> 
> This addresses bug HBASE-2462.
>     http://issues.apache.org/jira/browse/HBASE-2462
> 
> 
> Diffs
> -----
> 
>   trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java 1026565 
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionSelector.java PRE-CREATION 
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionSelectorHBase89.java PRE-CREATION 
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionSelectorWithForce.java PRE-CREATION 
>   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1026565 
>   trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompact.java PRE-CREATION 
> 
> Diff: http://review.cloudera.org/r/1078/diff
> 
> 
> Testing
> -------
> 
> TestCompact is passing.  Have not run test suite.
> 
> 
> Thanks,
> 
> Jonathan
> 
>

