hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: Scan performance on a big table as combination of multiple logic tables
Date Wed, 22 Feb 2012 05:58:35 GMT
On Tue, Feb 21, 2012 at 9:29 PM, M. C. Srivas <mcsrivas@gmail.com> wrote:
> Yes,  that was my thinking ---  to do a major compaction  the region-server
> would have to load all the flushed files for that region, merge them, and
> then write out the new region. If the region-file was 20g in size, the
> region-server would require well over 20g of heap space to do this work. Am
> I completely off?

You are a little off.  We open all the hfiles and then stream through
each of them, doing a merge sort and streaming the output to the new
compacted file.

Here is where we open a scanner on all the files to compact and then,
as we inch through, we figure out what to write to the output:

(It's a bit hard to follow what's going on -- file selection is done
already higher up in the call chain).
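
For illustration only, here is a minimal sketch of the streaming merge idea
described above. It is not the HBase compaction code: the class name
StreamingMergeSketch, the Cursor helper, and the use of plain sorted string
iterators in place of hfile scanners are all assumptions made for the
example. It also writes every entry it sees, whereas a real compaction
decides what to write as it goes. The point it tries to show is that only
one entry per open input is held in memory at a time, so memory use follows
the number of files rather than their total size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

public class StreamingMergeSketch {

    // Hypothetical stand-in for a scanner over one sorted file:
    // an iterator plus the entry it is currently positioned on.
    private static final class Cursor {
        final Iterator<String> source;
        String current;

        Cursor(Iterator<String> source) {
            this.source = source;
            this.current = source.next(); // caller guarantees hasNext()
        }

        // Move to the next entry; returns false when the input is exhausted.
        boolean advance() {
            if (source.hasNext()) {
                current = source.next();
                return true;
            }
            return false;
        }
    }

    // Merge already-sorted inputs into one sorted stream, handing each
    // entry to 'output' as soon as it is known to be the next smallest.
    static void merge(List<Iterator<String>> sortedInputs, Consumer<String> output) {
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> a.current.compareTo(b.current));

        // The heap holds at most one cursor (one entry) per input.
        for (Iterator<String> in : sortedInputs) {
            if (in.hasNext()) {
                heap.add(new Cursor(in));
            }
        }

        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();
            output.accept(smallest.current); // stream it out immediately
            if (smallest.advance()) {
                heap.add(smallest);          // re-queue at its new position
            }
        }
    }

    public static void main(String[] args) {
        List<Iterator<String>> inputs = new ArrayList<>();
        inputs.add(Arrays.asList("row1", "row4", "row7").iterator());
        inputs.add(Arrays.asList("row2", "row5").iterator());
        inputs.add(Arrays.asList("row3", "row6").iterator());

        // Here the "compacted file" is just standard output.
        merge(inputs, System.out::println);
    }
}
```

In this sketch the output is a Consumer so that entries can go straight to a
writer without being collected in a list first.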

