hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: 1 table, 1 dense CF => N tables, 1 dense CF ?
Date Fri, 09 Jan 2015 20:26:09 GMT
w.r.t. ScanType, here is the logic used by DefaultCompactor:

        ScanType scanType =
            request.isAllFiles() ? ScanType.COMPACT_DROP_DELETES
                                 : ScanType.COMPACT_RETAIN_DELETES;
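
To make that mapping concrete, here is a minimal, self-contained sketch. The enum is a local stand-in for HBase's ScanType (not the real class), and shouldMergeRows is a hypothetical helper showing how a compaction hook might branch on the value. It is an illustration under those assumptions, not HBase code.

```java
public class ScanTypeSketch {
    // Local stand-in for org.apache.hadoop.hbase.regionserver.ScanType;
    // only the two constants used by the snippet above are modelled.
    public enum ScanType { COMPACT_DROP_DELETES, COMPACT_RETAIN_DELETES }

    // Mirrors the ternary above: delete markers can only be dropped
    // when the compaction request covers all store files.
    public static ScanType scanTypeFor(boolean allFiles) {
        return allFiles ? ScanType.COMPACT_DROP_DELETES
                        : ScanType.COMPACT_RETAIN_DELETES;
    }

    // Hypothetical policy: only rewrite rows when all files are compacted.
    public static boolean shouldMergeRows(ScanType scanType) {
        return scanType == ScanType.COMPACT_DROP_DELETES;
    }

    public static void main(String[] args) {
        System.out.println(scanTypeFor(true));   // COMPACT_DROP_DELETES
        System.out.println(scanTypeFor(false));  // COMPACT_RETAIN_DELETES
        System.out.println(shouldMergeRows(scanTypeFor(false)));  // false
    }
}
```

Whether to merge only on all-files compactions or on every compaction is a policy choice; the helper just shows where that decision would sit.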

BTW, ScanType is currently marked InterfaceAudience.Private, even though
coprocessor hooks are expected to look at it.

Should it be marked LimitedPrivate?

Cheers

On Fri, Jan 9, 2015 at 12:19 PM, Gary Helmling <ghelmling@gmail.com> wrote:

> >
> > 2) is more expensive than 1).
> > I'm wondering if we could use Compaction Coprocessor for 2)?  HBaseHUT
> > needs to be able to grab N rows and merge them into 1, delete those N
> > rows, and just write that 1 new row.  This N could be several thousand
> > rows.  Could Compaction Coprocessor really be used for that?
> >
>
> It would depend on the details.  If you're simply aggregating the data into
> one row, and:
> * the thousands of rows are contiguous in the scan
> * you can somehow incrementally update or emit the new row that you want to
> create so that you don't need to retain all the old rows in memory
> * the new row you want to emit would sort sequentially into the same
> position
>
> Then overriding the scanner used for compaction could be a good solution.
> This would allow you to transform the cells emitted during compaction,
> including dropping the cells from the old rows and emitting new
> (transformed) cells for the new row.
>
> > Also, would that come into play during minor or major compactions, or
> > both?
> >
>
> You can distinguish between them in your coprocessor hooks based on
> ScanType.  So up to you.
>
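
The three conditions Gary lists (contiguous rows, incremental aggregation, merged row sorting into the same position) can be sketched as a single streaming pass. This is a self-contained model, not HBase code: Cell here is a hypothetical simplified type, the "group#suffix" row-key layout and the summing aggregate are assumptions made for illustration, and the merged key is assumed to sort where its group sat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowMergeSketch {
    // Hypothetical simplified cell: a row key and one numeric value.
    public static final class Cell {
        public final String row;
        public final long value;
        public Cell(String row, long value) { this.row = row; this.value = value; }
    }

    // Assumed key layout "group#suffix"; the merged row is keyed by "group".
    static String groupOf(String row) {
        int i = row.indexOf('#');
        return i < 0 ? row : row.substring(0, i);
    }

    // One pass over cells already in scan order. Only the running sum for
    // the current group is held, never the thousands of original rows.
    public static List<Cell> merge(Iterable<Cell> sortedCells) {
        List<Cell> out = new ArrayList<Cell>();
        String current = null;
        long sum = 0;
        for (Cell c : sortedCells) {
            String group = groupOf(c.row);
            if (current != null && !group.equals(current)) {
                out.add(new Cell(current, sum));  // emit the merged row
                sum = 0;
            }
            current = group;
            sum += c.value;  // old rows are dropped; only the aggregate survives
        }
        if (current != null) {
            out.add(new Cell(current, sum));  // flush the last group
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> in = Arrays.asList(
            new Cell("a#1", 1), new Cell("a#2", 2), new Cell("b#1", 5));
        for (Cell c : merge(in)) {
            System.out.println(c.row + "=" + c.value);  // a=3, then b=5
        }
    }
}
```

In a real coprocessor the same loop would live inside a scanner wrapping the compaction scanner, emitting cells batch by batch instead of collecting them in a list.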
