hbase-user mailing list archives

From Gary Helmling <ghelml...@gmail.com>
Subject Re: 1 table, 1 dense CF => N tables, 1 dense CF ?
Date Fri, 09 Jan 2015 21:01:34 GMT
ScanType is a parameter of RegionObserver preCompact() and
preCompactScannerOpen().  It seems like anything we are explicitly
providing to coprocessor hooks should be LimitedPrivate.
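
As a rough illustration of why a hook would care about that parameter, here is a
minimal, self-contained sketch of branching on the scan type. It does not use the
HBase classes: ScanTypeSketch is a local stand-in that models only the two
compaction values named in this thread, and shouldMergeRows() is a hypothetical
policy, not anything HBase provides.

```java
// Local stand-in for HBase's ScanType; only the two compaction values
// mentioned in this thread are modeled.
enum ScanTypeSketch { COMPACT_DROP_DELETES, COMPACT_RETAIN_DELETES }

class ScanTypeDispatchSketch {
    // Same decision as the DefaultCompactor snippet quoted in this thread:
    // a compaction over all files drops deletes, any other retains them.
    static ScanTypeSketch chooseScanType(boolean allFiles) {
        return allFiles ? ScanTypeSketch.COMPACT_DROP_DELETES
                        : ScanTypeSketch.COMPACT_RETAIN_DELETES;
    }

    // Hypothetical policy: only rewrite rows when the compaction covers all
    // files, so that every input row of a merge is guaranteed to be visible.
    static boolean shouldMergeRows(ScanTypeSketch scanType) {
        return scanType == ScanTypeSketch.COMPACT_DROP_DELETES;
    }

    public static void main(String[] args) {
        for (boolean allFiles : new boolean[] {true, false}) {
            ScanTypeSketch t = chooseScanType(allFiles);
            System.out.println(t + " -> merge rows: " + shouldMergeRows(t));
        }
    }
}
```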

On Fri, Jan 9, 2015 at 12:26 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> w.r.t. ScanType, here is the logic used by DefaultCompactor:
>
>         ScanType scanType =
>             request.isAllFiles() ? ScanType.COMPACT_DROP_DELETES :
>                 ScanType.COMPACT_RETAIN_DELETES;
>
> BTW ScanType is currently marked InterfaceAudience.Private
>
> Should it be marked LimitedPrivate ?
>
> Cheers
>
> On Fri, Jan 9, 2015 at 12:19 PM, Gary Helmling <ghelmling@gmail.com>
> wrote:
>
> > >
> > >
> > > 2) is more expensive than 1).
> > > I'm wondering if we could use Compaction Coprocessor for 2)?  HBaseHUT
> > > needs to be able to grab N rows and merge them into 1, delete those N
> > > rows, and just write that 1 new row.  This N could be several thousand
> > > rows.
> > > Could Compaction Coprocessor really be used for that?
> > >
> > >
> > It would depend on the details.  If you're simply aggregating the data
> > into one row, and:
> > * the thousands of rows are contiguous in the scan
> > * you can somehow incrementally update or emit the new row that you want
> >   to create so that you don't need to retain all the old rows in memory
> > * the new row you want to emit would sort sequentially into the same
> >   position
> >
> > Then overriding the scanner used for compaction could be a good solution.
> > This would allow you to transform the cells emitted during compaction,
> > including dropping the cells from the old rows and emitting new
> > (transformed) cells for the new row.
> >
> >
> > > Also, would that come into play during minor or major compactions or
> > > both?
> > >
> > >
> > You can distinguish between them in your coprocessor hooks based on
> > ScanType.  So up to you.
> >
>
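
To make the three conditions above concrete, here is a minimal sketch of a
wrapping scanner that collapses each contiguous run of rows into one row while
keeping only a running aggregate in memory. It is plain Java over an Iterator,
not the HBase scanner API. The "group#suffix" row-key layout, the summing
aggregate, and all names (MergingScannerSketch, groupOf, row, mergeAll) are
assumptions made for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Wraps a sorted stream of (rowKey, value) pairs and emits one merged row
// per contiguous group, holding a single running sum instead of the group.
class MergingScannerSketch implements Iterator<Map.Entry<String, Long>> {
    private final Iterator<Map.Entry<String, Long>> source;
    private Map.Entry<String, Long> pending; // first row of the next group

    MergingScannerSketch(Iterator<Map.Entry<String, Long>> source) {
        this.source = source;
        this.pending = source.hasNext() ? source.next() : null;
    }

    // Assumed key layout: everything before '#' identifies the group.
    static String groupOf(String rowKey) {
        int i = rowKey.indexOf('#');
        return i < 0 ? rowKey : rowKey.substring(0, i);
    }

    static Map.Entry<String, Long> row(String key, long value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    @Override
    public boolean hasNext() {
        return pending != null;
    }

    @Override
    public Map.Entry<String, Long> next() {
        if (pending == null) {
            throw new NoSuchElementException();
        }
        // Reuse the group's first key so the merged row sorts into the
        // same position as the rows it replaces.
        String firstKey = pending.getKey();
        String group = groupOf(firstKey);
        long sum = pending.getValue();
        pending = null;
        while (source.hasNext()) {
            Map.Entry<String, Long> e = source.next();
            if (groupOf(e.getKey()).equals(group)) {
                sum += e.getValue();   // old row is folded in and dropped
            } else {
                pending = e;           // start of the next group
                break;
            }
        }
        return row(firstKey, sum);
    }

    // Convenience helper that drains the scanner into a list.
    static List<Map.Entry<String, Long>> mergeAll(List<Map.Entry<String, Long>> rows) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = new MergingScannerSketch(rows.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

If the rows of a group were not contiguous in the scan, this approach would emit
several partial rows for that group, which is why the first condition matters.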
