hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19001) Remove the hooks in RegionObserver which are designed to construct a StoreScanner which is marked as IA.Private
Date Tue, 17 Oct 2017 01:04:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206869#comment-16206869 ]

Duo Zhang commented on HBASE-19001:

OK, the problem for Tephra is with flush and compaction. It does two things: first, it sets
the scan to read all versions; second, it adds a Filter.

I think the first one is not a problem for flush/compaction, since we always read all versions
during flush and compaction. Flush/compaction for MOB may be different, but I think that is OK:
the MOB file works like external storage.

For the filter, the code is:

  static class IncludeInProgressFilter extends FilterBase {
    private final long visibilityUpperBound;
    private final Set<Long> invalidIds;
    private final Filter txFilter;

    public IncludeInProgressFilter(long upperBound, Collection<Long> invalids,
        Filter transactionFilter) {
      this.visibilityUpperBound = upperBound;
      this.invalidIds = Sets.newHashSet(invalids);
      this.txFilter = transactionFilter;
    }

    @Override
    public ReturnCode filterKeyValue(Cell cell) throws IOException {
      // include all cells visible to in-progress transactions, except for those
      // already marked as invalid
      long ts = cell.getTimestamp();
      if (ts > visibilityUpperBound) {
        // include everything that could still be in-progress except invalids
        if (invalidIds.contains(ts)) {
          return ReturnCode.SKIP;
        }
        return ReturnCode.INCLUDE;
      }
      return txFilter.filterKeyValue(cell);
    }
  }

It only implements filterKeyValue, so I think it is easy to change it to wrap the InternalScanner
and filter the Cell list returned by InternalScanner.next. Here is an example:


  private InternalScanner wrap(InternalScanner scanner) {
    OptionalLong optExpireBefore = getExpireBefore();
    if (!optExpireBefore.isPresent()) {
      return scanner;
    }
    long expireBefore = optExpireBefore.getAsLong();
    return new DelegatingInternalScanner(scanner) {

      @Override
      public boolean next(List<Cell> result, ScannerContext scannerContext)
          throws IOException {
        boolean moreRows = scanner.next(result, scannerContext);
        result.removeIf(c -> c.getTimestamp() < expireBefore);
        return moreRows;
      }
    };
  }
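
As a rough sketch of how the Tephra filter logic above might move into such a wrapper, the
following runs on its own against stand-in types. Cell and InternalScanner here are hypothetical,
cut-down interfaces (the real InternalScanner.next also takes a ScannerContext), and txVisible is
an assumed boolean stand-in for txFilter.filterKeyValue, which really returns a ReturnCode with
more than two outcomes. A real coprocessor would have to decide how to map those other codes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.LongPredicate;

final class InProgressFilterSketch {

  // Hypothetical stand-in for org.apache.hadoop.hbase.Cell; only the timestamp matters here.
  interface Cell {
    long getTimestamp();
  }

  // Hypothetical stand-in for InternalScanner, reduced to a single next() without ScannerContext.
  interface InternalScanner {
    boolean next(List<Cell> result);
  }

  // Wraps a scanner and drops cells from each batch after the delegate has produced it.
  static InternalScanner wrap(InternalScanner delegate, long visibilityUpperBound,
      Set<Long> invalidIds, LongPredicate txVisible) {
    return result -> {
      boolean moreRows = delegate.next(result);
      result.removeIf(c -> {
        long ts = c.getTimestamp();
        if (ts > visibilityUpperBound) {
          // Could still be in progress: keep it unless it is marked invalid.
          return invalidIds.contains(ts);
        }
        // Otherwise defer to the (assumed boolean) transaction visibility check.
        return !txVisible.test(ts);
      });
      return moreRows;
    };
  }

  public static void main(String[] args) {
    // A fake source scanner that emits four cells and then reports no more rows.
    InternalScanner source = r -> {
      for (long ts : new long[] {5L, 10L, 15L, 20L}) {
        final long t = ts;
        r.add(() -> t);
      }
      return false;
    };
    Set<Long> invalid = new HashSet<>(Arrays.asList(15L));
    InternalScanner wrapped = wrap(source, 10L, invalid, ts -> ts != 5L);

    List<Cell> out = new ArrayList<>();
    wrapped.next(out);
    for (Cell c : out) {
      System.out.println(c.getTimestamp());
    }
  }
}
```

The point of this shape is that the wrapper never touches how the underlying scanner is built; it
only post-processes the Cell list, which is the approach suggested above for CP users.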


> Remove the hooks in RegionObserver which are designed to construct a StoreScanner which is marked as IA.Private
> ---------------------------------------------------------------------------------------------------------------
>                 Key: HBASE-19001
>                 URL: https://issues.apache.org/jira/browse/HBASE-19001
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0-alpha-4
>         Attachments: HBASE-19001.patch
> There are three methods here
> {code}
> KeyValueScanner preStoreScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>       Store store, Scan scan, NavigableSet<byte[]> targetCols, KeyValueScanner s,
>       long readPt) throws IOException;
> InternalScanner preFlushScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>       Store store, List<KeyValueScanner> scanners, InternalScanner s, long readPoint)
>       throws IOException;
> InternalScanner preCompactScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>       Store store, List<? extends KeyValueScanner> scanners, ScanType scanType,
>       long earliestPutTs, InternalScanner s, CompactionLifeCycleTracker tracker,
>       CompactionRequest request, long readPoint) throws IOException;
> {code}
> For the flush and compact ones, as we've discussed many times, it is not safe to let users
> inject a Filter or even implement their own InternalScanner using the store file scanners,
> as our correctness depends heavily on the complicated logic in SQM and StoreScanner. CP users
> are expected to wrap the original InternalScanner (it is a StoreScanner anyway) in the
> preFlush/preCompact methods to do filtering or something else.
> preStoreScannerOpen even returns a KeyValueScanner, which is marked as IA.Private...
> This is less harmful, but we've still decided not to expose StoreScanner to CP users, so
> this method is useless. CP users can use the preGetOp and preScannerOpen methods to modify
> the Get/Scan object passed in and so inject their logic into the scan operation.

This message was sent by Atlassian JIRA
