hbase-issues mailing list archives

From "Guanghao Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19818) Scan time limit not work if the filter always filter row key
Date Wed, 24 Jan 2018 08:28:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337123#comment-16337123 ]

Guanghao Zhang commented on HBASE-19818:
----------------------------------------

Opened HBASE-19855 to refactor the RegionScannerImpl.nextInternal method.

> Scan time limit not work if the filter always filter row key
> ------------------------------------------------------------
>
>                 Key: HBASE-19818
>                 URL: https://issues.apache.org/jira/browse/HBASE-19818
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.0.0-beta-2
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>         Attachments: HBASE-19818.master.003.patch
>
>
> See the nextInternal() method in
> [https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java]:
> {code:java}
> // Check if rowkey filter wants to exclude this row. If so, loop to next.
> // Technically, if we hit limits before on this row, we don't need this call.
> if (filterRowKey(current)) {
>   incrementCountOfRowsFilteredMetric(scannerContext);
>   // early check, see HBASE-16296
>   if (isFilterDoneInternal()) {
>     return scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
>   }
>   // Typically the count of rows scanned is incremented inside #populateResult. However,
>   // here we are filtering a row based purely on its row key, preventing us from calling
>   // #populateResult. Thus, perform the necessary increment here to rows scanned metric
>   incrementCountOfRowsScannedMetric(scannerContext);
>   boolean moreRows = nextRow(scannerContext, current);
>   if (!moreRows) {
>     return scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
>   }
>   results.clear();
>   continue;
> }
> // Ok, we are good, let's try to get some results from the main heap.
> populateResult(results, this.storeHeap, scannerContext, current);
> if (scannerContext.checkAnyLimitReached(LimitScope.BETWEEN_CELLS)) {
>   if (hasFilterRow) {
>     throw new IncompatibleFilterException(
>       "Filter whose hasFilterRow() returns true is incompatible with scans that must "
>         + " stop mid-row because of a limit. ScannerContext:" + scannerContext);
>   }
>   return true;
> }
> {code}
> If filterRowKey always returns true, the loop hits `continue` on every row and never
> reaches checkAnyLimitReached. For the batch and size limits, skipping the check is fine
> because nothing is read. For the time limit it is wrong: if the filter always filters out
> the row key, the scan gets stuck here for a long time.
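
To make the problem concrete, here is a minimal, self-contained sketch of the loop shape described above. It is not HBase code and not the attached patch: TimeLimitSketch, Outcome, the fake tick-based clock, and the checkTimeWhenFiltered flag are all hypothetical names made up for illustration. It only shows that a loop which checks its deadline solely on the "row accepted" path never stops early when every row key is filtered, and that one possible remedy is to also check the deadline on the filtered path.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

// Simplified model of a scan loop. None of these names are HBase APIs.
final class TimeLimitSketch {

    enum State { NO_MORE_VALUES, TIME_LIMIT_REACHED, ROW_RETURNED }

    // Final state plus the number of rows looked at before returning.
    static final class Outcome {
        final State state;
        final int rowsExamined;

        Outcome(State state, int rowsExamined) {
            this.state = state;
            this.rowsExamined = rowsExamined;
        }
    }

    // filterRowKey returns true when a row must be excluded, mirroring the
    // convention of the quoted code. clock and deadline use arbitrary ticks.
    static Outcome next(String[] rows, Predicate<String> filterRowKey,
                        LongSupplier clock, long deadline,
                        boolean checkTimeWhenFiltered) {
        int examined = 0;
        for (String row : rows) {
            examined++;
            if (filterRowKey.test(row)) {
                // No cells were read, so batch and size limits cannot have
                // moved. Time still passes, so (as an assumed remedy) the
                // deadline is checked here before moving to the next row.
                if (checkTimeWhenFiltered && clock.getAsLong() >= deadline) {
                    return new Outcome(State.TIME_LIMIT_REACHED, examined);
                }
                continue;
            }
            // The row passed the key filter. Real code would populate the
            // results and check all limits here; the sketch just returns.
            return new Outcome(State.ROW_RETURNED, examined);
        }
        return new Outcome(State.NO_MORE_VALUES, examined);
    }

    public static void main(String[] args) {
        String[] rows = new String[10];
        Arrays.fill(rows, "row");

        // Fake clock: every read advances time by one tick.
        long[] t1 = {0};
        Outcome stuck = next(rows, key -> true, () -> t1[0]++, 3, false);
        System.out.println(stuck.state + " after " + stuck.rowsExamined + " rows");

        long[] t2 = {0};
        Outcome bounded = next(rows, key -> true, () -> t2[0]++, 3, true);
        System.out.println(bounded.state + " after " + bounded.rowsExamined + " rows");
    }
}
```

With the flag off, the call walks all ten rows before it can return, which is the "stuck for a long time" behaviour in miniature. With the flag on, it returns as soon as the fake clock reaches the deadline.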



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
