Date: Fri, 29 Jul 2016 18:46:20 +0000 (UTC)
From: "Samarth Jain (JIRA)"
To: dev@phoenix.incubator.apache.org
Subject: [jira] [Comment Edited] (PHOENIX-3121) Queries with filter and reverse scan failing when limit is a multiple of scanner cache size

    [ https://issues.apache.org/jira/browse/PHOENIX-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399837#comment-15399837 ]

Samarth Jain edited comment on PHOENIX-3121 at 7/29/16 6:46 PM:
----------------------------------------------------------------

I was slightly wrong. The cache size matters even when the offset is computed on the server side, because HBase internally uses that cache size as the number of records to fetch within the server-side scanners.
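For reference, a minimal sketch of where the cache size comes into play, assuming a raw HBase client scan against the underlying table (the table name and class are illustrative only, not Phoenix's actual wiring):

{code}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ScannerCacheSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("PLATFORM_ENTITY.ENGAGEMENT_HISTORY_POC"))) {
            Scan scan = new Scan();
            scan.setReversed(true); // reverse scan, as in the ORDER BY ... DESC query
            scan.setCaching(5);     // scanner cache size = rows the region server gathers per RPC
                                    // by repeatedly calling the region scanner before replying
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    // every 5th row here triggers a new RPC, i.e. a fresh batch on the server side
                    System.out.println(Bytes.toString(row.getRow()));
                }
            }
        }
    }
}
{code}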
It is the regionScanner.nextRaw() call on the server side that takes a long time when the cache size equals the page filter limit. Having said that, the change to account for limit + offset doesn't help:

{code}
int scannerCacheSize = context.getStatement().getFetchSize();
Integer perScanLimit = QueryUtil.getOffsetLimit(limit, offset);
if (perScanLimit != null && perScanLimit % scannerCacheSize == 0) {
    scan.setCaching(scannerCacheSize + 1);
}
{code}

{code}
select * from PLATFORM_ENTITY.engagement_history_poc
where who_id = '00Qx0000001S2qa' and organization_id = '00Dx0000000GyYS'
order by activity_date desc LIMIT 8 OFFSET 1
{code}

Record 1 Time: 15
Record 2 Time: 0
Record 3 Time: 0
Record 4 Time: 0
Record 5 Time: 44021
Record 6 Time: 1
Record 7 Time: 0
Record 8 Time: 0

The perScanLimit of 9 in this case isn't a multiple of the cache size of 5, so the hack isn't even getting used. Switching to LIMIT 8 OFFSET 2, the hack gets used. This time it is the 6th record that hits the reverse scan bug:

Record 1 Time: 18
Record 2 Time: 0
Record 3 Time: 0
Record 4 Time: 0
Record 5 Time: 0
Record 6 Time: 44375
Record 7 Time: 0
Record 8 Time: 0

I noticed that my overall scan limit was more than the cache size, so I changed the code to do this:

{code}
scan.setCaching(Math.max(perScanLimit, scannerCacheSize) + 1);
{code}

Now, for LIMIT 8 OFFSET 2 and cache size 5, it is fetching the first record (technically the third on the server side) that hits the bug:

Record 1 Time: 44300
Record 2 Time: 0
Record 3 Time: 0
Record 4 Time: 0
Record 5 Time: 0
Record 6 Time: 0
Record 7 Time: 0
Record 8 Time: 1
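For completeness, the per-record times above came from a simple client-side loop along these lines (a rough sketch of the measurement, not the actual test code; the JDBC URL is a placeholder):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchTimer {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(5); // the scanner cache size used in the runs above
            ResultSet rs = stmt.executeQuery(
                "select * from PLATFORM_ENTITY.engagement_history_poc "
              + "where who_id = '00Qx0000001S2qa' and organization_id = '00Dx0000000GyYS' "
              + "order by activity_date desc LIMIT 8 OFFSET 2");
            int record = 1;
            long start = System.currentTimeMillis();
            while (rs.next()) { // time spent here is dominated by the next scanner RPC
                long now = System.currentTimeMillis();
                System.out.println("Record " + record++ + " Time: " + (now - start));
                start = now;
            }
        }
    }
}
{code}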
> Queries with filter and reverse scan failing when limit is a multiple of scanner cache size
> -------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3121
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3121
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>             Fix For: 4.8.0
>
>         Attachments: PHOENIX-3121.patch, PHOENIX-3121_v2.patch
>
>
> {code}
> org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Unknown scanner '644116'. This can happen due to any of the following reasons: a) Scanner id given is wrong, b) Scanner lease expired because of long wait between consecutive client checkins, c) Server may be closing down, d) RegionServer restart during upgrade.
> If the issue is due to reason (b), a possible fix would be increasing the value of 'hbase.client.scanner.timeout.period' configuration.
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3228)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32492)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2208)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)