Date: Thu, 2 Nov 2017 18:20:01 +0000 (UTC)
From: "James Taylor (JIRA)"
To: dev@phoenix.apache.org
Subject: [jira] [Comment Edited] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view

    [ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236115#comment-16236115 ]

James Taylor edited comment on PHOENIX-4333 at 11/2/17 6:19 PM:
----------------------------------------------------------------

We really want to answer the question "Is there a guidepost within every region?". Whether a guidepost then intersects the scan is not the check we need. For example, you may have a query doing a skip scan which would fail the intersection test, but still have a guidepost in the region. I think if you always set the endRegionKey (instead of only when it's a local index) here before the inner loop:
{code}
endRegionKey = regionInfo.getEndKey();
if (isLocalIndex) {
{code}
and then after the inner loop, check that we set currentKeyBytes (which means we entered the loop) or that the currentGuidePost is less than the region end key, then that's enough, since we know that the currentGuidePost is already bigger than the start region key.
The check for endKey == stopKey is a small optimization, since we don't need to do the key comparison again if that's not the case since we've already done it as we entered the loop (see comment below).
{code}
// We have a guide post in previous region if the above loop was entered
// or if the current key is less than the region end key (since the loop
// may not have been entered if our scan end key is smaller than the first
// guide post in that region
gpsAvailableForAllRegions &= currentKeyBytes != initialKeyBytes || (endKey == stopKey && (endRegionKey.length == 0 || currentGuidePost.compareTo(endRegionKey) < 0));
{code}
Does this not pass all of your tests?


was (Author: jamestaylor):
We really want to answer the question "Is there a guidepost within every region?". Whether a guidepost then intersects the scan is not the check we need. For example, you may have a query doing a skip scan which would fail the intersection test, but still have a guidepost in the region. I think if you always set the endRegionKey (instead of only when it's a local index) here before the inner loop:
{code}
endRegionKey = regionInfo.getEndKey();
if (isLocalIndex) {
{code}
and then after the inner loop, check that we set currentKeyBytes (which means we entered the loop) or that the currentGuidePost is less than the region end key, then that's enough, since we know that the currentGuidePost is already bigger than the start region key. The check for endKey == stopKey is a small optimization, since we don't need to do the key comparison again if that's not the case since we've already done it as we entered the loop (see comment below).
{code}
// We have a guide post in previous region if the above loop was entered
// or if the current key is less than the region end key (since the loop
// may not have been entered if our scan end key is smaller than the first
// guide post in that region
hasGuidePostInAllRegions &= currentKeyBytes != initialKeyBytes || (endKey == stopKey && currentGuidePost.compareTo(endRegionKey) < 0);
{code}
Does this not pass all of your tests?

> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---------------------------------------------------------------------------
>
>                 Key: PHOENIX-4333
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4333
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>            Reporter: Mujtaba Chohan
>            Assignee: Samarth Jain
>            Priority: Major
>         Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view yields partial results (only contains stats for B,1), which are incorrect even though the updated timestamp is shown as current.
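
To make the question in the comment concrete ("Is there a guidepost within every region?"), here is a minimal, standalone Java sketch. It is not the Phoenix implementation: the class name, the `key` helper, the split point at `B,2`, and the guidepost positions are all assumptions chosen to mirror the two-region example in the issue description. It assumes regions are contiguous and listed in key order, that an empty end key marks the last region, and that guideposts are sorted ascending.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Phoenix code: checks that each region holds a guidepost.
final class GuidePostCoverageSketch {

    // Hypothetical helper: encode a readable key such as "B,2" as bytes.
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Returns true only if every region contains at least one guidepost.
     *
     * Assumptions: regionEndKeys describes contiguous regions in key order,
     * an empty end key means "no upper bound" (last region), and guidePosts
     * is sorted in unsigned lexicographic byte order.
     */
    static boolean hasGuidePostInEveryRegion(List<byte[]> regionEndKeys,
                                             List<byte[]> guidePosts) {
        int next = 0; // index of the first guidepost not yet assigned to a region
        for (byte[] endKey : regionEndKeys) {
            boolean entered = false;
            // Consume guideposts that sort strictly before this region's end key.
            // Everything before the region's start key was consumed by earlier
            // iterations, so any guidepost seen here lies inside the region.
            while (next < guidePosts.size()
                    && (endKey.length == 0
                        || Arrays.compareUnsigned(guidePosts.get(next), endKey) < 0)) {
                entered = true;
                next++;
            }
            if (!entered) {
                return false; // this region has no guidepost
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two regions split at "B,2" (assumed split point): ["", "B,2") and ["B,2", "").
        List<byte[]> regionEndKeys = List.of(key("B,2"), new byte[0]);

        // Guidepost only in region 1, e.g. after stats were updated for tenant A only.
        System.out.println(hasGuidePostInEveryRegion(regionEndKeys, List.of(key("A,2"))));

        // Guideposts in both regions.
        System.out.println(hasGuidePostInEveryRegion(
                regionEndKeys, List.of(key("A,2"), key("B,3"))));
    }
}
```

The first call prints `false` and the second prints `true`. Note that the check never looks at scan boundaries, which is the point made in the comment: a skip scan can miss a guidepost that is nevertheless present in the region.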