Date: Wed, 4 Oct 2017 21:08:02 +0000 (UTC)
From: "James Taylor (JIRA)"
To: dev@phoenix.apache.org
Reply-To: dev@phoenix.apache.org
Subject: [jira] [Created] (PHOENIX-4277) Use raw scans for IndexScrutinyTool

James Taylor created PHOENIX-4277:
-------------------------------------

             Summary: Use raw scans for IndexScrutinyTool
                 Key: PHOENIX-4277
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4277
             Project: Phoenix
          Issue Type: Bug
            Reporter: James Taylor
            Assignee: Vincent Poon

The IndexScrutinyTool relies on doing point-in-time scans to determine consistency between the index and data tables. Unfortunately, deletes to the tables cause a problem with this approach, since delete markers take effect even if they're at a later timestamp than the point in time at which the scan is being done (unless KEEP_DELETED_CELLS is true). The rationale is that a scan should return the same results before and after a compaction takes place. Taking snapshots does not help with this, since they cannot be taken at a point in time and the delete markers will act the same way: there's no way to guarantee that the index and data table snapshots have the same "logical" set of data.

Using raw scans would allow us to see the delete markers and do the correct point-in-time filtering ourselves.
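
To make the delete-marker problem concrete, here is a minimal sketch in plain Java. It does not use the HBase or Phoenix APIs: RawCell, regularScanAsOf and rawFilteredScanAsOf are hypothetical names, and the model is deliberately simplified to one value per row with row-level delete markers. It only illustrates the difference between letting every delete marker apply and ignoring markers newer than the scan time. In HBase itself, a raw scan would presumably be requested with Scan.setRaw(true) and the filtering done in a custom filter.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PointInTimeScan {
    /** Hypothetical, simplified stand-in for a raw cell: a put or a row-level delete marker. */
    static final class RawCell {
        final String row;
        final long ts;
        final boolean delete;
        final String value;

        private RawCell(String row, long ts, boolean delete, String value) {
            this.row = row;
            this.ts = ts;
            this.delete = delete;
            this.value = value;
        }

        static RawCell put(String row, long ts, String value) {
            return new RawCell(row, ts, false, value);
        }

        static RawCell delete(String row, long ts) {
            return new RawCell(row, ts, true, null);
        }
    }

    /** Models the problem described above: delete markers newer than asOfTs still mask puts. */
    static Map<String, String> regularScanAsOf(List<RawCell> cells, long asOfTs) {
        return scan(cells, asOfTs, Long.MAX_VALUE);
    }

    /** Models the proposed fix: see all raw cells, but ignore delete markers newer than asOfTs. */
    static Map<String, String> rawFilteredScanAsOf(List<RawCell> cells, long asOfTs) {
        return scan(cells, asOfTs, asOfTs);
    }

    /** Returns row -> value of the newest put at or before asOfTs that no eligible marker masks. */
    private static Map<String, String> scan(List<RawCell> cells, long asOfTs, long markerCutoffTs) {
        Map<String, Long> newestTs = new TreeMap<>();
        Map<String, String> visible = new TreeMap<>();
        for (RawCell c : cells) {
            if (c.delete || c.ts > asOfTs || isMasked(cells, c, markerCutoffTs)) {
                continue;
            }
            Long seen = newestTs.get(c.row);
            if (seen == null || c.ts > seen) {
                newestTs.put(c.row, c.ts);
                visible.put(c.row, c.value);
            }
        }
        return visible;
    }

    /** A put is masked by a marker on the same row at the same or a later timestamp, up to the cutoff. */
    private static boolean isMasked(List<RawCell> cells, RawCell put, long markerCutoffTs) {
        for (RawCell d : cells) {
            if (d.delete && d.row.equals(put.row) && d.ts >= put.ts && d.ts <= markerCutoffTs) {
                return true;
            }
        }
        return false;
    }
}
```

With a put on row r1 at timestamp 10 and a delete marker on r1 at timestamp 20, regularScanAsOf(cells, 15) omits r1 while rawFilteredScanAsOf(cells, 15) still returns it. At timestamp 25 the two agree, which is the point-in-time behavior the scrutiny tool needs on both the index and data tables.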
We'd need to write the filters to do this correctly (see the Tephra TransactionVisibilityFilter for an implementation of this that could be adapted). We'd also need to hook this into Phoenix or potentially dip down to the HBase level to do this.

Thanks for brainstorming on this with me, [~lhofhansl].

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)