Mailing-List: contact dev-help@phoenix.apache.org; run by ezmlm
Reply-To: dev@phoenix.apache.org
Date: Wed, 10 May 2017 19:20:04 +0000 (UTC)
From: "James Taylor (JIRA)"
To: dev@phoenix.apache.org
Subject: [jira] [Comment Edited] (PHOENIX-3811) Do not disable index on write failure by default

    [ https://issues.apache.org/jira/browse/PHOENIX-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005218#comment-16005218 ]

James Taylor edited comment on PHOENIX-3811 at 5/10/17 7:19 PM:
----------------------------------------------------------------

High level summary of changes: turns off automatic index rebuilding by default, leaves indexes active upon a write failure, and provides a means for users to replay a commit after a write failure to ensure the index is consistent with the data table.

Would you have any spare cycles to review, [~tdsilva]?

Here's some more detail on the changes:
* Turns off the background partial index rebuild/catchup task by default for a table. The reason is that users will typically have some kind of retry strategy themselves (for example, a message queue that retries). They need this because, when a commit exception occurs, some of the data rows may have been written while others will not have been (regardless of what state the index is in wrt the data table).
Whatever retry mechanism is in use, these retries will also get the index back in sync (see below for a new mechanism for mutable tables).
* Provides a means for the client to retry a commit at the timestamp at which it was originally submitted. This is important for mutable data, as otherwise the retried commits may overwrite successful commits that occurred later. This is accomplished by a) including the server timestamp at which the data rows were (or would have been) committed in CommitException and b) adding a new connection property, {{PhoenixRuntime.REPLAY_AT_ATTRIB}}, which specifies a timestamp at which the commit will occur and tells the system to ignore later data updates (to ensure your index remains in sync with your data table).
* Provides an option (the default) to keep an index active even after a write failure occurs. Many use cases are essentially down without the secondary index in place and would rather the index be behind by a few rows wrt the data table while the retries are occurring. This option is configurable globally with the {{QueryServices.INDEX_FAILURE_DISABLE_INDEX}} config property and on a table-by-table basis through the {{PhoenixIndexFailurePolicy.DISABLE_INDEX_ON_WRITE_FAILURE}} table descriptor property.
* Provides an option to turn on the partial rebuild index task on a table-by-table basis (false by default). This option is now orthogonal to whether an index remains active or is disabled (i.e. the index can remain active *and* be partially rebuilt/caught up in the background). Note that if the existing global {{PhoenixIndexFailurePolicy.INDEX_FAILURE_HANDLING_REBUILD_ATTRIB}} config property is false, then the background thread won't run, so the table property won't matter. By default, the global property is true while the table-by-table property is false, which lets the user turn the automatic rebuild on for a particular table.
* Lowers the default frequency at which we look for indexes which need to be partially rebuilt from every 10 seconds to once per minute.
* Fixes MutableIndexFailureIT test failures and adds more tests for the above new options.

FYI, [~lhofhansl], [~apurtell], [~mvanwely].


> Do not disable index on write failure by default
> ------------------------------------------------
>
>                 Key: PHOENIX-3811
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3811
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>             Fix For: 4.11.0
>
>         Attachments: PHOENIX-3811_v1.patch, PHOENIX-3811_v2.patch, PHOENIX-3811-wip1.patch, PHOENIX-3811-wip2.patch, PHOENIX-3811-wip3.patch, PHOENIX-3811-wip4.patch, PHOENIX-3811-wip5.patch, PHOENIX-3811-wip7.patch
>
>
> We should provide a way to configure the system so that the server takes no specific action when an index write fails.
> Since we always throw the write failure back to the client, the client can often deal with failures more easily than the server since it has the batch of mutations in memory. Often, allowing access to an index that may be one batch behind the data table is better than disabling it, given the performance degradation that will occur while the index cannot be written to.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
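
The comment's point about replaying a failed commit at its original timestamp can be illustrated with a toy model. The sketch below is not Phoenix or HBase code: the class `ReplayAtSketch`, its `Cell` type, and the methods `upsertAt` and `get` are hypothetical names invented for illustration, and the store is a plain in-memory map. It assumes only that each write carries a timestamp and that a write older than what is already stored does not take effect, which is the property that keeps a replayed batch from overwriting a later successful commit.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical names, no Phoenix or HBase APIs involved.
final class ReplayAtSketch {

    // One stored value together with the timestamp it was written at.
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // In-memory stand-in for a data table, keyed by row key.
    private final Map<String, Cell> rows = new HashMap<>();

    // Applies the write unless a strictly newer write is already stored.
    // Returns true if the write took effect.
    public boolean upsertAt(String rowKey, String value, long timestamp) {
        Cell existing = rows.get(rowKey);
        if (existing != null && existing.timestamp > timestamp) {
            return false; // a later commit already won; the replay is ignored
        }
        rows.put(rowKey, new Cell(value, timestamp));
        return true;
    }

    // Returns the current value for the row, or null if the row is absent.
    public String get(String rowKey) {
        Cell cell = rows.get(rowKey);
        return cell == null ? null : cell.value;
    }

    public static void main(String[] args) {
        ReplayAtSketch table = new ReplayAtSketch();
        long failedCommitTs = 100L; // timestamp reported with the failed commit

        // The original batch partially succeeded: k1 was written, k2 was not.
        table.upsertAt("k1", "v1", failedCommitTs);

        // A later, unrelated commit succeeds and writes k2.
        table.upsertAt("k2", "newer", 200L);

        // The client replays the whole batch at the original timestamp.
        table.upsertAt("k1", "v1", failedCommitTs);     // same timestamp, same value
        table.upsertAt("k2", "stale", failedCommitTs);  // older than 200, ignored

        System.out.println(table.get("k1")); // prints "v1"
        System.out.println(table.get("k2")); // prints "newer"
    }
}
```

Had the replay instead been stamped with the current time, the `k2` write would have replaced `newer` with `stale`, which is the overwrite the comment warns about for mutable data.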