From: Will Holley
Date: Thu, 26 Mar 2020 09:04:13 +0000
Subject: Re: [DISCUSS] Mango indexes on FDB
To: dev@couchdb.apache.org

Broadly, I think it's a big step forward if we can prevent Mango from
automatically selecting extremely stale indexes.

I've been going back and forth on whether step 3 could lead to some
difficult-to-predict behaviour. If we assume that requests have a short
timeout - e.g. we can't return any result if it doesn't complete within
the FDB transaction timeout - then I think it's fine: queries that use
_all_docs and a large database will be timing out anyway.

If we were to allow long-running queries then it seems a bit sketchier
because adding an index to a large database could cause queries that
previously completed to start timing out whilst they block on the index
build. This is basically how Mango in CouchDB 2/3 behaves and has been a
big pain point for customers I've worked with, to the point where you
basically need to explicitly specify which index Mango uses in all cases
if you're to avoid surprise timeouts when somebody adds a new index.

As I understand it, we're not allowing queries to span FDB transactions
so this latter case is not something to worry about?

Cheers,

Will

On Wed, 25 Mar 2020 at 19:43, Garren Smith wrote:

> On Wed, Mar 25, 2020 at 8:35 PM Paul Davis wrote:
>
> > > It was therefore felt that having an immediate "Not ready" signal for
> > > just _some_ calls to _find, based on the type of backing index, was a bad
> > > and confusing api.
> > >
> > > We also discussed _find calls where the user does not specify an index,
> > > and concluded that we would be free to choose between using the _all_docs
> > > index (which is always up to date but rarely the best index for a given
> > > selector) or blocking to update a better but stale index.
> > >
> > > Summary-ing my summarisation;
> > >
> > > 1) if you specify an index, we'll use it even if we have to update it,
> > > no matter how long that takes.
> > > 2) if you don't specify an index, it's the dealer's choice. The details
> > > here may change in point releases.
> > >
> >
> > So it seems there's still a bit of confusion on what the consensus is
> > here. The way that I had thought this would work is that we'd do
> > something like such:
> >
> > 1. If user specifies an index, use it even if we have to wait
> > 2. If an index is built that can be used, use it
> > 3. If an index is building that can be used, wait for it
> > 4. As a last resort use _all_docs
> >
> > Discussing with Garren on the PR he's of the opinion that we should
> > skip step 3 and just go directly to using _all_docs if nothing is
> > built.
> >
>
> I just want to clarify step 3. I'm ok with using an index that still needs
> to be built as long as there is no other built index that can service
> the request.
>
> So the big thing for me is to always prefer a built index over a building
> index. In the situation where there is only 1 building index versus all
> docs I'm ok with using the building index.
>
> > My main assumption is that most cases where a user is creating an
> > index and then wanting to run a query with it are in the
> > design/exploration phase of learning the feature or designing an index
> > to use. In that scenario if we skip waiting it seems likely that a
> > user could easily be led to believe that an index creation "worked"
> > for their selector when in reality it was just backed by _all_docs.
> >
> > The other reason for preferring to wait for an index to finish
> > building is that the UI for the normal case of creating indexes is a
> > bit awkward. Having to run a polling loop around checking the index
> > status seems suboptimal in most cases.
> >
> > Am I missing other cases that would benefit from not waiting and just
> > using _all_docs?
> >
> > Paul
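
For reference, the four-step selection order from Paul's list in the
quoted thread, with a built index always preferred over a building one
as Garren asks, could be sketched roughly as below. This is only an
illustration in Python, not CouchDB's Erlang implementation: `Index`,
`choose_index`, and the `built`/`usable` flags are hypothetical names
made up for this sketch, and it assumes the caller already knows which
indexes can service the selector.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

ALL_DOCS = "_all_docs"


@dataclass(frozen=True)
class Index:
    """Hypothetical view of a Mango index, for illustration only."""
    name: str
    built: bool   # True once the index is up to date with the database
    usable: bool  # True if the index can service the query's selector


def choose_index(indexes: Sequence[Index],
                 requested: Optional[str] = None) -> Tuple[str, bool]:
    """Return (index_name, must_wait) following the four steps in the thread."""
    # Step 1: an explicitly requested index is used even if we have to wait.
    if requested is not None:
        for idx in indexes:
            if idx.name == requested:
                return idx.name, not idx.built
        raise LookupError("no such index: " + requested)

    candidates = [idx for idx in indexes if idx.usable]

    # Step 2: prefer any usable index that is already built.
    for idx in candidates:
        if idx.built:
            return idx.name, False

    # Step 3: otherwise wait for a usable index that is still building.
    if candidates:
        return candidates[0].name, True

    # Step 4: as a last resort fall back to _all_docs, which is always current.
    return ALL_DOCS, False
```

Skipping step 3, the alternative discussed on the PR, would amount to
deleting the `if candidates:` branch so that a query with no built index
goes straight to `_all_docs`.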