From: Paul Davis
Date: Thu, 26 Mar 2020 11:59:12 -0500
Subject: Re: [DISCUSS] Mango indexes on FDB
To: dev@couchdb.apache.org

On Thu, Mar 26, 2020 at 5:33 AM Will Holley wrote:
>
> Ah - in that case I think we should remove step 3, as it leads to a
> confusing mental model. It's much simpler to explain that Mango will only
> use fresh indexes and any new indexes will build in the background.
>

Simpler in some respects. The trade-off is that we then have to teach
users how to tell when an index is built, and they also need to be aware
that different index types will have different ideas of what "built"
means.

> On Thu, 26 Mar 2020 at 10:15, Garren Smith wrote:
>
> > On Thu, Mar 26, 2020 at 11:04 AM Will Holley wrote:
> >
> > > Broadly, I think it's a big step forward if we can prevent Mango from
> > > automatically selecting extremely stale indexes.
> > >
> > > I've been going back and forth on whether step 3 could lead to some
> > > difficult-to-predict behaviour. If we assume that requests have a short
> > > timeout - e.g. we can't return any result if it doesn't complete within
> > > the FDB transaction timeout - then I think it's fine: queries that use
> > > _all_docs and a large database will be timing out anyway.
> > >
> > > If we were to allow long-running queries then it seems a bit sketchier
> > > because adding an index to a large database could cause queries that
> > > previously completed to start timing out whilst they block on the index
> > > build. This is basically how Mango in CouchDB 2/3 behaves and has been a
> > > big pain point for customers I've worked with, to the point where you
> > > basically need to explicitly specify which index Mango uses in all cases
> > > if you're to avoid surprise timeouts when somebody adds a new index.
> > >
> > > As I understand it, we're not allowing queries to span FDB transactions
> > > so this latter case is not something to worry about?
> > >
> >
> > We are going to allow queries to span transactions. This is already
> > implemented for views and will be for mango
> >
> > >
> > > Cheers,
> > >
> > > Will
> > >
> > > On Wed, 25 Mar 2020 at 19:43, Garren Smith wrote:
> > >
> > > > On Wed, Mar 25, 2020 at 8:35 PM Paul Davis <paul.joseph.davis@gmail.com>
> > > > wrote:
> > > >
> > > > > > It was therefore felt that having an immediate "Not ready" signal
> > > > > > for just _some_ calls to _find, based on the type of backing index,
> > > > > > was a bad and confusing api.
> > > > > >
> > > > > > We also discussed _find calls where the user does not specify an
> > > > > > index, and concluded that we would be free to choose between using
> > > > > > the _all_docs index (which is always up to date but rarely the best
> > > > > > index for a given selector) or blocking to update a better but stale
> > > > > > index.
> > > > > >
> > > > > > Summary-ing my summarisation;
> > > > > >
> > > > > > 1) if you specify an index, we'll use it even if we have to update
> > > > > > it, no matter how long that takes.
> > > > > > 2) if you don't specify an index, it's the dealer's choice. The
> > > > > > details here may change in point releases.
> > > > > >
> > > > >
> > > > > So it seems there's still a bit of confusion on what the consensus is
> > > > > here. The way that I had thought this would work is that we'd do
> > > > > something like such:
> > > > >
> > > > > 1. If user specifies an index, use it even if we have to wait
> > > > > 2. If an index is built that can be used, use it
> > > > > 3. If an index is building that can be used, wait for it
> > > > > 4. As a last resort use _all_docs
> > > > >
> > > > > Discussing with Garren on the PR he's of the opinion that we should
> > > > > skip step 3 and just go directly to using _all_docs if nothing is
> > > > > built.
> > > > >
> > > >
> > > > I just want to clarify step 3.
> > > > I'm ok with using an index that still needs to be built as long as
> > > > there is no other built index that can service the request.
> > > >
> > > > So the big thing for me is to always prefer a built index over a
> > > > building index. In the situation where there is only 1 building index
> > > > versus all docs I'm ok with using the building index.
> > > >
> > > > >
> > > > > My main assumption is that most cases where a user is creating an
> > > > > index and then wanting to run a query with it are in the
> > > > > design/exploration phase of learning the feature or designing an
> > > > > index to use. In that scenario if we skip waiting it seems likely
> > > > > that a user could easily be led to believe that an index creation
> > > > > "worked" for their selector when in reality it was just backed by
> > > > > _all_docs.
> > > > >
> > > > > The other reason for preferring to wait for an index to finish
> > > > > building is that the UI for the normal case of creating indexes is a
> > > > > bit awkward. Having to run a polling loop around checking the index
> > > > > status seems suboptimal in most cases.
> > > > >
> > > > > Am I missing other cases that would benefit from not waiting and just
> > > > > using _all_docs?
> > > > >
> > > > > Paul
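
For reference, the selection order discussed in this thread (steps 1-4)
can be sketched roughly as below. This is only an illustration in Python,
not CouchDB code: the names Index, IndexState, choose_index and the
wait_for_building flag are all hypothetical, and "usable" stands in for
whatever check decides that an index can service a selector. Setting
wait_for_building=False gives the variant that skips step 3 and falls
straight back to _all_docs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class IndexState(Enum):
    BUILT = "built"
    BUILDING = "building"


@dataclass(frozen=True)
class Index:
    name: str
    state: IndexState
    usable: bool  # hypothetical: True if the index can service the selector


# _all_docs is always up to date, so it is modelled as a built index.
ALL_DOCS = Index(name="_all_docs", state=IndexState.BUILT, usable=True)


def choose_index(
    candidates: Sequence[Index],
    requested: Optional[str] = None,
    wait_for_building: bool = True,
) -> Index:
    """Pick an index for a query following steps 1-4 from the thread."""
    # Step 1: an explicitly requested index is used even if it is still
    # building (the caller would then have to wait for it).
    if requested is not None:
        for index in candidates:
            if index.name == requested:
                return index
        raise LookupError(f"no such index: {requested}")

    usable = [index for index in candidates if index.usable]

    # Step 2: always prefer a built index over a building one.
    for index in usable:
        if index.state is IndexState.BUILT:
            return index

    # Step 3 (the contested step): wait for a usable index that is building.
    if wait_for_building:
        for index in usable:
            if index.state is IndexState.BUILDING:
                return index

    # Step 4: last resort.
    return ALL_DOCS
```

With one built and one building index, the built one wins regardless of
order; with only a building index, the result depends on
wait_for_building, which is exactly the point under discussion.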