From: Tony Sun
To: dev@couchdb.apache.org
Subject: Re: [POC] Mango Catch All Selector
Date: Mon, 4 Jan 2016 12:55:28 -0800

Hi all,

Hope everyone enjoyed the holidays! This is the most common Mango experience for new users:

1) Syntax issues when creating an index.
2) Running into the "no index found" error because their query (with or without sort) doesn't match the index correctly.
3) We explain how views work and also suggest our all_docs hack.
4) Then the user complains that their query is slow (due to all_docs or a large result set), and again we try to either optimize the index or suggest using text indexes (the new open-sourced feature).

A lot of users are turned off by the usability issues encountered in 1) and 2). I agree that we should make it as easy as possible for first-time users, so I am okay with removing the need to create an index first.
However, we need to somehow explicitly let the user know about all_docs so they don't abuse this capability. Also, like MongoDB, we could internally check whether the current index is an all_docs index and throw a timeout/size error for a particular query?

Thanks,
Tony

On Mon, Jan 4, 2016 at 11:49 AM, Sebastian Rothbucher <sebastianrothbucher@googlemail.com> wrote:

> Hi Robert,
>
> I'm with you that the easier we can make it for s/o to get started the
> better.
> And I think falling back to a full table scan with a log written is a good
> and easy way to go. I'd even set the log level to info or even warning to
> make it clear that there's a problem with huge data sets. And hopefully,
> people run some load test before going into production ;-)
>
> The only other idea I had (a button "use default index" in Fauxton that
> modifies the selector) looks daft on second thought
>
> - I do like your idea though
>
> Best
> Sebastian
>
>
> On Mon, Jan 4, 2016 at 8:04 PM, Paul Davis wrote:
>
> > Hey all,
> >
> > I meant to reply to the ticket on pouchdb-find but got distracted by
> > the holidays.
> >
> > I wanted to note that the original motivation for rejecting a selector
> > that doesn't have an index was to avoid the specific situation where a
> > user has a query that appears to run quite quickly in testing/dev but
> > fails or results in timeouts in production due to a different data
> > set. This was definitely a deviation from the MongoDB approach. The
> > last I read their docs on this they mentioned in a couple places that
> > while an index is not required there are limits on result set sizes
> > and (I think?) query time. I made the choice that rather than fail
> > eventually to fail quickly and hopefully be descriptive of why the
> > query failed. For instance, there should be a note in the error
> > response when no index is available that describes which fields could
> > be indexed to satisfy the query.
> >
> > On the other hand, once we had users actually playing with this
> > feature there were quite a few instances of, "I just want to try this
> > query without waiting for an index to build." and I made the clever
> > suggestion that just adding the {"$and": [Query, {"_id": {"$gt":
> > null}}]} wrapper would cause a full table scan. That's obviously a
> > hack and I was fine with that because it seemed like an obvious hack
> > that would motivate users to create the appropriate index before
> > moving to production.
> >
> > On the flip side it seems like for some people the hack is a hurdle
> > into learning the query capabilities as well as adding to the overhead
> > of learning CouchDB in general. And this particular feature was aimed
> > directly at providing an easier on-ramp to CouchDB for people coming
> > from other databases. Given what I've read here and elsewhere perhaps
> > what might be easiest would be to add a feature along the lines of
> > "developing": "true" to the _find request body that would enable the
> > _all_docs fold. This would provide two benefits in that internally we
> > could throw different errors in specific cases. For instance, some
> > selectors fail because they can't run against a map/reduce index (i.e.,
> > $or) and that won't change no matter what map/reduce indexes are
> > added. If we just wrap the _all_docs hack this changes the
> > behavior which would probably surprise new users.
> >
> > On the other hand, indexes can be operationally quite costly and
> > require planning to handle capacity so I would definitely avoid
> > automatically creating them from the _find endpoint. Perhaps we could
> > add a feature for the _index endpoint that accepts a selector and
> > figures out the index to create. Which I think is along the lines of
> > what Dale mentioned but with a slightly more on purpose interaction
> > from the user.
> >
> > Paul
> >
> > On Mon, Jan 4, 2016 at 8:05 AM, Garren Smith wrote:
> > > Hi Robert,
> > >
> > > This is cool. I think it links in with this
> > > https://issues.apache.org/jira/browse/COUCHDB-2928 and this
> > > https://github.com/nolanlawson/pouchdb-find/issues/138
> > >
> > > Cheers
> > > Garren
> > >
> > >> On 04 Jan 2016, at 2:33 PM, Dale Harvey wrote:
> > >>
> > >> I haven't yet started looking into the implementation details, but when
> > >> using pouchdb-find I have very much always expected that at some point we
> > >> would analyse the queries and automatically produce an index for them. This
> > >> seems like a great step in between.
> > >>
> > >> On 4 January 2016 at 13:27, Robert Kowalski wrote:
> > >>
> > >>> Hi list,
> > >>>
> > >>> I hope you had awesome holidays!
> > >>>
> > >>> The whole holidays I thought about an idea I had and today I
> > >>> implemented a prototype which still has some bugs and isn't complete
> > >>> yet.
> > >>>
> > >>> I want to find out if there is general interest and if it would be
> > >>> worth spending more time on it.
> > >>>
> > >>> The problem I am trying to solve is that I usually have a hard time
> > >>> explaining to people how views work. Now we got Mango and I can just say:
> > >>> we use a syntax similar to MongoDB's query language _but you have to
> > >>> create an index before you can use it_.
> > >>>
> > >>> At this point I usually look into sad, big eyes because no one
> > >>> understands why they have to create an index first and I feel there is
> > >>> another entry barrier for newcomers. If they try anyway, given they have
> > >>> decided for CouchDB, the user gets an error back: "no index available
> > >>> for this selector".
> > >>>
> > >>> The idea of this patch is to just fall back on the "give me all docs
> > >>> and I filter afterwards" trick that people usually use (if they know
> > >>> it) when they just want to test something, without creating an index
> > >>> which can take time for creation and requires further knowledge.
> > >>> Additionally the user is warned that they can create an index to make
> > >>> the queries faster.
> > >>>
> > >>> What do you think? Is that something worth working on further? The PR
> > >>> is at https://github.com/apache/couchdb-mango/pull/27
> > >>>
> > >>> You can test it with basic queries on a database which does not have
> > >>> indexes for the fields you want to query created yet.
> > >>>
> > >>>
> > >>> Best,
> > >>> Robert :)
> > >>>
> > >
> >
>
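
For reference, the wrapper hack Paul describes in the thread above could be sketched roughly as follows. This is only an illustration in Python: the helper names wrap_for_full_scan and find_body are hypothetical and not part of CouchDB or Mango, the limit value is arbitrary, and the code only builds the JSON body for a _find request without sending anything to a server.

```python
import json


def wrap_for_full_scan(selector):
    """Wrap a Mango selector in the {"$and": [Query, {"_id": {"$gt": null}}]} form.

    Hypothetical helper: per the thread, the extra _id clause is what
    causes the query to fall back to a full scan.
    """
    # Python's None is serialized as JSON null by json.dumps.
    return {"$and": [selector, {"_id": {"$gt": None}}]}


def find_body(selector, limit=25):
    """Build a JSON request body for a _find call (nothing is sent here).

    The limit default is an arbitrary choice for this sketch.
    """
    return json.dumps({"selector": wrap_for_full_scan(selector), "limit": limit})


if __name__ == "__main__":
    # Example: try out a query on a field that has no index yet.
    print(find_body({"age": {"$gt": 21}}))
```

The resulting string could be POSTed to a database's _find endpoint. As the thread points out, this is meant for trying queries out, not as a substitute for creating the appropriate index before going to production.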