couchdb-dev mailing list archives

From Tony Sun <tony.sun...@gmail.com>
Subject Re: [POC] Mango Catch All Selector
Date Mon, 04 Jan 2016 20:55:28 GMT
Hi all,

    Hope everyone enjoyed the holidays!

    This is the most common mango experience for new users:

    1) Syntax issues when creating an index.
    2) Running into the "no index found" error because their query
(with and w/o sort) doesn't match the index correctly.
    3) We explain how views work and also suggest our all_docs hack.
    4) Then the user complains that their query is slow (due to all_docs or
a large result set), and again we try to either optimize the index or suggest
using text indexes (the newly open-sourced feature).
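
    For reference, the all_docs hack in 3) is the wrapper Paul describes
further down the thread: {"$and": [Query, {"_id": {"$gt": null}}]}. A minimal
Python sketch of building it; the helper name wrap_with_all_docs is made up
for illustration and is not part of Mango:

```python
import json


def wrap_with_all_docs(selector):
    """Wrap a Mango selector so it can be answered by a full scan.

    Hypothetical helper: it only builds the JSON structure quoted in this
    thread, {"$and": [Query, {"_id": {"$gt": null}}]}.
    """
    # Python's None serializes to JSON null.
    return {"$and": [selector, {"_id": {"$gt": None}}]}


# Example: the user's original selector, then the wrapped request body.
user_selector = {"name": "bob"}
body = {"selector": wrap_with_all_docs(user_selector)}
print(json.dumps(body))
```

    The user's selector is left untouched; only the extra _id clause is
added, which is why the query ends up scanning every document.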

    A lot of users are turned off by the usability issues encountered in 1)
and 2). I agree that we should make it as easy as possible for first-time
users, so I am okay with removing the need to create an index first.
However, we need to somehow explicitly let the user know about all_docs so
they don't abuse this capability. Also, like MongoDB, we could internally
check whether the index in use is the all_docs index and throw a timeout/size
error for that particular query?
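
    To make the timeout/size idea concrete, here is a rough Python sketch of
a guarded full scan. Everything in it is an assumption for illustration (the
names guarded_full_scan and ScanLimitExceeded, the default limits, and
documents as plain dicts); it is not how Mango is implemented:

```python
import time


class ScanLimitExceeded(Exception):
    """Raised when a full scan goes past its size or time budget."""


def guarded_full_scan(docs, matches, max_scanned=10000, max_seconds=5.0):
    """Filter docs with `matches`, failing once a budget is exceeded.

    Hypothetical sketch: `docs` is any iterable of dicts and `matches` is a
    predicate standing in for selector evaluation.
    """
    deadline = time.monotonic() + max_seconds
    results = []
    for scanned, doc in enumerate(docs, start=1):
        if scanned > max_scanned:
            raise ScanLimitExceeded(
                "scanned more than %d docs; consider creating an index"
                % max_scanned
            )
        if time.monotonic() > deadline:
            raise ScanLimitExceeded(
                "scan ran longer than %.1fs; consider creating an index"
                % max_seconds
            )
        if matches(doc):
            results.append(doc)
    return results
```

    With limits like these, a query that is fine on a small dev data set
fails with a descriptive error on a large one instead of running unbounded.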


Thanks,


Tony

On Mon, Jan 4, 2016 at 11:49 AM, Sebastian Rothbucher <
sebastianrothbucher@googlemail.com> wrote:

> Hi Robert,
>
> I'm with you that the easier we can make it for s/o to get started the
> better.
> And I think falling back to a full table scan with a log written is a good
> and easy way to go. I'd even set the log level to info or even warning to
> make it clear that there's a problem with huge data sets. And hopefully,
> people run some load test before going into production ;-)
>
> The only other idea I had (a button "use default index" in Fauxton that
> modifies the selector) looks daft on second thought
>
> - I do like your idea though
>
> Best
>     Sebastian
>
>
> On Mon, Jan 4, 2016 at 8:04 PM, Paul Davis <paul.joseph.davis@gmail.com>
> wrote:
>
> > Hey all,
> >
> > I meant to reply to the ticket on pouchdb-find but got distracted by
> > the holidays.
> >
> > I wanted to note that the original motivation for rejecting a selector
> > that doesn't have an index was to avoid the specific situation where a
> > user has a query that appears to run quite quickly in testing/dev but
> > fails or results in timeouts in production due to a different data
> > set. This was definitely a deviation from the MongoDB approach. The
> > last I read their docs on this they mentioned in a couple places that
> > while an index is not required there are limits on result set sizes
> > and (I think?) query time. I made the choice that rather than fail
> > eventually to fail quickly and hopefully be descriptive of why the
> > query failed. For instance, there should be a note in the error
> > response when no index is available that describes which fields could
> > be indexed to satisfy the query.
> >
> > On the other hand, once we had users actually playing with this
> > feature there were quite a few instances of, "I just want to try this
> > query without waiting for an index to build." and I made the clever
> > suggestion that just adding the {"$and": [Query, {"_id": {"$gt":
> > null}}]} wrapper would cause a full table scan. That's obviously a
> > hack and I was fine with that because it seemed like an obvious hack
> > that would motivate users to create the appropriate index before
> > moving to production.
> >
> > On the flip side it seems like for some people the hack is a hurdle
> > into learning the query capabilities as well as adding to the overhead
> > of learning CouchDB in general. And this particular feature was aimed
> > directly at providing an easier on-ramp to CouchDB for people coming
> > from other databases. Given what I've read here and elsewhere perhaps
> > what might be easiest would be to add a feature along the lines of
> > "developing": "true" to the _find request body that would enable the
> > _all_docs fold. This would provide two benefits in that internally we
> > could throw different errors in specific cases. For instance, some
> > selectors fail because they can't run against a map/reduce index (i.e.,
> > $or) and that won't change no matter what map/reduce indexes are
> > added. If we just wrap the _all_docs hack this changes the
> > behavior which would probably surprise new users.
> >
> > On the other hand, indexes can be operationally quite costly and
> > require planning to handle capacity so I would definitely avoid
> > automatically creating them from the _find endpoint. Perhaps we could
> > add a feature for the _index endpoint that accepts a selector and
> > figures out the index to create. Which I think is along the lines of
> > what Dale mentioned but with a slightly more on purpose interaction
> > from the user.
> >
> > Paul
> >
> > On Mon, Jan 4, 2016 at 8:05 AM, Garren Smith <garren@apache.org> wrote:
> > > Hi Robert,
> > >
> > > This is cool. I think it links in with this
> > > https://issues.apache.org/jira/browse/COUCHDB-2928 and this
> > > https://github.com/nolanlawson/pouchdb-find/issues/138
> > >
> > > Cheers
> > > Garren
> > >
> > >> On 04 Jan 2016, at 2:33 PM, Dale Harvey <dale@arandomurl.com> wrote:
> > >>
> > >> I haven't yet started looking into the implementation details, but when
> > >> using pouchdb-find I have very much always expected that at some point
> > >> we would analyse the queries and automatically produce an index for
> > >> them. This seems like a great step in between.
> > >>
> > >> On 4 January 2016 at 13:27, Robert Kowalski <rok@kowalski.gd> wrote:
> > >>
> > >>> Hi list,
> > >>>
> > >>> I hope you had awesome holidays!
> > >>>
> > >>> The whole holidays I thought about an idea I had and today I
> > >>> implemented a prototype which still has some bugs and isn't complete
> > >>> yet.
> > >>>
> > >>> I want to find out if there is general interest and if it would be
> > >>> worth spending more time on.
> > >>>
> > >>> The problem I am trying to solve is that I usually have a hard time
> > >>> explaining to people how views work. Now we have Mango and I can just
> > >>> say: we use a syntax similar to MongoDB's query language _but you have
> > >>> to create an index before you can use it_.
> > >>>
> > >>> At this point I usually look into sad, big eyes because no one
> > >>> understands why they have to create an index first and I feel there is
> > >>> another entry barrier for newcomers. If they try anyway, given they
> > >>> have decided on CouchDB, the user gets an error back: "no index
> > >>> available for this selector".
> > >>>
> > >>> The idea of this patch is to just fall back on the "give me all docs
> > >>> and I filter afterwards" trick that people usually use (if they know
> > >>> it) when they just want to test something, without creating an index,
> > >>> which takes time to build and requires further knowledge.
> > >>> Additionally the user is warned that they can create an index to make
> > >>> the queries faster.
> > >>>
> > >>> What do you think? Is that something worth working on further? The
> > >>> PR is at https://github.com/apache/couchdb-mango/pull/27
> > >>>
> > >>> You can test it with basic queries on a database which does not have
> > >>> indexes for the fields you want to query created yet.
> > >>>
> > >>>
> > >>> Best,
> > >>> Robert :)
> > >>>
> > >
> >
>
