couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: [POC] Mango Catch All Selector
Date Mon, 04 Jan 2016 19:04:46 GMT
Hey all,

I meant to reply to the ticket on pouchdb-find but got distracted by
the holidays.

I wanted to note that the original motivation for rejecting a selector
that doesn't have an index was to avoid the specific situation where a
user has a query that appears to run quite quickly in testing/dev but
fails or results in timeouts in production due to a different data
set. This was definitely a deviation from the MongoDB approach. When I
last read their docs, they mentioned in a couple of places that while
an index is not required, there are limits on result set sizes and (I
think?) query time. I chose to fail quickly rather than fail
eventually, and hopefully to be descriptive about why the query
failed. For instance, there should be a note in the error response
when no index is available that describes which fields could be
indexed to satisfy the query.

On the other hand, once we had users actually playing with this
feature there were quite a few instances of, "I just want to try this
query without waiting for an index to build." and I made the clever
suggestion that just adding the {"$and": [Query, {"_id": {"$gt":
null}}]} wrapper would cause a full table scan. That's obviously a
hack and I was fine with that because it seemed like an obvious hack
that would motivate users to create the appropriate index before
moving to production.
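
A minimal sketch of that wrapper in Python, for illustration only. The
helper name wrap_catch_all and the example selector are assumptions
made here; only the {"$and": [Query, {"_id": {"$gt": null}}]} shape
comes from the message above.

```python
import json


def wrap_catch_all(selector):
    """Wrap a selector in the $and / _id > null form described above.

    Hypothetical helper: per the message, this shape makes the query
    run as a full scan instead of being rejected for lack of an index.
    """
    return {"$and": [selector, {"_id": {"$gt": None}}]}


# Example request body for _find; the "year" field is made up.
body = {"selector": wrap_catch_all({"year": {"$gt": 2010}})}
print(json.dumps(body))
```

Python's None is serialized as JSON null, so the printed body matches
the wrapper shape quoted above.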

On the flip side, it seems that for some people the hack is a hurdle
to learning the query capabilities and adds to the overhead
of learning CouchDB in general. And this particular feature was aimed
directly at providing an easier on-ramp to CouchDB for people coming
from other databases. Given what I've read here and elsewhere perhaps
what might be easiest would be to add a feature along the lines of
"developing": "true" to the _find request body that would enable the
_all_docs fold. This would provide two benefits in that internally we
could throw different errors in specific cases. For instance, some
selectors fail because they can't run against a map/reduce index (e.g.,
$or), and that won't change no matter what map/reduce indexes are
added. If we just wrap the _all_docs hack, this changes the
behavior, which would probably surprise new users.
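
The decision flow suggested above could be sketched roughly as follows.
This is not Mango's implementation: plan_find, contains_or, the
has_usable_index argument, and the error wording are all hypothetical,
and "developing": "true" is only the option proposed in this message.

```python
def contains_or(node):
    """Return True if a $or operator appears anywhere in the selector."""
    if isinstance(node, dict):
        return any(k == "$or" or contains_or(v) for k, v in node.items())
    if isinstance(node, list):
        return any(contains_or(item) for item in node)
    return False


def plan_find(request, has_usable_index):
    """Pick a strategy for a _find request body (hypothetical sketch).

    has_usable_index stands in for whatever index selection the server
    already performs; it is passed in rather than computed here.
    """
    selector = request["selector"]
    developing = request.get("developing") == "true"
    if has_usable_index:
        return "use_index"
    if developing:
        # Explicit opt-in: fall back to scanning _all_docs.
        return "all_docs_fold"
    if contains_or(selector):
        # Adding more map/reduce indexes would not help here.
        raise ValueError("selector uses $or, which no map/reduce index can satisfy")
    raise ValueError("no index available for this selector")
```

The point of the sketch is that the two failure cases get different
messages, and the full scan only happens when the caller asks for it.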

On the other hand, indexes can be operationally quite costly and
require planning to handle capacity, so I would definitely avoid
automatically creating them from the _find endpoint. Perhaps we could
add a feature for the _index endpoint that accepts a selector and
figures out the index to create. I think that is along the lines of
what Dale mentioned, but with a slightly more deliberate interaction
from the user.
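
As a rough illustration of "accepts a selector and figures out the
index to create", the sketch below collects the field names a selector
mentions and builds an _index request body from them. The function
names are hypothetical, and the naive field ordering is an assumption;
a real implementation would need to think about field order and
operator support.

```python
def selector_fields(selector):
    """List field names in a selector, descending into $and/$or lists."""
    fields = []
    for key, value in selector.items():
        if key in ("$and", "$or"):
            for sub in value:
                for name in selector_fields(sub):
                    if name not in fields:
                        fields.append(name)
        elif not key.startswith("$") and key not in fields:
            fields.append(key)
    return fields


def index_request_for(selector):
    """Build a JSON index definition covering the selector's fields."""
    return {"index": {"fields": selector_fields(selector)}, "type": "json"}


# Example with made-up field names.
print(index_request_for({"$and": [{"year": {"$gt": 2010}}, {"title": "x"}]}))
```

The user would still send this body to _index on purpose, which keeps
index creation an explicit step rather than a side effect of _find.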

Paul

On Mon, Jan 4, 2016 at 8:05 AM, Garren Smith <garren@apache.org> wrote:
> Hi Robert,
>
> This is cool. I think it links in with this https://issues.apache.org/jira/browse/COUCHDB-2928
> and this https://github.com/nolanlawson/pouchdb-find/issues/138
>
> Cheers
> Garren
>
>> On 04 Jan 2016, at 2:33 PM, Dale Harvey <dale@arandomurl.com> wrote:
>>
>> I haven't yet started looking into the implementation details, but when
>> using pouchdb-find I have very much always expected that at some point we
>> would analyse the queries and automatically produce an index for them. This
>> seems like a great step in between.
>>
>> On 4 January 2016 at 13:27, Robert Kowalski <rok@kowalski.gd> wrote:
>>
>>> Hi list,
>>>
>>> I hope you had awesome holidays!
>>>
>>> The whole holidays I thought about an idea I had and today I
>>> implemented a prototype which still has some bugs and isn't complete
>>> yet.
>>>
>>> I want to find out if there is general interest and if it would be
>>> worth spending more time on.
>>>
>>> The problem I am trying to solve is that I usually have a hard time
>>> explaining to people how views work. Now we have Mango and I can just say:
>>> we use a syntax similar to MongoDB's query language _but you have to
>>> create an index before you can use it_.
>>>
>>> At this point I usually look into sad, big eyes because no one
>>> understands why they have to create an index first and I feel there is
>>> another entry barrier for newcomers. If they try anyway, having
>>> decided on CouchDB, the user gets an error back: "no index available
>>> for this selector".
>>>
>>> The idea of this patch is to just fall back on the "give me all docs
>>> and I filter afterwards" trick that people usually use (if they know
>>> it) when they just want to test something, without creating an index,
>>> which can take time to build and requires further knowledge.
>>> Additionally, the user is warned that they can create an index to make
>>> the queries faster.
>>>
>>> What do you think? Is that something worth working on further? The PR
>>> is at https://github.com/apache/couchdb-mango/pull/27
>>>
>>> You can test it with basic queries on a database which does not have
>>> indexes for the fields you want to query created yet.
>>>
>>>
>>> Best,
>>> Robert :)
>>>
>
