couchdb-dev mailing list archives

From Jan Lehnardt <>
Subject Re: [POC] Mango Catch All Selector
Date Wed, 13 Jan 2016 22:57:40 GMT

> On 13 Jan 2016, at 23:41, Joan Touzet <> wrote:
> Warning: If we start using English text in a response such as this, we'll
> need to start externalising strings and internationalising them. We've never
> had to do this before because our API is, in general, terse and relies on
> HTTP status codes to indicate when something has gone wrong.
> I think the current design constraint around text is a good one, and I'm
> unconvinced including English text is a good direction.
> If you want to take this direction, including a URL to our documentation
> instead (which *is* internationalized) is probably a better way to go,
> something like:
> .... {"_warning": "”}]

I like this improvement :)
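
To make the idea concrete, here is a minimal client-side sketch. It assumes the response shape floated further down this thread (a list of documents with a `{"_warning": ...}` object appended); that shape is only a proposal, not an existing CouchDB response format, and `split_warnings` is a hypothetical helper name.

```python
import json


def split_warnings(rows):
    """Separate result documents from proposed {"_warning": ...} entries.

    Assumes the response shape sketched in this thread; it is not an
    existing CouchDB format.
    """
    docs, warnings = [], []
    for row in rows:
        # Treat an object whose only key is "_warning" as a warning entry.
        if set(row) == {"_warning"}:
            warnings.append(row["_warning"])
        else:
            docs.append(row)
    return docs, warnings


body = json.loads(
    '[{"_id": "foo", "_rev": "535"},'
    ' {"_warning": "slow query, use an index for better performance"}]'
)
docs, warnings = split_warnings(body)
```

A client could log `warnings` and carry on with `docs`. If the warning held a documentation URL instead of English text, the same split would apply.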


> ----- Original Message -----
> From: "Robert Kowalski" <>
> To:
> Sent: Wednesday, January 13, 2016 2:47:27 PM
> Subject: Re: [POC] Mango Catch All Selector
> Hi Garren,
> what would selector: null do? Return all docs?
> Where in the answer from CouchDB would be the warning? Next to the
> resultset, like
> [{"_id": "foo", "_rev": "535"}, {"_warning": "slow query, use an index for
> better performance"}] ?
> Am Mittwoch, 13. Januar 2016 schrieb Garren Smith :
>> Hi Robert,
>> I think you misunderstood me, I don’t want it to be a different endpoint.
>> I just don’t want a user to have to do queries like this find({slow:
>> true}). I want them to be able to do a query e.g. find({}) or
>> find({selector: null}) and then get back the results along with a warning
>> message telling them that this query would be slow in production.
>> The lower the barrier for entry here the better. I know we want to protect
>> our users for when they go to production, but forcing them to add a slow:
>> true flag won’t help. It will still require them to read the docs a lot
>> more than most people are willing to on a first attempt of something new.
>> Cheers
>> Garren
>>> On 12 Jan 2016, at 9:16 PM, Robert Kowalski wrote:
>>> thank you all for your feedback!
>>> i like the idea of the error message with a new url.
>>> i agree with garren that it should be a separate endpoint. it takes
>>> some complexity off when explaining each endpoint.
>>> maybe: `/_find_slow`?
>>> On Tue, Jan 12, 2016 at 10:36 AM, Jan Lehnardt wrote:
>>>>> On 11 Jan 2016, at 19:55, Tony Sun wrote:
>>>>> Hi Robert,
>>>>> Building upon what others have stated above, what do you think about
>>>>> the following:
>>>>> 1) Let the user query without creating an index
>>>>> 2) Return an error message with a new url that has
>>>>> "slow/no_index/developer":true appended at the end. The message clearly
>>>>> explains that this query will be slow, and that creating an index will be
>>>>> more efficient. However, he or she can continue. The error message will
>>>>> then have a link to point to our documentation.
>>>>> 3) In Fauxton, there is a checkbox or button that also appends the
>>>>> "slow/no_index/developer":true to the _find url. If the user clicks it,
>>>>> then the same message pops up to notify the user.
>>>> I like this!
>>>> Jan
>>>> --
>>>>> Tony
>>>>> On Mon, Jan 11, 2016 at 9:45 AM, Eli Stevens (Gmail) wrote:
>>>>>> Just wanted to chime in here as a user - I've run into similar
>>>>>> behavior from CouchDB with the reduce-not-reducing-enough heuristic,
>>>>>> where stuff I was working on went smoothly in dev, but stopped once
>>>>>> real load was pushed through it (thankfully for me, that was in
>>>>>> testing, rather than released to customers).
>>>>>> It's a frustrating experience, and I don't think that a reputation of
>>>>>> "works until you cross a threshold, and then it doesn't, but only in
>>>>>> production" is a good thing to move towards.
>>>>>> Perhaps something like adding a key to the returned data along the
>>>>>> lines of "_slow_warning": "This query is going to be slow on large
>>>>>> data sets. See http://..." in addition to the ?slow_warning=true query
>>>>>> param (note that I'm calling it "slow_warning" in both places only to
>>>>>> increase discoverability; without the url param, the no-index query
>>>>>> wouldn't work at all). Bikeshed the name as needed.
>>>>>> I'd like to see a lot more URLs in CouchDB error messages in general,
>>>>>> actually - I would find it very useful when trying to determine what's
>>>>>> going wrong to have a URL right there in the logs that I can get
>>>>>> information from.
>>>>>> On Sun, Jan 10, 2016 at 11:54 AM, Joan Touzet wrote:
>>>>>>> Hi Robert,
>>>>>>> I've been thinking about this one for the past week or so, and I have
>>>>>>> a simple suggestion:
>>>>>>> Add the query parameter slow=true to enable this behaviour.
>>>>>>> This meets all the original requirements:
>>>>>>> 1. It is not default behaviour
>>>>>>> 2. You can grep the log files for the word 'slow' and find evidence
>>>>>>> 3. There is a shorthand, simple way to enable the behaviour
>>>>>>> 4. Any self-respecting developer will try to remove slow=true,
>>>>>>> see a break, and be forced to learn about indexes
>>>>>>> 5. It's a bit cheeky, which I think is kind of fun :D
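
The gate described in the list above can be sketched as a toy model. This is not CouchDB code: the function name, the status codes and the messages are all assumptions chosen for illustration, and only the `slow=true` opt-in itself comes from the suggestion.

```python
def check_find_request(index_available, slow=False):
    """Toy model of the proposed slow=true gate (not CouchDB code).

    Returns a (status, message) pair: indexed queries always run;
    unindexed queries run only when the caller opts in with slow=true.
    The status codes and messages are hypothetical.
    """
    if index_available:
        return 200, "ok"
    if slow:
        # Opt-in full scan: allowed, and the word 'slow' shows up in logs.
        return 200, "ok (slow: no index used)"
    return 400, "no index for this selector; create one or pass slow=true"
```

Removing `slow=True` from a call without an index flips the result from 200 to 400, which is the "break" that point 4 relies on.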
>>>>>>> All the best,
>>>>>>> Joan
>>>>>>> ----- Original Message -----
>>>>>>>> From: "William Edney"
>>>>>>>> To:
>>>>>>>> Sent: Friday, January 8, 2016 10:27:29 AM
>>>>>>>> Subject: Re: [POC] Mango Catch All Selector
>>>>>>>> Hi Robert -
>>>>>>>> As a builder of UI, API and library code who has also done
>>>>>>>> training on a variety of technologies, one simple fix might be to
>>>>>>>> go ahead and not require indexes to be built, but then to put a big
>>>>>>>> NOTE at the beginning of the "Mango Getting Started" guide (I would
>>>>>>>> assume there is such a piece of documentation) that states: "Note
>>>>>>>> that the [...] in this document do not require you to build an
>>>>>>>> index, but for performance reasons we HIGHLY RECOMMEND that you do
>>>>>>>> so. *Click here* for more information about how to do that" (or
>>>>>>>> some such verbiage).
>>>>>>>> My 2 cents.
>>>>>>>> Cheers,
>>>>>>>> - Bill
>>>>>>>> On Fri, Jan 8, 2016 at 9:04 AM, Robert Kowalski wrote:
>>>>>>>>> Hi list,
>>>>>>>>> At the end of the mail I would like to invite the other folks from
>>>>>>>>> the mailing list that build interfaces for humans (APIs, CLIs or
>>>>>>>>> even UIs) to chime in again with their opinions. So, all people on
>>>>>>>>> the ML: the mail is not just a response to Paul, feedback is welcome.
>>>>>>>>> Hi Paul, I agree with the timeout. It could lead to very [...]
>>>>>>>>> errors which are hard to debug and support.
>>>>>>>>> I added some thoughts to the other points you made:
>>>>>>>>>> a) know that the slow queries logs exist,
>>>>>>>>> Hmm... If I take a look at the 1.x logging it was very
>>>>>>>>> straightforward. As a developer you would spin up a CouchDB and
>>>>>>>>> you get all the log messages into your terminal. It was quite handy
>>>>>>>>> in general for all kinds of debugging. That the logs are not
>>>>>>>>> displayed directly on stdout/stderr is in my opinion a general 2.x
>>>>>>>>> problem. The problem does occur with all kinds of log messages we
>>>>>>>>> produce in CouchDB for 2.x and is not specific to the
>>>>>>>>> slow-query-logging.
>>>>>>>>>> Ie, "You can try queries with testing:true, when you're ready to
>>>>>>>>>> move to production you can POST your selector to _index to create
>>>>>>>>>> the index which allows you to remove testing:true".
>>>>>>>>> I really like the migration path you mentioned here with the API to
>>>>>>>>> create indexes. I am worried about having too high an entry barrier
>>>>>>>>> for absolute newcomers, people that you want to play around before
>>>>>>>>> they are ready to think about indexes, e.g. by coupling the index
>>>>>>>>> topic to the querying from the beginning.
>>>>>>>>> When I throw too many things to learn at people (who may not have
>>>>>>>>> used a database before), most people get discouraged and do not
>>>>>>>>> take a look. The usual things they feel or say are: "too [...]",
>>>>>>>>> "I have not enough time", "product XY is easier to use".
>>>>>>>>> I would argue that newcomers to a database will not launch a high
>>>>>>>>> traffic, multi-gigabyte product with the database from day one. Day
>>>>>>>>> one is the day where they learn how to query the data and put data
>>>>>>>>> into the database. Even for scenarios where people have a running
>>>>>>>>> high traffic system, and have used other databases at a medium to
>>>>>>>>> large scale, I would expect, given they migrate to Couch, that they
>>>>>>>>> run both systems in parallel at first in order to fix the issues
>>>>>>>>> that occur during a migration.
>>>>>>>>> I think we share the same goal (getting beginners [...] quickly)
>>>>>>>>> and the cool thing about your suggestion is that everyone gets the
>>>>>>>>> required knowledge to run a production system right from the very
>>>>>>>>> start. My suggestion leaves some parts out, but reduces the
>>>>>>>>> cognitive load required to get the very first basic results, e.g.
>>>>>>>>> in a university class setting - or junior developers on their
>>>>>>>>> "[...] friday 20% time". My big hope is, once those folks build
>>>>>>>>> high traffic systems, they remember how easy the usage of CouchDB
>>>>>>>>> was and that they start to learn more about CouchDB in order to run
>>>>>>>>> it in a system with more than a few thousand documents.
>>>>>>>>> For us both I think the "what" is clear, but the "how" is a bit
>>>>>>>>> different. I also think this discussion still makes progress, but I
>>>>>>>>> am afraid it could stall. I see that we both have very good [...]
>>>>>>>>> and I would like to invite the other folks from the mailing list
>>>>>>>>> that build interfaces for humans (APIs, CLIs or even UIs) to chime
>>>>>>>>> in again with their opinions - of course I'm also looking forward
>>>>>>>>> to your answer :)
>>>>>>>>> Best,
>>>>>>>>> Robert :)
>>>>>>>>> On Wed, Jan 6, 2016 at 6:21 PM, Paul Davis wrote:
>>>>>>>>>>>> - is a timeout solving the root cause or the symptoms? Could it
>>>>>>>>>>>> be a temporary or additional step as in conjunction with query
>>>>>>>>>>>> optimisation tooling?
>>>>>>>>>>> It really depends. From my CouchDB admin and user perspective,
>>>>>>>>>>> this doesn't seem so important to me right now. However, I
>>>>>>>>>>> recognize that there are different usage scenarios with different
>>>>>>>>>>> [...] (e.g. the ones at Cloudant).
>>>>>>>>>> I don't think there's anything special about Cloudant in this
>>>>>>>>>> discussion. It's just a question of how do we allow new users the
>>>>>>>>>> ability to easily test and learn the selector/query API while also
>>>>>>>>>> preventing them from going too far without creating indexes for
>>>>>>>>>> their queries. The slow queries messages are fine, but just as in
>>>>>>>>>> any other database they don't really prompt the developer to make
>>>>>>>>>> the correct change. Ie, the developer has to be savvy enough to a)
>>>>>>>>>> know that the slow queries logs exist, b) understand that creating
>>>>>>>>>> an index would speed things up, and then c) know which index to
>>>>>>>>>> create based on the logged query.
>>>>>>>>>> In my experience, the group of users that we're concerned with in
>>>>>>>>>> this discussion most likely don't know about any of those three
>>>>>>>>>> things, hence why the current API is designed to force them to
>>>>>>>>>> learn about and understand indexes as part of learning the API.
>>>>>>>>>> Granted the `_id > null` trick muddies that learning process. I
>>>>>>>>>> would think that replacing the _id trick with `"testing": true` or
>>>>>>>>>> similar would be an obvious indication to users that this is a
>>>>>>>>>> dev/debug type feature and when they went to production they would
>>>>>>>>>> still be pushed to using an index. If we add the "create index from
>>>>>>>>>> selector" API then I think this would be a relatively
>>>>>>>>>> straightforward method of on-ramping to both the query and index
>>>>>>>>>> sides of the API. Ie, "You can try queries with testing:true, when
>>>>>>>>>> you're ready to move to production you can POST your selector to
>>>>>>>>>> _index to create the index which allows you to remove
>>>>>>>>>> testing:true".
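
As a rough sketch of that migration step, a client could derive an index definition from a selector itself. The "create index from selector" API is only a proposal in this thread, so this does the derivation on the client and produces a body in the shape of a JSON index definition for the `_index` endpoint. `index_body_from_selector`, the index name and the sample selector are hypothetical, and only flat selectors are handled.

```python
def index_body_from_selector(selector, name="generated-index"):
    """Build a JSON index definition covering a selector's top-level fields.

    Only handles flat selectors; keys starting with "$" (combination
    operators such as $and) are skipped rather than expanded.
    """
    fields = [key for key in selector if not key.startswith("$")]
    return {"index": {"fields": fields}, "name": name, "type": "json"}


# Made-up selector for illustration.
selector = {"year": {"$gt": 2010}, "director": "Lars von Trier"}
body = index_body_from_selector(selector)
```

The resulting `body` would then be POSTed to the database's `_index` endpoint, after which the same selector could be queried without the testing flag.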
>>>>>>>>>> That's also why I don't particularly care for the timeout approach.
>>>>>>>>>> It's a binary threshold that a user would (maybe) meet after some
>>>>>>>>>> unknown amount of time after they falsely believe their app is
>>>>>>>>>> working correctly. The feedback is "Everything is fine until it
>>>>>>>>>> isn't". Consider an app that's been working for a week or a month
>>>>>>>>>> or more that suddenly starts throwing timeouts for a query. From
>>>>>>>>>> the user's perspective the database broke because the query that
>>>>>>>>>> used to work fine no longer does. And then there's the follow on
>>>>>>>>>> question on how that timeout might instruct the user that they need
>>>>>>>>>> an index, and that the fix may be as easy as POSTing their selector
>>>>>>>>>> to the _index endpoint. Sure Google would most likely have the
>>>>>>>>>> answer if our docs are good enough, but by that point the developer
>>>>>>>>>> is probably already experiencing downtime if their app is live
>>>>>>>>>> which means they're frantically trying to fix the thing. From my
>>>>>>>>>> point of view, a few road blocks that guide developers towards the
>>>>>>>>>> correct usage early on would be better than letting them get to the
>>>>>>>>>> adrenaline [...] expletive fountain of downtime.
