couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: [POC] Mango Catch All Selector
Date Mon, 18 Jan 2016 11:59:49 GMT
This is awesome: +1


> On 18 Jan 2016, at 00:16, Robert Kowalski <rok@kowalski.gd> wrote:
> 
> Heya,
> 
> thanks again for all the feedback! I built a prototype and added a demo video!
> 
>> I think the current design constraint around text is a good one, and I'm
>> unconvinced including English text is a good direction.
>> 
>> If you want to take this direction, including a URL to our documentation
>> instead (which *is* internationalized) is probably a better way to go,
>> something like:
>> .... {"_warning": "http://docs.couchdb.org/en/2.0.0/....."}]
> 
> I really like this idea! I thought long about it and I think it grows
> the scope of the current task. Right now all strings CouchDB returns
> to the user are written in English. The current message that no index
> exists is also in English. Sadly our documentation is not
> internationalised yet - afaik no language has a complete translation
> and the translations are not available as a website or in any other
> public form. I stopped translating to German myself as the promised
> integration into the doc build was never finished in ~1.5 years. For
> the specific task right now I would like to keep the scope as small as
> possible. This does not mean that I would stand in the way if folks
> want to add i18n to the project and its sub-projects and have the
> tooling and time to maintain it.
> 
> 
> Because a prototype speaks more than 1000 posts I hacked a prototype
> which includes the warning that was proposed by Garren. You can check
> it out at https://github.com/apache/couchdb-mango/pull/27 - or watch
> the video: https://cloudup.com/cEnbWqbX5Y7
> 
> What do you think?
> 
> On Wed, Jan 13, 2016 at 11:58 PM, Jan Lehnardt <jan@apache.org> wrote:
>> 
>>> On 13 Jan 2016, at 23:41, Joan Touzet <wohali@apache.org> wrote:
>>> 
>>> Warning: If we start using English text in a response such as this, we'll
>>> need to start externalising strings and internationalising them. We've never
>>> had to do this before because our API is, in general, terse and relies on
>>> HTTP status codes to indicate when something has gone wrong.
>>> 
>>> I think the current design constraint around text is a good one, and I'm
>>> unconvinced including English text is a good direction.
>>> 
>>> If you want to take this direction, including a URL to our documentation
>>> instead (which *is* internationalized) is probably a better way to go,
>>> something like:
>>> 
>>> .... {"_warning": "http://docs.couchdb.org/en/2.0.0/....."}]
>> 
>> bikeshed: maybe slow_warning (like we use not_found on 404s), but yeah,
>> something like this!
>> 
>> Great discussion everyone. I like how we are all making this idea better
>> together :)
>> 
>> Best
>> Jan
>> --
>> 
>> 
>> 
>>> 
>>> 
>>> 
>>> ----- Original Message -----
>>> From: "Robert Kowalski" <rok@kowalski.gd>
>>> To: dev@couchdb.apache.org
>>> Sent: Wednesday, January 13, 2016 2:47:27 PM
>>> Subject: Re: [POC] Mango Catch All Selector
>>> 
>>> Hi Garren,
>>> 
>>> what would selector: null do? Return all docs?
>>> 
>>> Where in the answer from CouchDB would be the warning? Next to the
>>> resultset, like
>>> 
>>> [{"_id": "foo", "_rev": "535"}, {"_warning": "slow query, use an index for
>>> better performance"}] ?
>>> 
>>> On Wednesday, 13 January 2016, Garren Smith wrote:
>>> 
>>>> Hi Robert,
>>>> 
>>>> I think you misunderstood me, I don’t want it to be a different endpoint.
>>>> I just don’t want a user to have to do queries like this find({slow:
>>>> true}). I want them to be able to do a query e.g. find({}) or
>>>> find({selector: null}) and then get back the results along with a warning
>>>> message telling them that this query would be slow in production.
>>>> The lower the barrier for entry here the better. I know we want to protect
>>>> our users for when they go to production, but forcing them to add a slow:
>>>> true flag won’t help. It will still require them to read the docs a lot
>>>> more than most people are willing to on a first attempt of something new.
>>>> 
>>>> Cheers
>>>> Garren
>>>>> On 12 Jan 2016, at 9:16 PM, Robert Kowalski <rok@kowalski.gd> wrote:
>>>>> 
>>>>> thank you all for your feedback!
>>>>> 
>>>>> i like the idea of the error message with a new url.
>>>>> 
>>>>> i agree with garren that it should be a separate endpoint. it takes
>>>>> some complexity off when explaining each endpoint.
>>>>> 
>>>>> maybe: `/_find_slow`?
>>>>> 
>>>>> On Tue, Jan 12, 2016 at 10:36 AM, Jan Lehnardt <jan@apache.org> wrote:
>>>>>> 
>>>>>>> On 11 Jan 2016, at 19:55, Tony Sun <tony.sun427@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hi Robert,
>>>>>>> 
>>>>>>> Building upon what others have stated above, what do you think about
>>>>>>> the following:
>>>>>>> 
>>>>>>> 1) Let the user query without creating an index
>>>>>>> 2) Return an error message with a new url that has
>>>>>>> "slow/no_index/developer":true appended at the end. The message
>>>>>>> clearly explains that this query will be slow, and that creating an
>>>>>>> index will be more efficient. However, he or she can continue. The
>>>>>>> error message will then have a link to point to our documentation.
>>>>>>> 3) In Fauxton, there is a checkbox or button that also appends the
>>>>>>> "slow/no_index/developer":true to the _find url. If the user clicks
>>>>>>> it, then the same message pops up to notify the user.
>>>>>> 
>>>>>> 
>>>>>> I like this!
>>>>>> 
>>>>>> 
>>>>>> Jan
>>>>>> --
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Tony
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, Jan 11, 2016 at 9:45 AM, Eli Stevens (Gmail)
>>>>>>> <wickedgrey@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Just wanted to chime in here as a user - I've run into similar
>>>>>>>> behavior from CouchDB with the reduce-not-reducing-enough heuristic,
>>>>>>>> where stuff I was working on went smoothly in dev, but stopped once
>>>>>>>> real load was pushed through it (thankfully for me, that was in
>>>>>>>> testing, rather than released to customers).
>>>>>>>> 
>>>>>>>> It's a frustrating experience, and I don't think that a reputation
>>>>>>>> for "works until you cross a threshold, and then it doesn't, but
>>>>>>>> only in production" is a good thing to move towards.
>>>>>>>> 
>>>>>>>> Perhaps something like adding a key to the returned data along the
>>>>>>>> lines of "_slow_warning": "This query is going to be slow on large
>>>>>>>> data sets. See http://..." in addition to the ?slow_warning=true
>>>>>>>> query param (note that I'm calling it "slow_warning" in both places
>>>>>>>> only to increase discoverability; without the url param, the
>>>>>>>> no-index query wouldn't work at all). Bikeshed the name as needed.
>>>>>>>> 
>>>>>>>> I'd like to see a lot more URLs in CouchDB error messages in
>>>>>>>> general, actually - I would find it very useful when trying to
>>>>>>>> determine what's going wrong to have a URL right there in the logs
>>>>>>>> that I can get more information from.
>>>>>>>> 
>>>>>>>> On Sun, Jan 10, 2016 at 11:54 AM, Joan Touzet <wohali@apache.org> wrote:
>>>>>>>>> Hi Robert,
>>>>>>>>> 
>>>>>>>>> I've been thinking about this one for the week or so, and I have a
>>>>>>>>> simple suggestion:
>>>>>>>>> 
>>>>>>>>> Add the query parameter slow=true to enable this behaviour.
>>>>>>>>> 
>>>>>>>>> This meets all the original requirements:
>>>>>>>>> 
>>>>>>>>> 1. It is not default behaviour
>>>>>>>>> 2. You can grep the log files for the word 'slow' and find evidence
>>>>>>>>> 3. There is a shorthand, simple way to enable the behaviour
>>>>>>>>> 4. Any self-respecting developer will try to remove slow=true, find
>>>>>>>>> a break, and be forced to learn about indexes
>>>>>>>>> 5. It's a bit cheeky, which I think is kind of fun :D
>>>>>>>>> 
>>>>>>>>> All the best,
>>>>>>>>> Joan
>>>>>>>>> 
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: "William Edney" <bedney@technicalpursuit.com>
>>>>>>>>>> To: dev@couchdb.apache.org
>>>>>>>>>> Sent: Friday, January 8, 2016 10:27:29 AM
>>>>>>>>>> Subject: Re: [POC] Mango Catch All Selector
>>>>>>>>>> 
>>>>>>>>>> Hi Robert -
>>>>>>>>>> 
>>>>>>>>>> As a builder of UI, API and library code who has also done developer
>>>>>>>>>> training on a variety of technologies, one simple fix might be to go
>>>>>>>>>> ahead and not require indexes to be built, but then to put a big NOTE
>>>>>>>>>> at the beginning of the "Mango Getting Started" guide (I would assume
>>>>>>>>>> there is such a piece of documentation) that states: "Note that the
>>>>>>>>>> examples in this document do not require you to build an index, but
>>>>>>>>>> for performance reasons we HIGHLY RECOMMEND that you do so. *Click
>>>>>>>>>> here* for more information about how to do that" (or some such
>>>>>>>>>> verbiage).
>>>>>>>>>> 
>>>>>>>>>> My 2 cents.
>>>>>>>>>> 
>>>>>>>>>> Cheers,
>>>>>>>>>> 
>>>>>>>>>> - Bill
>>>>>>>>>> 
>>>>>>>>>> On Fri, Jan 8, 2016 at 9:04 AM, Robert Kowalski <rok@kowalski.gd>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi list,
>>>>>>>>>>> 
>>>>>>>>>>> At the end of the mail I would like to invite the other folks from the
>>>>>>>>>>> mailing list that build interfaces for humans (APIs, CLIs or even UIs)
>>>>>>>>>>> to chime in again with their opinions. So all people on the ML, the
>>>>>>>>>>> mail is not just a response to Paul, feedback is welcome :)
>>>>>>>>>>> 
>>>>>>>>>>> Hi Paul, I agree with the timeout. It could lead to very unpleasant
>>>>>>>>>>> errors which are hard to debug and support.
>>>>>>>>>>> 
>>>>>>>>>>> I added some thoughts to the other points you made:
>>>>>>>>>>> 
>>>>>>>>>>>> a) know that the slow queries logs exist,
>>>>>>>>>>> 
>>>>>>>>>>> Hmm... If I take a look at the 1.x logging it was very
>>>>>>>>>>> straightforward. As a developer you would spin up a CouchDB and you
>>>>>>>>>>> get all the log messages into your terminal. It was quite handy in
>>>>>>>>>>> general for all kinds of debugging. That the logs are not displayed
>>>>>>>>>>> directly on stdout/stderr is in my opinion a general 2.x problem. The
>>>>>>>>>>> problem does occur with all kinds of log messages we produce in
>>>>>>>>>>> CouchDB for 2.x and is not specific to the slow-query-logging.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> Ie, "You can try queries with testing:true, when you're ready to
>>>>>>>>>>>> move to production you can POST your selector to _index to create
>>>>>>>>>>>> the index which allows you to remove testing:true".
>>>>>>>>>>> 
>>>>>>>>>>> I really like the migration path you mentioned here with the API to
>>>>>>>>>>> create indexes. I am worried about having too high an entry barrier
>>>>>>>>>>> for absolute newcomers, people who want to play around before they
>>>>>>>>>>> are ready to think about indexes, e.g. by coupling the index topic
>>>>>>>>>>> to the querying from the beginning.
>>>>>>>>>>> 
>>>>>>>>>>> When I throw too many things to learn at people (who may not have
>>>>>>>>>>> used a database before), most people get discouraged and do not take
>>>>>>>>>>> a look. The usual things they feel or say are: "too complicated", "I
>>>>>>>>>>> have not enough time", "product XY is easier to use".
>>>>>>>>>>> 
>>>>>>>>>>> I would argue that newcomers to a database will not launch a high
>>>>>>>>>>> traffic, multi-gigabyte product with the database from day one. Day
>>>>>>>>>>> one is the day where they learn how to query the data and put data
>>>>>>>>>>> into the database. Even for scenarios where people have a running
>>>>>>>>>>> high traffic system, and have used other databases at a medium to
>>>>>>>>>>> large scale, I would expect, given they migrate to Couch, that they
>>>>>>>>>>> run both systems in parallel at first in order to fix the issues
>>>>>>>>>>> that occur during a migration.
>>>>>>>>>>> 
>>>>>>>>>>> I think we share the same goal (getting beginners started quickly)
>>>>>>>>>>> and the cool thing about your suggestion is that everyone gets the
>>>>>>>>>>> required knowledge to run a production system right from the very
>>>>>>>>>>> start. My suggestion leaves some parts out, but reduces the cognitive
>>>>>>>>>>> load required to get the very first basic results, e.g. in a
>>>>>>>>>>> university class setting - or junior developers on their "casual
>>>>>>>>>>> friday 20% time". My big hope is, once those folks build high traffic
>>>>>>>>>>> systems, they remember how easy the usage of CouchDB was and that
>>>>>>>>>>> they start to learn more about CouchDB in order to run it in a system
>>>>>>>>>>> with more than a few thousand documents.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> For us both I think the "what" is clear, but the "how" is a bit
>>>>>>>>>>> different. I also think this discussion still makes progress, but I
>>>>>>>>>>> am afraid it could stall. I see that we both have very good rudiments
>>>>>>>>>>> and I would like to invite the other folks from the mailing list that
>>>>>>>>>>> build interfaces for humans (APIs, CLIs or even UIs) to chime in
>>>>>>>>>>> again with their opinions - of course I'm also looking forward to
>>>>>>>>>>> your answer :)
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Robert :)
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, Jan 6, 2016 at 6:21 PM, Paul Davis
>>>>>>>>>>> <paul.joseph.davis@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> - is a timeout solving the root cause or the symptoms? Could it
>>>>>>>>>>>>>> be a temporary or additional step as in conjunction with query
>>>>>>>>>>>>>> optimisation tooling?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> It really depends. From my CouchDB admin and user perspective,
>>>>>>>>>>>>> this doesn't seem so important to me right now. However, I
>>>>>>>>>>>>> recognize that there are different usage scenarios with different
>>>>>>>>>>>>> requirements (e.g. the ones at Cloudant).
>>>>>>>>>>>> 
>>>>>>>>>>>> I don't think there's anything special about Cloudant in this
>>>>>>>>>>>> discussion. It's just a question of how do we allow new users the
>>>>>>>>>>>> ability to easily test and learn the selector/query API while also
>>>>>>>>>>>> preventing them from going too far without creating indexes for
>>>>>>>>>>>> their queries. The slow queries messages are fine, but just as any
>>>>>>>>>>>> other database they don't really prompt the developer to make the
>>>>>>>>>>>> correct change. Ie, the developer has to be savvy enough to a) know
>>>>>>>>>>>> that the slow queries logs exist, b) understand that creating an
>>>>>>>>>>>> index would speed things up, and then c) know which index to create
>>>>>>>>>>>> based on the logged query.
>>>>>>>>>>>> 
>>>>>>>>>>>> In my experience, the group of users that we're concerned about in
>>>>>>>>>>>> this discussion most likely don't know about any of those three
>>>>>>>>>>>> things, hence why the current API is designed to force them to
>>>>>>>>>>>> learn about and understand indexes as part of learning the API.
>>>>>>>>>>>> Granted the `_id > null` trick muddies that learning process. I
>>>>>>>>>>>> would think that replacing the _id trick with `"testing": true` or
>>>>>>>>>>>> similar would be an obvious indication to users that this is a
>>>>>>>>>>>> dev/debug type feature and when they went to production they would
>>>>>>>>>>>> still be pushed to using an index. If we add the "create index from
>>>>>>>>>>>> selector" API then I think this would be a relatively
>>>>>>>>>>>> straightforward method of on-ramping to both the query and index
>>>>>>>>>>>> sides of the API. Ie, "You can try queries with testing:true, when
>>>>>>>>>>>> you're ready to move to production you can POST your selector to
>>>>>>>>>>>> _index to create the index which allows you to remove
>>>>>>>>>>>> testing:true".
>>>>>>>>>>>> 
>>>>>>>>>>>> That's also why I don't particularly care for the timeout approach.
>>>>>>>>>>>> It's a binary threshold that a user would (maybe) meet after some
>>>>>>>>>>>> unknown amount of time after they falsely believe their app is
>>>>>>>>>>>> working correctly. The feedback is "Everything is fine until it
>>>>>>>>>>>> isn't". Consider an app that's been working for a week or a month
>>>>>>>>>>>> or more that suddenly starts throwing timeouts for a query. From
>>>>>>>>>>>> the user's perspective the database broke because the query that
>>>>>>>>>>>> used to work fine no longer does. And then there's the follow on
>>>>>>>>>>>> question on how that timeout might instruct the user that they need
>>>>>>>>>>>> an index, and that the fix may be as easy as POSTing their selector
>>>>>>>>>>>> to the _index endpoint. Sure Google would most likely have the
>>>>>>>>>>>> answer if our docs are good enough, but by that point the developer
>>>>>>>>>>>> is probably already experiencing downtime if their app is live
>>>>>>>>>>>> which means they're frantically trying to fix the thing. From my
>>>>>>>>>>>> point of view, a few road blocks that guide developers towards the
>>>>>>>>>>>> correct usage early on would be better than letting them get to the
>>>>>>>>>>>> adrenaline fueled expletive fountain of downtime.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
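
Editorial sketch (not part of the original messages): two ideas from the
thread lend themselves to a small illustration. One is a warning entry
placed next to the result set, as in Robert's example. The other is the
"POST your selector to _index" migration path Paul describes. The Python
below is only a sketch under stated assumptions. The key name "_warning"
is one of several names floated in the thread, both helper functions are
hypothetical, and deriving index fields from the top-level keys of a
selector is a guess at what a "create index from selector" API might do,
not a description of any shipped behaviour. No HTTP request is made. The
second function only builds a body in the general shape that the Mango
_index endpoint accepts.

```python
from typing import Any

# Hypothetical key name taken from the thread's examples; the name was
# still being bikeshedded ("_warning", "slow_warning", "_slow_warning").
WARNING_KEY = "_warning"


def split_results(rows: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], list[str]]:
    """Separate real documents from warning entries mixed into a result list."""
    docs = [row for row in rows if WARNING_KEY not in row]
    warnings = [row[WARNING_KEY] for row in rows if WARNING_KEY in row]
    return docs, warnings


def index_body_from_selector(selector: dict[str, Any]) -> dict[str, Any]:
    """Build an index-creation body from the top-level fields of a selector.

    Assumption: only top-level field names are indexed. Keys that start
    with "$" (combination operators such as "$and" or "$or") are skipped.
    """
    fields = sorted(key for key in selector if not key.startswith("$"))
    return {"index": {"fields": fields}, "type": "json"}


if __name__ == "__main__":
    # The result shape from Robert's question earlier in the thread.
    rows = [
        {"_id": "foo", "_rev": "535"},
        {"_warning": "slow query, use an index for better performance"},
    ]
    docs, warnings = split_results(rows)
    print(docs)
    print(warnings)
    print(index_body_from_selector({"year": {"$gt": 2010}, "title": "Alien"}))
```

A client written this way keeps working whether or not a warning is
present, which matches the goal in the thread of lowering the entry
barrier while still nudging developers towards creating an index.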

