couchdb-dev mailing list archives

From Robert Kowalski <...@kowalski.gd>
Subject Re: [POC] Mango Catch All Selector
Date Sat, 13 Feb 2016 01:27:04 GMT
the new behaviour for mango landed this week on master, i hope you all enjoy it!

please report any bugs, problems, feedback and also praise :)

On Mon, Jan 18, 2016 at 11:59 AM, Jan Lehnardt <jan@apache.org> wrote:
> This is awesome: +1
>
>
>> On 18 Jan 2016, at 00:16, Robert Kowalski <rok@kowalski.gd> wrote:
>>
>> Heya,
>>
>> thanks again for all the feedback! I built a prototype and added a demo video!
>>
>>> I think the current design constraint around text is a good one, and I'm
>>> unconvinced including English text is a good direction.
>>>
>>> If you want to take this direction, including a URL to our documentation
>>> instead (which *is* internationalized) is probably a better way to go,
>>> something like:
>>> .... {"_warning": "http://docs.couchdb.org/en/2.0.0/....."}]
>>
>> I really like this idea! I thought long about it and I think it grows
>> the scope of the current task. Right now all strings CouchDB returns
>> to the user are written in English. The current message that no index
>> exists is also in English. Sadly our documentation is not
>> internationalised yet - afaik no language has a complete translation
>> and the translations are not available as a website or in any other
>> public form. I stopped translating to German myself as the promised
>> integration into the doc build was never finished in ~1.5 years. For
>> the specific task right now I would like to keep the scope as small as
>> possible. This does not mean that I would stand in the way if folks
>> want to add i18n to the project and its sub-projects and have the
>> tooling and time to maintain it.
>>
>>
>> Because a prototype speaks more than 1000 posts I hacked a prototype
>> which includes the warning that was proposed by Garren. You can check
>> it out at https://github.com/apache/couchdb-mango/pull/27 - or watch
>> the video: https://cloudup.com/cEnbWqbX5Y7
>>
>> What do you think?
>>
>> On Wed, Jan 13, 2016 at 11:58 PM, Jan Lehnardt <jan@apache.org> wrote:
>>>
>>>> On 13 Jan 2016, at 23:41, Joan Touzet <wohali@apache.org> wrote:
>>>>
>>>> Warning: If we start using English text in a response such as this, we'll
>>>> need to start externalising strings and internationalising them. We've never
>>>> had to do this before because our API is, in general, terse and relies on
>>>> HTTP status codes to indicate when something has gone wrong.
>>>>
>>>> I think the current design constraint around text is a good one, and I'm
>>>> unconvinced including English text is a good direction.
>>>>
>>>> If you want to take this direction, including a URL to our documentation
>>>> instead (which *is* internationalized) is probably a better way to go,
>>>> something like:
>>>>
>>>> .... {"_warning": "http://docs.couchdb.org/en/2.0.0/....."}]
>>>
>>> bikeshed: maybe slow_warning (like we use not_found on 404s), but yeah,
>>> something like this!
>>>
>>> Great discussion everyone. I like how we are all making this idea
>>> better together :)
>>>
>>> Best
>>> Jan
>>> --
>>>
>>>
>>>
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: "Robert Kowalski" <rok@kowalski.gd>
>>>> To: dev@couchdb.apache.org
>>>> Sent: Wednesday, January 13, 2016 2:47:27 PM
>>>> Subject: Re: [POC] Mango Catch All Selector
>>>>
>>>> Hi Garren,
>>>>
>>>> what would selector: null do? Return all docs?
>>>>
>>>> Where in the answer from CouchDB would be the warning? Next to the
>>>> resultset, like
>>>>
>>>> [{"_id": "foo", "_rev": "535"}, {"_warning": "slow query, use an index for
>>>> better performance"}] ?
>>>>
>>>> On Wednesday, 13 January 2016, Garren Smith wrote:
>>>>
>>>>> Hi Robert,
>>>>>
>>>>> I think you misunderstood me, I don’t want it to be a different endpoint.
>>>>> I just don’t want a user to have to do queries like this find({slow:
>>>>> true}). I want them to be able to do a query e.g. find({}) or
>>>>> find({selector: null}) and then get back the results along with a warning
>>>>> message telling them that this query would be slow in production.
>>>>> The lower the barrier for entry here the better. I know we want to protect
>>>>> our users for when they go to production, but forcing them to add a slow:
>>>>> true flag won’t help. It will still require them to read the docs a lot
>>>>> more than most people are willing to on a first attempt of something new.
>>>>>
>>>>> Cheers
>>>>> Garren
>>>>>> On 12 Jan 2016, at 9:16 PM, Robert Kowalski <rok@kowalski.gd> wrote:
>>>>>>
>>>>>> thank you all for your feedback!
>>>>>>
>>>>>> i like the idea of the error message with a new url.
>>>>>>
>>>>>> i agree with garren that it should be a separate endpoint. it takes
>>>>>> some complexity off when explaining each endpoint.
>>>>>>
>>>>>> maybe: `/_find_slow`?
>>>>>>
>>>>>> On Tue, Jan 12, 2016 at 10:36 AM, Jan Lehnardt <jan@apache.org> wrote:
>>>>>>>
>>>>>>>> On 11 Jan 2016, at 19:55, Tony Sun <tony.sun427@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Robert,
>>>>>>>>
>>>>>>>> Building upon what others have stated above, what do you think about
>>>>>>>> the following:
>>>>>>>>
>>>>>>>> 1) Let the user query without creating an index
>>>>>>>> 2) Return an error message with a new url that has
>>>>>>>> "slow/no_index/developer":true appended at the end. The message clearly
>>>>>>>> explains that this query will be slow, and that creating an index will be
>>>>>>>> more efficient. However, he or she can continue. The error message will
>>>>>>>> then have a link to point to our documentation.
>>>>>>>> 3) In Fauxton, there is a checkbox or button that also appends the
>>>>>>>> "slow/no_index/developer":true to the _find url. If the user clicks it,
>>>>>>>> then the same message pops up to notify the user.
>>>>>>>
>>>>>>>
>>>>>>> I like this!
>>>>>>>
>>>>>>>
>>>>>>> Jan
>>>>>>> --
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Tony
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jan 11, 2016 at 9:45 AM, Eli Stevens (Gmail) <wickedgrey@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Just wanted to chime in here as a user - I've run into similar
>>>>>>>>> behavior from CouchDB with the reduce-not-reducing-enough heuristic,
>>>>>>>>> where stuff I was working on went smoothly in dev, but stopped once
>>>>>>>>> real load was pushed through it (thankfully for me, that was in
>>>>>>>>> testing, rather than released to customers).
>>>>>>>>>
>>>>>>>>> It's a frustrating experience, and I don't think that a reputation for
>>>>>>>>> "works until you cross a threshold, and then it doesn't, but only in
>>>>>>>>> production" is a good thing to move towards.
>>>>>>>>>
>>>>>>>>> Perhaps something like adding a key to the returned data along the
>>>>>>>>> lines of "_slow_warning": "This query is going to be slow on large
>>>>>>>>> data sets. See http://..." in addition to the ?slow_warning=true query
>>>>>>>>> param (note that I'm calling it "slow_warning" in both places only to
>>>>>>>>> increase discoverability; without the url param, the no-index query
>>>>>>>>> wouldn't work at all). Bikeshed the name as needed.
>>>>>>>>>
>>>>>>>>> I'd like to see a lot more URLs in CouchDB error messages in general,
>>>>>>>>> actually - I would find it very useful when trying to determine what's
>>>>>>>>> going wrong to have a URL right there in the logs that I can get more
>>>>>>>>> information from.
>>>>>>>>>
>>>>>>>>> On Sun, Jan 10, 2016 at 11:54 AM, Joan Touzet <wohali@apache.org> wrote:
>>>>>>>>>> Hi Robert,
>>>>>>>>>>
>>>>>>>>>> I've been thinking about this one for a week or so, and I have a
>>>>>>>>>> simple suggestion:
>>>>>>>>>>
>>>>>>>>>> Add the query parameter slow=true to enable this behaviour.
>>>>>>>>>>
>>>>>>>>>> This meets all the original requirements:
>>>>>>>>>>
>>>>>>>>>> 1. It is not default behaviour
>>>>>>>>>> 2. You can grep the log files for the word 'slow' and find evidence
>>>>>>>>>> 3. There is a shorthand, simple way to enable the behaviour
>>>>>>>>>> 4. Any self-respecting developer will try to remove slow=true, find
>>>>>>>>>> a break, and be forced to learn about indexes
>>>>>>>>>> 5. It's a bit cheeky, which I think is kind of fun :D
>>>>>>>>>>
>>>>>>>>>> All the best,
>>>>>>>>>> Joan
>>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>> From: "William Edney" <bedney@technicalpursuit.com>
>>>>>>>>>>> To: dev@couchdb.apache.org
>>>>>>>>>>> Sent: Friday, January 8, 2016 10:27:29 AM
>>>>>>>>>>> Subject: Re: [POC] Mango Catch All Selector
>>>>>>>>>>>
>>>>>>>>>>> Hi Robert -
>>>>>>>>>>>
>>>>>>>>>>> As a builder of UI, API and library code who has also done developer
>>>>>>>>>>> training on a variety of technologies, one simple fix might be to go
>>>>>>>>>>> ahead and not require indexes to be built, but then to put a big NOTE
>>>>>>>>>>> at the beginning of the "Mango Getting Started" guide (I would assume
>>>>>>>>>>> there is such a piece of documentation) that states: "Note that the
>>>>>>>>>>> examples in this document do not require you to build an index, but for
>>>>>>>>>>> performance reasons we HIGHLY RECOMMEND that you do so. *Click here* for
>>>>>>>>>>> more information about how to do that" (or some such verbiage).
>>>>>>>>>>>
>>>>>>>>>>> My 2 cents.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> - Bill
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 8, 2016 at 9:04 AM, Robert Kowalski <rok@kowalski.gd> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi list,
>>>>>>>>>>>>
>>>>>>>>>>>> At the end of the mail I would like to invite the other folks from
>>>>>>>>>>>> the mailing list that build interfaces for humans (APIs, CLIs or even
>>>>>>>>>>>> UIs) to chime in again with their opinions. So, to everyone on the ML:
>>>>>>>>>>>> this mail is not just a response to Paul, feedback is welcome :)
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Paul, I agree with you about the timeout. It could lead to very
>>>>>>>>>>>> unpleasant errors which are hard to debug and support.
>>>>>>>>>>>>
>>>>>>>>>>>> I added some thoughts to the other points you made:
>>>>>>>>>>>>
>>>>>>>>>>>>> a) know that the slow queries logs exist,
>>>>>>>>>>>>
>>>>>>>>>>>> Hmm... If I take a look at the 1.x logging, it was very
>>>>>>>>>>>> straightforward. As a developer you would spin up a CouchDB and you
>>>>>>>>>>>> get all the log messages into your terminal. It was quite handy in
>>>>>>>>>>>> general for all kinds of debugging. That the logs are not displayed
>>>>>>>>>>>> directly on stdout/stderr is in my opinion a general 2.x problem. The
>>>>>>>>>>>> problem occurs with all kinds of log messages we produce in CouchDB
>>>>>>>>>>>> for 2.x and is not specific to the slow-query logging.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Ie, "You can try queries with testing:true, when you're ready to
>>>>>>>>>>>>> move to production you can POST your selector to _index to create
>>>>>>>>>>>>> the index which allows you to remove testing:true".
>>>>>>>>>>>>
>>>>>>>>>>>> I really like the migration path you mentioned here with the API to
>>>>>>>>>>>> create indexes. I am worried about having too high an entry barrier
>>>>>>>>>>>> for absolute newcomers, people who want to play around before they
>>>>>>>>>>>> are ready to think about indexes, e.g. by coupling the index topic
>>>>>>>>>>>> to the querying from the beginning.
>>>>>>>>>>>>
>>>>>>>>>>>> When I throw too many things to learn at people (who may not have
>>>>>>>>>>>> used a database before), most people get discouraged and do not take
>>>>>>>>>>>> a look. The usual things they feel or say are: "too complicated", "I
>>>>>>>>>>>> don't have enough time", "product XY is easier to use".
>>>>>>>>>>>>
>>>>>>>>>>>> I would argue that newcomers to a database will not launch a
>>>>>>>>>>>> high-traffic, multi-gigabyte product with the database from day one.
>>>>>>>>>>>> Day one is the day where they learn how to query the data and put data
>>>>>>>>>>>> into the database. Even for scenarios where people have a running
>>>>>>>>>>>> high-traffic system, and have used other databases at a medium to
>>>>>>>>>>>> large scale, I would expect, given they migrate to Couch, that they
>>>>>>>>>>>> run both systems in parallel at first in order to fix the issues that
>>>>>>>>>>>> occur during a migration.
>>>>>>>>>>>>
>>>>>>>>>>>> I think we share the same goal (getting beginners started quickly)
>>>>>>>>>>>> and the cool thing about your suggestion is that everyone gets the
>>>>>>>>>>>> required knowledge to run a production system right from the very
>>>>>>>>>>>> start. My suggestion leaves some parts out, but reduces the cognitive
>>>>>>>>>>>> load required to get the very first basic results, e.g. in a
>>>>>>>>>>>> university class setting - or junior developers on their "casual
>>>>>>>>>>>> friday 20% time". My big hope is, once those folks build high-traffic
>>>>>>>>>>>> systems, they remember how easy the usage of CouchDB was and that they
>>>>>>>>>>>> start to learn more about CouchDB in order to run it in a system with
>>>>>>>>>>>> more than a few thousand documents.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> For us both I think the "what" is clear, but the "how" is a bit
>>>>>>>>>>>> different. I also think this discussion still makes progress, but I am
>>>>>>>>>>>> afraid it could stall. I see that we both have very good rudiments and
>>>>>>>>>>>> I would like to invite the other folks from the mailing list that
>>>>>>>>>>>> build interfaces for humans (APIs, CLIs or even UIs) to chime in again
>>>>>>>>>>>> with their opinions - of course I'm also looking forward to your
>>>>>>>>>>>> answer :)
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Robert :)
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jan 6, 2016 at 6:21 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>>>>>>>>>>>>>>> - is a timeout solving the root cause or the symptoms? Could it be a
>>>>>>>>>>>>>>> temporary or additional step as in conjunction with query
>>>>>>>>>>>>>>> optimisation tooling?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It really depends. From my CouchDB admin and user perspective, this
>>>>>>>>>>>>>> doesn't seem so important to me right now. However, I recognize that
>>>>>>>>>>>>>> there are different usage scenarios with different requirements
>>>>>>>>>>>>>> (e.g. the ones at Cloudant).
>>>>>>>>>>>>>
>>>>>>>>>>>>> I don't think there's anything special about Cloudant in this
>>>>>>>>>>>>> discussion. It's just a question of how do we allow new users the
>>>>>>>>>>>>> ability to easily test and learn the selector/query API while also
>>>>>>>>>>>>> preventing them from going too far without creating indexes for their
>>>>>>>>>>>>> queries. The slow queries messages are fine, but just as with any
>>>>>>>>>>>>> other database they don't really prompt the developer to make the
>>>>>>>>>>>>> correct change. Ie, the developer has to be savvy enough to a) know
>>>>>>>>>>>>> that the slow queries logs exist, b) understand that creating an index
>>>>>>>>>>>>> would speed things up, and then c) know which index to create based on
>>>>>>>>>>>>> the logged query.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In my experience, the group of users that we're concerned about in
>>>>>>>>>>>>> this discussion most likely don't know about any of those three
>>>>>>>>>>>>> things, hence why the current API is designed to force them to learn
>>>>>>>>>>>>> about and understand indexes as part of learning the API. Granted the
>>>>>>>>>>>>> `_id > null` trick muddies that learning process. I would think that
>>>>>>>>>>>>> replacing the _id trick with `"testing": true` or similar would be an
>>>>>>>>>>>>> obvious indication to users that this is a dev/debug type feature and
>>>>>>>>>>>>> when they went to production they would still be pushed to using an
>>>>>>>>>>>>> index. If we add the "create index from selector" API then I think
>>>>>>>>>>>>> this would be a relatively straightforward method of on-ramping to
>>>>>>>>>>>>> both the query and index sides of the API. Ie, "You can try queries
>>>>>>>>>>>>> with testing:true, when you're ready to move to production you can
>>>>>>>>>>>>> POST your selector to _index to create the index which allows you to
>>>>>>>>>>>>> remove testing:true".
>>>>>>>>>>>>>
>>>>>>>>>>>>> That's also why I don't particularly care for the timeout approach.
>>>>>>>>>>>>> It's a binary threshold that a user would (maybe) meet after some
>>>>>>>>>>>>> unknown amount of time after they falsely believe their app is working
>>>>>>>>>>>>> correctly. The feedback is "Everything is fine until it isn't".
>>>>>>>>>>>>> Consider an app that's been working for a week or a month or more that
>>>>>>>>>>>>> suddenly starts throwing timeouts for a query. From the user's
>>>>>>>>>>>>> perspective the database broke because the query that used to work
>>>>>>>>>>>>> fine no longer does. And then there's the follow on question on how
>>>>>>>>>>>>> that timeout might instruct the user that they need an index, and that
>>>>>>>>>>>>> the fix may be as easy as POSTing their selector to the _index
>>>>>>>>>>>>> endpoint. Sure Google would most likely have the answer if our docs
>>>>>>>>>>>>> are good enough, but by that point the developer is probably already
>>>>>>>>>>>>> experiencing downtime if their app is live which means they're
>>>>>>>>>>>>> frantically trying to fix the thing. From my point of view, a few road
>>>>>>>>>>>>> blocks that guide developers towards the correct usage early on would
>>>>>>>>>>>>> be better than letting them get to the adrenaline fueled expletive
>>>>>>>>>>>>> fountain of downtime.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>
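
The behaviour discussed in this thread (a query without an index still returns results, plus a warning) could be handled on the client side roughly as below. This is only a sketch under assumptions: the thread floated several key names ("_warning", "slow_warning", "_slow_warning") and never settles on one, so the top-level "warning" key, the "docs" key, the warning text and the sample response are all illustrative and not taken from the merged code. The helper names build_find_request and split_response are made up for this example.

```python
import json

# Assumption: the warning is a top-level key next to the result documents.
# The thread does not fix the name; change this constant to match the server.
WARNING_KEY = "warning"


def build_find_request(selector, fields=None):
    """Build the JSON body for a POST to the _find endpoint."""
    body = {"selector": selector}
    if fields is not None:
        body["fields"] = fields
    return json.dumps(body)


def split_response(raw_body):
    """Return (docs, warning) from a response body; warning is None if absent."""
    payload = json.loads(raw_body)
    return payload.get("docs", []), payload.get(WARNING_KEY)


# Hypothetical response body, written by hand for illustration only.
sample = json.dumps({
    WARNING_KEY: "no matching index found, create an index to optimize query time",
    "docs": [{"_id": "foo", "_rev": "1-abc"}],
})

docs, warning = split_response(sample)
if warning is not None:
    # Surface the hint during development instead of silently ignoring it.
    print("slow query:", warning)
```

Keeping the warning separate from the documents means client code that only iterates over the results keeps working, while development tooling can still show the hint.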
