couchdb-dev mailing list archives

From Tony Sun <tony.sun...@gmail.com>
Subject Re: [POC] Mango Catch All Selector
Date Mon, 11 Jan 2016 18:55:23 GMT
Hi Robert,

    Building upon what others have stated above, what do you think about
the following:

    1) Let the user query without creating an index
    2) Return an error message with a new url that has
"slow/no_index/developer":true appended at the end. The message clearly
explains that this query will be slow, and that creating an index will be
more efficient. However, he or she can continue. The error message will
then have a link to point to our documentation.
    3) In Fauxton, there is a checkbox or button that also appends the
"slow/no_index/developer":true to the _find url. If the user clicks it,
then the same message pops up to notify the user.
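
    To make the three steps concrete, here is a rough Python sketch of how
the decision could look. It is only an illustration: the flag name, the
error fields and the function itself are hypothetical, the docs link is a
placeholder, and nothing here is existing CouchDB code.

```python
# Sketch only: the flag name and the response fields are hypothetical.
OPT_IN_FLAG = "no_index"  # the thread also floats "slow" and "developer"
DOCS_LINK = "http://..."  # placeholder for the documentation link

SLOW_NOTICE = (
    "No index matches this selector, so the query scans every document "
    "and will be slow on large databases. Creating an index is more "
    "efficient. See " + DOCS_LINK
)


def plan_find(selector, indexed_fields, opt_in=False):
    """Decide how a _find request would be answered under the proposal."""
    # An index on any top-level selector field counts as usable here.
    if any(field in indexed_fields for field in selector):
        return {"ok": True, "scan": "index"}
    if opt_in:
        # The user opted in (for example via the Fauxton checkbox):
        # run the full scan, but still show the same notice.
        return {"ok": True, "scan": "full", "warning": SLOW_NOTICE}
    # No index and no opt-in: refuse, explain why, and show how to continue.
    return {
        "ok": False,
        "error": "no_usable_index",
        "reason": SLOW_NOTICE,
        "retry_with": {OPT_IN_FLAG: True},
    }
```

    A first call without the flag returns the error together with the
"retry_with" hint; repeating the call with opt_in=True runs the scan and
carries the same notice as a warning.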



Tony



On Mon, Jan 11, 2016 at 9:45 AM, Eli Stevens (Gmail) <wickedgrey@gmail.com>
wrote:

> Just wanted to chime in here as a user - I've run into similar
> behavior from CouchDB with the reduce-not-reducing-enough heuristic,
> where stuff I was working on went smoothly in dev, but stopped once
> real load was pushed through it (thankfully for me, that was in
> testing, rather than released to customers).
>
> It's a frustrating experience, and I don't think that a reputation for
> "works until you cross a threshold, and then it doesn't, but only in
> production" is a good thing to move towards.
>
> Perhaps something like adding a key to the returned data along the
> lines of "_slow_warning": "This query is going to be slow on large
> data sets. See http://..." in addition to the ?slow_warning=true query
> param (note that I'm calling it "slow_warning" in both places only to
> increase discoverability; without the url param, the no-index query
> wouldn't work at all). Bikeshed the name as needed.
>
> I'd like to see a lot more URLs in CouchDB error messages in general,
> actually - I would find it very useful when trying to determine what's
> going wrong to have a URL right there in the logs that I can get more
> information from.
>
> On Sun, Jan 10, 2016 at 11:54 AM, Joan Touzet <wohali@apache.org> wrote:
> > Hi Robert,
> >
> > I've been thinking about this one for the past week or so, and I have a
> > simple suggestion:
> >
> >   Add the query parameter slow=true to enable this behaviour.
> >
> > This meets all the original requirements:
> >
> > 1. It is not default behaviour
> > 2. You can grep the log files for the word 'slow' and find evidence
> > 3. There is a shorthand, simple way to enable the behaviour
> > 4. Any self-respecting developer will try to remove slow=true, find
> >    a break, and be forced to learn about indexes
> > 5. It's a bit cheeky, which I think is kind of fun :D
> >
> > All the best,
> > Joan
> >
> > ----- Original Message -----
> >> From: "William Edney" <bedney@technicalpursuit.com>
> >> To: dev@couchdb.apache.org
> >> Sent: Friday, January 8, 2016 10:27:29 AM
> >> Subject: Re: [POC] Mango Catch All Selector
> >>
> >> Hi Robert -
> >>
> >> As a builder of UI, API and library code who has also done developer
> >> training on a variety of technologies, one simple fix might be to go
> >> ahead and
> >> not require indexes to be built, but then to put a big NOTE at the
> >> beginning of the "Mango Getting Started" guide (I would assume there
> >> is
> >> such a piece of documentation) that states: "Note that the examples
> >> in this
> >> document do not require you to build an index, but for performance
> >> reasons
> >> we HIGHLY RECOMMEND that you do so. *Click here* for more information
> >> about
> >> how to do that" (or some such verbiage).
> >>
> >> My 2 cents.
> >>
> >> Cheers,
> >>
> >> - Bill
> >>
> >> On Fri, Jan 8, 2016 at 9:04 AM, Robert Kowalski <rok@kowalski.gd>
> >> wrote:
> >>
> >> > Hi list,
> >> >
> >> > At the end of the mail I would like to invite the other folks from
> >> > the
> >> > mailing list that build interfaces for humans (APIs, CLIs or even
> >> > UIs)
> >> > to chime in again with their opinions. So all people on the ML,
> >> > the
> >> > mail is not just a response to Paul, feedback is welcome :)
> >> >
> >> > Hi Paul, I agree with the timeout. It could lead to very unpleasant
> >> > errors which are hard to debug and support.
> >> >
> >> > I added some thoughts to the other points you made:
> >> >
> >> > > a) know that the slow queries logs exist,
> >> >
> >> > Hmm... If I take a look at the 1.x logging it was very
> >> > straightforward. As a developer you would spin up a CouchDB and you
> >> > get all the log messages into your terminal. It was quite handy in
> >> > general for all kinds of debugging. That the logs are not displayed
> >> > directly on stdout/stderr is in my opinion a general 2.x problem.
> >> > The
> >> > problem does occur with all kinds of log messages we produce in
> >> > CouchDB
> >> > for 2.x and is not specific to the slow-query-logging.
> >> >
> >> >
> >> > > Ie, "You can try queries with testing:true, when you're ready to
> >> > > move to
> >> > production you can
> >> > > POST your selector to _index to create the index which allows you
> >> > > to
> >> > > remove testing:true".
> >> >
> >> > I really like the migration path you mentioned here with the API to
> >> > create indexes. I am worried about having too high an entry barrier for
> >> > absolute newcomers, people that you want to play around before they
> >> > are ready to think about indexes, e.g. by coupling the
> >> > index
> >> > topic from the beginning to the querying.
> >> >
> >> > When I throw too many things to learn at people (who may not
> >> > have
> >> > used a database before), most people get discouraged and do not
> >> > take
> >> > a look. The usual things they feel or say are : "too complicated",
> >> > "I
> >> > have not enough time", "product XY is easier to use".
> >> >
> >> > I would argue that newcomers to a database will not launch a high
> >> > traffic,
> >> > multi-gigabyte product with the database from day one. Day one is
> >> > the
> >> > day where they learn how to query the data and put data into the
> >> > database. Even for scenarios where people have a running high
> >> > traffic
> >> > system, and have used other databases at a medium to large scale I
> >> > would expect given they migrate to Couch, that they run both
> >> > systems
> >> > in parallel for the first time in order to fix the issues that
> >> > occur
> >> > during a migration.
> >> >
> >> > I think we share the same goal (getting beginners started
> >> > quickly)
> >> > and the cool thing about your suggestion is that everyone gets the
> >> > required knowledge to run a production system right from the very
> >> > start. My suggestion leaves some parts out, but reduces the
> >> > cognitive
> >> > load required to get the very first basic results, e.g. in a
> >> > university class setting - or junior developers on their "casual
> >> > friday 20% time". My big hope is, once those folks build high
> >> > traffic
> >> > systems, they remember how easy the usage of CouchDB was and that
> >> > they
> >> > start to learn more about CouchDB in order to run it in a system
> >> > with
> >> > more than a few thousand documents.
> >> >
> >> >
> >> > For us both I think the "what" is clear, but the "how" is a bit
> >> > different. I also think this discussion still makes progress, but I
> >> > am
> >> > afraid it could stall. I see that we both have very good rudiments
> >> > and
> >> > I would like to invite the other folks from the mailing list that
> >> > build interfaces for humans (APIs, CLIs or even UIs) to chime in
> >> > again
> >> > with their opinions - of course I'm also looking forward to your
> >> > answer :)
> >> >
> >> > Best,
> >> > Robert :)
> >> >
> >> > On Wed, Jan 6, 2016 at 6:21 PM, Paul Davis
> >> > <paul.joseph.davis@gmail.com>
> >> > wrote:
> >> > >>> - is a timeout solving the root cause or the symptoms? Could it
> >> > >>> be a
> >> > >>> temporary or additional step as in conjunction with query
> >> > >>> optimisation
> >> > >>> tooling?
> >> > >>
> >> > >> It really depends. From my CouchDB admin and user perspective,
> >> > >> this
> >> > >> doesn't seem so important to me right now. However, I recognize
> >> > >> that
> >> > >> there are different usage scenarios with different requirements
> >> > >> (e.g. the
> >> > >> ones at Cloudant).
> >> > >
> >> > > I don't think there's anything special about Cloudant in this
> >> > > discussion. It's just a question of how do we allow new users the
> >> > > ability to easily test and learn the selector/query API while
> >> > > also
> >> > > preventing them from going too far without creating indexes for
> >> > > their
> >> > > queries. The slow queries messages are fine, but just as any
> >> > > other
> >> > > database they don't really prompt the developer to make the
> >> > > correct
> >> > > change. Ie, the developer has to be savvy enough to a) know that
> >> > > the
> >> > > slow queries logs exist, b) understand that creating an index
> >> > > would
> >> > > speed things up, and then c) know which index to create based on
> >> > > the
> >> > > logged query.
> >> > >
> >> > > In my experience, the group of users that we're concerned about
> >> > > in
> >> > > this discussion most likely don't know about any of those three
> >> > > things, hence why the current API is designed to force them to
> >> > > learn
> >> > > about and understand indexes as part of learning the API. Granted
> >> > > the
> >> > > `_id > null` trick muddies that learning process. I would think
> >> > > that
> >> > > replacing the _id trick with `"testing": true` or similar would
> >> > > be an
> >> > > obvious indication to users that this is a dev/debug type feature
> >> > > and
> >> > > when they went to production they would still be pushed to using
> >> > > an
> >> > > index. If we add the "create index from selector" API then I
> >> > > think
> >> > > this would be a relatively straightforward method of on-ramping
> >> > > to
> >> > > both the query and index sides of the API. Ie, "You can try
> >> > > queries
> >> > > with testing:true, when you're ready to move to production you
> >> > > can
> >> > > POST your selector to _index to create the index which allows you
> >> > > to
> >> > > remove testing:true".
> >> > >
> >> > > That's also why I don't particularly care for the timeout
> >> > > approach.
> >> > > It's a binary threshold that a user would (maybe) meet after some
> >> > > unknown amount of time after they falsely believe their app is
> >> > > working
> >> > > correctly. The feedback is "Everything is fine until it isn't".
> >> > > Consider an app that's been working for a week or a month or more
> >> > > that
> >> > > suddenly starts throwing timeouts for a query. From the user's
> >> > > perspective the database broke because the query that used to
> >> > > work
> >> > > fine no longer does. And then there's the follow on question on
> >> > > how
> >> > > that timeout might instruct the user that they need an index, and
> >> > > that
> >> > > the fix may be as easy as POSTing their selector to the _index
> >> > > endpoint. Sure Google would most likely have the answer if our
> >> > > docs
> >> > > are good enough, but by that point the developer is probably
> >> > > already
> >> > > experiencing downtime if their app is live which means they're
> >> > > frantically trying to fix the thing. From my point of view, a few
> >> > > road
> >> > > blocks that guide developers towards the correct usage early on
> >> > > would
> >> > > be better than letting them get to the adrenaline fueled
> >> > > expletive
> >> > > fountain of downtime.
> >> >
> >>
>
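
For reference, the migration path Paul describes in the quoted text (try
the query first, then POST to _index before going to production) could be
sketched in Python as below. The request body shape ("index" with "fields",
plus "name" and "type") follows the existing POST /{db}/_index endpoint.
The helper that derives fields from a selector is hypothetical, since
"create index from selector" is only a proposal in this thread, and it is
a simplification that collects field names without checking whether the
operators involved can actually use an index.

```python
def index_fields_from_selector(selector):
    """Collect field names from a Mango selector, in first-seen order."""
    fields = []
    for key, value in selector.items():
        if key.startswith("$"):
            # Combination operators such as $and hold a list of sub-selectors.
            if isinstance(value, list):
                for sub in value:
                    if isinstance(sub, dict):
                        for name in index_fields_from_selector(sub):
                            if name not in fields:
                                fields.append(name)
            continue
        if key not in fields:
            fields.append(key)
    return fields


def build_index_request(selector, name):
    """Build a JSON body suitable for POST /{db}/_index (hypothetical helper)."""
    return {
        "index": {"fields": index_fields_from_selector(selector)},
        "name": name,
        "type": "json",
    }
```

For a selector such as {"type": "user", "$and": [{"age": {"$gt": 21}}]} the
helper produces an index on "type" and "age"; the index name is whatever
the caller chooses.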
