incubator-couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Our views keep failing with timeouts until a couchdb restart
Date Thu, 25 Aug 2011 02:55:06 GMT
On Wed, Aug 24, 2011 at 9:41 PM, Jason Smith <jhs@iriscouch.com> wrote:
> (Migrated to user@)
>
> On Thu, Aug 25, 2011 at 4:05 AM, Chris Stockton
> <chrisstocktonaz@gmail.com> wrote:
>> Hello,
>>
>> On Wed, Aug 24, 2011 at 1:53 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>>> I bet you're hitting a bug we just recently fixed on trunk. Basically,
>>> there was a possibility that errors in some of the JS functions would
>>> end up causing a couchjs process to become unusable without removing
>>> it from the list. Eventually there wouldn't be any spots left and
>>> clients would get timeouts like you see.
>>>
>>> Patch is at [1]. If it doesn't apply cleanly, you really only need the
>>> bits from couch_os_process.erl and couch_query_servers.erl. The rest
>>> is just test code.
>>>
>>> [1] https://github.com/apache/couchdb/commit/95da6f6f4246d2e8e86a3cf92ddf6487e46c10e9
>>>
>>
>> Right after sending this I finally saw what the issue was: the bug we
>> reported at https://issues.apache.org/jira/browse/COUCHDB-1257?focusedCommentId=13090484#comment-13090484
>> was, as a side effect, leaving lingering processes, ultimately leading
>> to instability of couch.
>>
>> We are working to patch our reduce issue, and will look at applying
>> that commit perhaps once it hits mainline couch?
>
> Chris, I'm glad those bugs are identified and will be fixed soon. But
> in the meantime, perhaps you can change your code to add robustness? I
> can think of two ideas.
>
> 1. CouchDB has an odd, idiosyncratic feature where GET queries
> produce side-effects. From HTTP, there are no side-effects, but as you
> can see, GETting a view can spawn couchjs processes and write files to
> the disk. Perhaps you could add ?stale=ok to all of your queries used
> in production. To my knowledge, stale=ok guarantees that couchjs will
> not be involved in servicing that query. This protects your users from
> seeing map/reduce errors. The down-side is that you must of course
> query the views yourself to keep them current.
>

No, couchjs will still be required to service reductions. stale=ok
just means you don't have to wait for a possibly lengthy view build.
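
As a rough illustration of the kind of query being discussed, the sketch
below builds a view URL with stale=ok appended. The host, database, design
document, and view names are placeholders, and view_url is a hypothetical
helper, not part of any CouchDB client library.

```python
from urllib.parse import quote, urlencode


def view_url(base, db, ddoc, view, stale_ok=False, **params):
    """Build a CouchDB view URL; every name passed in is a placeholder."""
    path = "/".join([
        base.rstrip("/"),
        quote(db, safe=""),
        "_design",
        quote(ddoc, safe=""),
        "_view",
        quote(view, safe=""),
    ])
    if stale_ok:
        # stale=ok skips the wait for a view build; per the note above,
        # a reduce may still need couchjs.
        params["stale"] = "ok"
    query = urlencode(sorted(params.items()))
    return path + ("?" + query if query else "")


print(view_url("http://localhost:5984", "mydb", "example", "by_date",
               stale_ok=True))
```

Leaving stale_ok at its default produces the plain view URL, which is the
request that waits for the index to catch up.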

Also, "GETs cause side effects" is a bit misleading I think. A GET on
a view by default tries to return data from something equal to or
newer than the current database update_seq. If the current view state
hasn't caught up to the db's update_seq it waits. Classifying this as
"GETs have side effects" is really like saying "GET to a Rails app
causes side effects because you have to wait for the template to
render."

For those familiar with the internals, it's quite true that if the indexer
isn't running, a new GET request will trigger an update. But say we
change that so that view updates are tried every N seconds and a GET
itself will never trigger an update but might endure a delay of N-1
seconds before anything starts happening. You'd be hard pressed to say
that the "GET caused side effects" in that case yet the observable
behavior is the same: "sometimes it takes a while."
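
To make that comparison concrete, here is a toy model of the two
policies. It is not CouchDB's indexer code; the function names, the
build_seconds figure, and the timer are all made up for illustration.

```python
def read_delay_on_demand(view_seq, db_update_seq, build_seconds):
    """Delay seen by a GET when the request itself kicks off the update."""
    if view_seq >= db_update_seq:
        return 0.0  # index is already current; answer immediately
    return build_seconds


def read_delay_periodic(view_seq, db_update_seq, build_seconds,
                        seconds_to_next_tick):
    """Delay seen by a GET when only a timer ever starts index updates."""
    if view_seq >= db_update_seq:
        return 0.0
    # The request waits for the next scheduled update, then for the build.
    return seconds_to_next_tick + build_seconds


# Hypothetical numbers: index at seq 90, database at seq 100, 5 s build.
print(read_delay_on_demand(90, 100, 5.0))
print(read_delay_periodic(90, 100, 5.0, 3.0))
```

In both functions the client sees either no delay or some delay,
depending only on whether the index has caught up to update_seq.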

> 2. Design documents should never be published (used in production)
> until their views are fully built. This is not a CouchDB bug, but
> rather a lack of tooling. The technique is pretty simple. Publish your
> design document under an alternative id: _design/example_staging.
> Next, query the views (which you are conveniently already doing in
> step 1!). When the views are fresh, with no bugs and everything looks
> good, query with HTTP COPY to promote _design/example_staging to
> _design/example. That is an atomic software upgrade; and views will be
> ready instantly. Not bad!
>

"Should never be ..." is a bit too prescriptive for my taste. Bottom
line: views can take a non-trivial amount of time to build. Beware.
Though the advice for pre-building views and promoting is spot on if
you anticipate long view builds.
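
A minimal sketch of the promotion step, assuming the staging and live
ids from Jason's example and a placeholder host and database name.
promote_request is a hypothetical helper; it only builds the COPY
request so it can be inspected before anything is sent.

```python
from urllib.request import Request


def promote_request(base, db, staging_id, live_id, live_rev=None):
    """Build (but do not send) a COPY request from staging_id to live_id."""
    destination = live_id
    if live_rev is not None:
        # Overwriting an existing document needs its current revision.
        destination += "?rev=" + live_rev
    url = "/".join([base.rstrip("/"), db, staging_id])
    return Request(url, method="COPY", headers={"Destination": destination})


req = promote_request("http://localhost:5984", "mydb",
                      "_design/example_staging", "_design/example")
print(req.get_method(), req.full_url, req.get_header("Destination"))
```

Passing the request to urllib.request.urlopen would send it. Query the
staging views until they are built before promoting, as described above.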

> Perhaps these ideas will help you work around your bugs. IMO, they are
> good general policies anyway.
>
> --
> Iris Couch
>
