couchdb-user mailing list archives

From Jason Smith <...@iriscouch.com>
Subject Re: Our views keep failing with timeouts until a couchdb restart
Date Thu, 25 Aug 2011 02:41:31 GMT
(Migrated to user@)

On Thu, Aug 25, 2011 at 4:05 AM, Chris Stockton
<chrisstocktonaz@gmail.com> wrote:
> Hello,
>
> On Wed, Aug 24, 2011 at 1:53 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>> I bet you're hitting a bug we just recently fixed on trunk. Basically,
>> there was a possibility that errors in some of the JS functions would
>> end up causing a couchjs process to become unusable without removing
>> it from the list. Eventually there wouldn't be any spots left and
>> clients would get timeouts like you see.
>>
>> Patch is at [1]. If it doesn't apply cleanly, you really only need the
>> bits from couch_os_process.erl and couch_query_servers.erl. The rest
>> is just test code.
>>
>> https://github.com/apache/couchdb/commit/95da6f6f4246d2e8e86a3cf92ddf6487e46c10e9
>>
>
> Right after sending this I finally saw what the issue was. The bug we
> reported here: https://issues.apache.org/jira/browse/COUCHDB-1257?focusedCommentId=13090484#comment-13090484
> was, as a side effect, leaving lingering processes, ultimately leading
> to instability of couch.
>
> We are working to patch our reduce issue, and will look at applying
> that commit perhaps once it hits mainline couch?

Chris, I'm glad those bugs are identified and will be fixed soon. But
in the meantime, perhaps you can change your code to add robustness? I
can think of two ideas.

1. CouchDB has an odd, idiosyncratic feature where GET queries
produce side effects. At the HTTP level there are no side effects, but
as you can see, GETting a view can spawn couchjs processes and write
files to the disk. Perhaps you could add ?stale=ok to all of your
queries used in production. To my knowledge, stale=ok guarantees that
couchjs will not be involved in servicing that query. This protects
your users from seeing map/reduce errors. The downside is that you
must of course query the views yourself (without stale=ok) to keep
them current.

2. Design documents should never be published (used in production)
until their views are fully built. This is not a CouchDB bug, but
rather a lack of tooling. The technique is pretty simple. Publish your
design document under an alternative id: _design/example_staging.
Next, query the views (which you are conveniently already doing in
step 1!). When the views are fresh, with no bugs and everything looks
good, use HTTP COPY to promote _design/example_staging to
_design/example. That is an atomic software upgrade, and the views
will be ready instantly. Not bad!
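
Here is a rough sketch of both ideas in Python, standard library only.
Treat it as an illustration rather than tested tooling. The server
address, the database name "mydb", the view name "by_date", and all
function names are made up for the example. Only ?stale=ok, the
_design/example_staging and _design/example ids, and HTTP COPY come
from the text above. The Destination header (with ?rev= when the
target already exists) is how I understand CouchDB's COPY to work.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

COUCH = "http://localhost:5984"  # assumed server address


def view_url(db, ddoc, view, stale_ok=True, **params):
    """Build a view query URL. Production callers keep stale_ok=True."""
    query = dict(params)
    if stale_ok:
        # Ask CouchDB to answer from the index as it currently is.
        query["stale"] = "ok"
    url = "{}/{}/_design/{}/_view/{}".format(
        COUCH, quote(db, safe=""), quote(ddoc, safe=""), quote(view, safe="")
    )
    if query:
        url += "?" + urlencode(query)
    return url


def warm_view(db, ddoc, view):
    """Query WITHOUT stale=ok so the index gets updated.

    limit=0 is only there to keep the response small. Run this from
    your own job, out of band, never on behalf of a user.
    """
    with urlopen(view_url(db, ddoc, view, stale_ok=False, limit=0)) as resp:
        return resp.status


def promote_request(db, staging="example_staging", live="example",
                    live_rev=None):
    """Build (but do not send) the COPY request that promotes staging."""
    dest = "_design/" + live
    if live_rev is not None:
        # Overwriting an existing design document needs its current rev.
        dest += "?rev=" + live_rev
    return Request(
        "{}/{}/_design/{}".format(COUCH, quote(db, safe=""), staging),
        headers={"Destination": dest},
        method="COPY",
    )
```

Usage would be something like: call warm_view("mydb",
"example_staging", "by_date") until it returns and the results look
right, then urlopen(promote_request("mydb", live_rev=...)). As far as
I know, the promoted document reuses the already-built index because
its view definitions are identical, which is why the views are ready
immediately.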

Perhaps these ideas will help you work around your bugs. IMO, they are
good general policies anyway.

-- 
Iris Couch
