incubator-couchdb-user mailing list archives

From David Coallier <dav...@php.net>
Subject Re: Spontaneous reindex and View Group Indexer Finishing taking a long time
Date Fri, 01 Oct 2010 11:19:27 GMT
On 1 October 2010 09:36, Adrian Pemsel <apemsel@gmail.com> wrote:
> Hi,
>
> We have used CouchDB to generate community and user feeds for our small
> social network for a couple of months, with great success and performance.
> Yesterday however we noticed a strange behavior on our staging and live
> systems we do not yet fully understand. After a design document change all
> views were warmed so all indexes were recreated in a reasonable amount of
> time (couple of minutes), as expected. During the following testing CouchDB
> occasionally reindexed the whole design document again instead of doing
> incremental updates to the index as usual, which of course is a bad thing
> to happen on a live system. We are not yet sure what triggered this and
> cannot reproduce the behavior in a deterministic way, but it might have to
> do with deleting single documents (I will keep you updated on any new
> discoveries). We also noticed that while the PROCESSING phase of the reindex
> again happened with reasonable speed, VIEW GROUP INDEXER was in the
> "FINISHING" state for a very long time (more than 5 minutes), blocking all
> views.
> My questions are:
>
> 1. Under which conditions apart from a design document change would CouchDB
> reindex the whole view group instead of an incremental update?
> 2. Is there a common mistake that causes the reindex to hang for a long time
> in the FINISHING state?
>
> A bit more background info: This happened on both 1.0.0 with delayed_commits
> false as well as on 1.0.1 with delayed_commits on. The views are rather
> complicated maps with multiple emits but with simple "_sum" reduces.
>
> This might just as well be a bug in our code so no offence meant to the
> fantastic couchdb community ;-) Thanks for your help,
>

I would double-check that the fields you use in your map and reduce
functions actually exist in every document and have the type you expect.

For instance, consider this document:
{
    "_id": "xxx",
    "name": "example1",
    "age": 23
}

and

{
    "_id": "yyy",
    "name": "example2"
}

Then the following map and reduce functions:
map = function(doc) {
    emit(doc.name, doc.age);
}

reduce = function (key, values) {
    return sum(values);
}

Note: In practice you should just use "reduce: _sum", which takes
advantage of the native Erlang implementation.

The second document has no "age" field, so the map function emits an
undefined value for it. This case is going to generate an exception in
the JavaScript engine, stopping the indexing and re-spawning another
process that picks up the indexing where it left off. This caused some
massive interruptions in our indexing.

To fix that issue, make sure the fields you use are present and of the
expected type before you emit them.

So your function would look more like:

map = function (doc) {
    // Emit only when name is present and age parses to a real number.
    var age = parseInt(doc.age, 10);
    if (doc.name && !isNaN(age)) {
        emit(doc.name, age);
    }
};

reduce = function(key, values) { return sum(values); }

This is obviously just an example, but it shows why checking doc.age
matters: you are going to run a math operation on the emitted values,
and the JavaScript sum() function expects values of type number.
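
To see the effect of the guard, here is a minimal sketch that runs a
guarded map function over the example documents outside of CouchDB. The
emit() stub, the "rows" array, the sumValues() helper and the third
document ("zzz", with its age stored as a string) are assumptions made up
for this illustration; they are not part of CouchDB or of the original
views.

```javascript
// Stand-in for CouchDB's emit(): collects rows so the map function can be
// exercised outside the view server. "rows" is an assumption of this sketch.
var rows = [];
function emit(key, value) {
    rows.push({ key: key, value: value });
}

// Guarded map function: skips documents without a usable name or age.
function map(doc) {
    var age = parseInt(doc.age, 10);
    if (doc.name && !isNaN(age)) {
        emit(doc.name, age);
    }
}

// Plain JavaScript helper that adds up the emitted values (hypothetical,
// used here in place of the view server's sum()).
function sumValues(values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i];
    }
    return total;
}

var docs = [
    { _id: "xxx", name: "example1", age: 23 },
    { _id: "yyy", name: "example2" },             // no age: skipped
    { _id: "zzz", name: "example3", age: "41" }   // hypothetical: age as string
];
docs.forEach(map);

var total = sumValues(rows.map(function (r) { return r.value; }));
console.log(JSON.stringify(rows));
console.log(total);
```

Run with a plain JavaScript interpreter, this prints two rows (for
"example1" and "example3") and a total of 64; the document without an
age never reaches the reduce step.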

I hope that validating the types of the values you are using helps with
your problem :)

-- 
David Coallier
