incubator-couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: unexpected behavior in view generation
Date Mon, 28 Sep 2009 21:20:56 GMT
On Mon, Sep 28, 2009 at 5:10 PM, James Marca
<jmarca@translab.its.uci.edu> wrote:
> Hi All,
>
> I ran across some behavior this weekend that I didn't expect.
>
> I have a lot of data, and I'm trialing storing the raw data in
> CouchDB.  I have 12 databases (manually "sharded" by month), each with
> about 115,000,000 documents, and an average size of about 25G.  I
> created a view index for each database, and was updating away, when I
> noticed a typo in the October database.  I corrected the typo (in
> Futon), reloaded the view page (again in futon), and *assumed* that
> the old view indexing job was killed and restarted with the new code.
> In fact, what happened was that the old, incorrect index job kept
> running (for 48 hours) and when it finished, it restarted.
>
> Is this a minor bug, or did I miss an option somewhere?
>

I'd call it a bad inconvenience for large data sets. Technically, what
happened was that the indexer went through and indexed all of your
data with the bad version. The last change it saw was your edit to the
design doc, which reset the indexes and started things over. That is
the intended behavior.
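
To make that ordering concrete, here is a toy model in Python. It is
only a sketch of the sequence described above, not CouchDB's indexer
code: the names signature and build_index are hypothetical, and the
MD5 hash of the view definitions is just an illustrative way to detect
that the design doc changed.

```python
import hashlib
import json


def signature(design_doc):
    """Hash the view definitions; editing a map function changes the hash."""
    views = json.dumps(design_doc["views"], sort_keys=True)
    return hashlib.md5(views.encode("utf-8")).hexdigest()


def build_index(docs, old_ddoc, new_ddoc):
    """Toy updater: returns (passes over the data, documents mapped).

    Assumption: the design doc edit is the last change in the database,
    so the updater only sees it after mapping every data document.
    """
    changes = docs + [new_ddoc]
    active = old_ddoc
    passes = 0
    mapped = 0
    while True:
        passes += 1
        restart = False
        for doc in changes:
            if doc["_id"].startswith("_design/"):
                if signature(doc) != signature(active):
                    # The view code changed: discard the index, start over.
                    active = doc
                    restart = True
                continue
            mapped += 1  # stand-in for running the map function
        if not restart:
            return passes, mapped


docs = [{"_id": "doc%d" % i} for i in range(5)]
typo = {"_id": "_design/d",
        "views": {"v": {"map": "function(doc){ emit(doc.tyop, 1); }"}}}
fixed = {"_id": "_design/d",
         "views": {"v": {"map": "function(doc){ emit(doc.typo, 1); }"}}}

print(build_index(docs, typo, fixed))   # two full passes over the data
print(build_index(docs, fixed, fixed))  # one pass, no reset
```

In this model the corrected design doc costs one complete wasted pass,
which is what the 48 hours amounted to.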

Just using my brain debugger, it seems like it should be fairly easy to
crash (Erlang speak for 'shut down gracefully') any running update
processes. Feel free to create a ticket in JIRA for the feature. Put a
note on it that it'd be a good task for anyone wanting to learn the
view update engine code, as all of that should be mostly toward the
top of the stack.
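
As a rough illustration of what that feature would do, here is a
Python sketch in which an update stops as soon as it is told its view
code is stale. This is an assumption-laden analogy, not the Erlang
implementation: run_update and the cancelled flag are hypothetical
names, and where Erlang would simply kill the updater process, the
sketch has the loop poll a flag cooperatively.

```python
import threading


def run_update(docs, map_fn, cancelled):
    """Map every doc unless cancelled; return rows, or None if abandoned."""
    rows = []
    for doc in docs:
        if cancelled.is_set():
            # The design doc changed: drop the partial work right away
            # instead of finishing the pass with outdated view code.
            return None
        rows.extend(map_fn(doc))
    return rows


def emit_id(doc):
    """Stand-in map function: one (key, value) row per document."""
    return [(doc["_id"], 1)]


docs = [{"_id": "a"}, {"_id": "b"}]

cancelled = threading.Event()
print(run_update(docs, emit_id, cancelled))  # full result

cancelled.set()  # what a design doc edit would trigger
print(run_update(docs, emit_id, cancelled))  # abandoned, returns None
```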

> (The machine I am running this on is running version 0.9.0.  I will
> test something similar in the near future on 0.9.1 machine.)
>

Everything all the way up to trunk will behave the same way as 0.9.0.

> On the plus side, Erlang and CouchDB happily cranked away both loading
> the data and then building the views on my 8-core machine, using up
> lots of the available processor resources (instead of just max-ing out
> one core).  I still wish I could spawn multiple threads per index job,
> but since there's no way I could write that code, I'll wait.
>

The view indexer in trunk now splits view updates over two cores and
is faster than 0.9's indexer. I'm not sure whether this work was
backported to 0.10.

> Regards,
> James Marca
>

Paul Davis
