couchdb-user mailing list archives

From Graham Bull <calzakk+couc...@gmail.com>
Subject Re: Indexes are inaccessible while indexers are running
Date Tue, 01 Nov 2016 21:39:49 GMT
Hi Adam,

Thanks for your reply. I'll have a look at the option when I get a chance,
looks like it could be useful.

I'm a little concerned though. You say the indexes don't update
automatically. What happens when values in indexed fields change?
Presumably the indexes then fall out of date, in which case they must be
rebuilt by hand periodically?

I'm running the latest version, 2.0.0, on a decent desktop machine (i7, 8
cores, 16GB RAM, SSDs). Going forward, our application would run on higher
spec servers.

FYI, the indexers actually took close to 4 hours to do all the indexing,
not the 2.5 I thought it would take. One thing I didn't mention was that
I'd created 4 separate indexes, and it's possible we'd need more. It's
looking likely that CouchDB isn't a good fit for what we're doing, which is
a shame because it has lots of positives.

Graham



On 1 November 2016 at 16:08, Adam Kocoloski <kocolosk@apache.org> wrote:

> Hi Graham, the indexes don’t update automatically, but it is possible to
> prime the indexers by issuing a query to the view at any point during the
> import. One interesting option for you is the "?stale=update_after" flag,
> which will respond with the current state of the view index and trigger a
> background update of the indexes after the fact:
>
> GET /<db>/_design/<ddoc>/_view/<view>?stale=update_after
>
> You could also add a &limit=0 if you’re only interested in priming the
> indexers.
>
> As far as indexing performance is concerned … ~4500 docs/second isn’t
> awesome, but the devil is in the details: how many times is each document
> indexed? Does the server have adequate CPU and IO? Are you running 2.0 or
> one of the 1.x versions?
>
> I can dig up some benchmarks but I’m certain I’ve seen (Linux) systems
> index several times faster than that. I haven’t seen a lot of extensive
> performance testing on Windows though. Cheers,
>
> Adam
>
> > On Oct 31, 2016, at 7:41 AM, Graham Bull <calzakk+couchdb@gmail.com> wrote:
> >
> > Hello,
> >
> > I'm currently evaluating CouchDB (and other NoSQL databases).
> >
> > I have a number of databases of various sizes. After restarting the
> > CouchDB service (I'm on Windows) eight "indexer" tasks started running
> > on the largest database (40 million documents), which was recently
> > imported.
> >
> > After 30 minutes the progress on all tasks is 20%. In the meantime I
> > can't run any queries using the database's indexes. At this rate, it'll
> > take around 2.5 hours to index the entire database.
> >
> > Presumably, when indexes are created, they're initially empty? And the
> > indexer tasks are required to do the actual indexing? If so, then the
> > performance is pretty bad. It took nearly 2 hours to import the 40
> > million records. Add on index creation, and you're looking at 4.5 hours.
> > Without mentioning other relational and NoSQL databases by name, or
> > giving any stats, CouchDB's import and indexing performance is pretty
> > bad in comparison.
> >
> > Is there a way to force the indexers to run immediately after importing
> > the data, and to query the indexing status so that my app can wait until
> > it's completed?
> >
> > Thanks in advance,
> >
> > Graham
>
>
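
For anyone reading this thread later, the two pieces discussed above
(priming a view with ?stale=update_after&limit=0, and waiting until the
indexers have finished) could be sketched roughly as follows. This is a
minimal illustration and not a tested recipe. It assumes a single-node
CouchDB at http://localhost:5984, that /_active_tasks is readable without
credentials (it normally needs admin access), and that an indexer task's
"database" field contains the database name. The function names are made
up for this example.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5984"  # assumption: default local CouchDB address


def prime_url(db, ddoc, view, base=BASE_URL):
    """Build the view URL from the thread: stale=update_after plus limit=0."""
    parts = (db, "_design", ddoc, "_view", view)
    path = "/".join(urllib.parse.quote(p, safe="") for p in parts)
    query = urllib.parse.urlencode({"stale": "update_after", "limit": 0})
    return f"{base}/{path}?{query}"


def indexer_tasks(tasks, db):
    """Keep only the indexer tasks that appear to belong to `db`.

    Assumption: the task's "database" field contains the database name
    (on 2.0 it may be a shard path rather than the bare name, so this is
    a substring match and could also match similarly named databases).
    """
    return [
        t for t in tasks
        if t.get("type") == "indexer" and db in t.get("database", "")
    ]


def wait_for_indexers(db, base=BASE_URL, poll_seconds=10):
    """Poll /_active_tasks until no indexer task for `db` is listed."""
    while True:
        with urllib.request.urlopen(f"{base}/_active_tasks") as resp:
            tasks = json.load(resp)
        if not indexer_tasks(tasks, db):
            return
        time.sleep(poll_seconds)
```

Typical use would be to fetch prime_url(...) once per design document
after the import, then call wait_for_indexers(...). An indexer may take a
moment to show up in /_active_tasks after the priming request, so a short
pause before the first poll is probably sensible.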
