couchdb-user mailing list archives

From Benoit Chesneau <bchesn...@gmail.com>
Subject Re: Mass updates
Date Sat, 11 May 2013 05:02:08 GMT
On May 9, 2013 1:17 PM, "Andrey Kuprianov" <andrey.kouprianov@gmail.com>
wrote:
>
> Rebuilding the views mentioned by James is hell! And the more docs and
> views you have, the longer your views will take to catch up with the
> updates. We don't have the best of servers, but ours (dedicated) took
> several hours to rebuild our views (not too many of them, either) after we
> inserted ~150k documents (we use full text search with Lucene as well, so
> it also contributed to the overall server slowdown).
>
> So my suggestion is:
>
> 1. Once you want to migrate your stuff, make a copy of your db.
> 2. Do migration on the copy
> 3. Allow the views to rebuild (you need to query a single view in each
> design document once to trigger the views to start catching up with the
> updates). You'd probably ask whether it is possible to limit Couch's
> resource usage while views are rebuilding, but I don't have an answer to
> that question. Maybe someone else can help here...
> 4. Switch database pointer from one DB to another.
>
>

You don't need to wait until all the docs are there to trigger the view
update. Just trigger it more often, so the view calculation happens on a
smaller set each time.

You can even make it parallel by using different ddocs.
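
To illustrate the suggestion above, here is a minimal sketch in Python, not a definitive implementation. It assumes the documents are loaded through `_bulk_docs` and that a plain view query with `limit=0` is enough to make the index catch up. The database URL, the design document and view names, and the helpers `http_json`, `chunked` and `load_in_batches` are all hypothetical.

```python
import json
from urllib import request


def http_json(method, url, body=None):
    """Send one JSON request to CouchDB and return the decoded response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = request.Request(url, data=data, method=method,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def load_in_batches(db_url, docs, views, batch_size=1000, send=http_json):
    """Insert docs batch by batch, querying each view after every batch.

    `views` is a list of (design_doc_name, view_name) pairs (hypothetical
    names). Each query only has to index the batch that was just written.
    Returns the number of batches sent.
    """
    batches = 0
    for batch in chunked(docs, batch_size):
        send("POST", f"{db_url}/_bulk_docs", {"docs": batch})
        for ddoc, view in views:
            # limit=0 returns no rows; the request is only there to
            # bring the index up to date.
            send("GET", f"{db_url}/_design/{ddoc}/_view/{view}?limit=0")
        batches += 1
    return batches
```

A call might look like `load_in_batches("http://localhost:5984/mydb", docs, [("app", "by_type")])`. The sketch queries the views one after another; to get the parallel behaviour mentioned above, the queries against different design documents would have to be issued concurrently.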
>
>
> On Thu, May 9, 2013 at 1:41 PM, Paul Davis <paul.joseph.davis@gmail.com>
> wrote:
>
> > On Wed, May 8, 2013 at 10:24 PM, Charles S. Koppelman-Milstein
> > <ckoppel@alumni.gwu.edu> wrote:
> > > I am trying to understand whether Couch is the way to go to meet some
> > > of my organization's needs.  It seems pretty terrific.
> > > The main concern I have is maintaining a consistent state across code
> > > releases.  Presumably, our data model will change over the course of
> > > time, and when it does, we need to make the several million old
> > > documents conform to the new model.
> > >
> > > Although I would love to pipe a view through an update handler and
> > > call it a day, I don't believe that option exists.  The two ways I
> > > understand to do this are:
> > >
> > > 1. Query all documents, update each doc client-side, and POST those
> > > changes to the _bulk_docs API (presumably this should be done in
> > > batches)
> > > 2. Query the ids for all docs, and one at a time, PUT them through an
> > > update handler
> > >
> >
> > You are correct that there's no server-side way to do a migration like
> > the one you're asking for.
> >
> > The general pattern for these things is to write a view that only
> > includes the documents that need to be changed and then write
> > something that goes through and processes each doc in the view to the
> > desired form (which removes it from the view). This way you can easily
> > know when you're done working. It's definitely possible to write
> > something that stores state and/or just brute-forces a db scan each
> > time you run the migration.
> >
> > Performance-wise, your first suggestion would probably be the most
> > performant. Depending on document sizes and latencies, it may be
> > possible to get better numbers using an update handler, but I doubt
> > it unless you have huge docs and a super slow connection with high
> > latencies.
> >
> > > Are these options reasonably performant?  If we have to do a
> > > mass-update once a deployment, it's not terrible if it's not
> > > lightning-speed, but it shouldn't take terribly long.  Also, I have
> > > read that update handlers have indexes built against them.  If this
> > > is a fire-once option, is that worthwhile?
> > >
> >
> > I'm not sure what you mean that update handlers have indexes built
> > against them. That doesn't match anything that currently exists in
> > CouchDB.
> >
> > > Which option is better?  Is there an even better way?
> > >
> >
> > There's nothing better than the general ideas you listed.
> >
> > > Thanks,
> > > Charles
> >
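
The pattern Paul describes in the quoted text (a view that only contains the documents still to be changed, drained in batches through `_bulk_docs`) could be sketched roughly as follows. This is a sketch under assumptions, not a tested migration tool: the `schema_version` field, the view path, the map function and the helpers `http_json` and `migrate_until_empty` are hypothetical, and conflicts are not retried.

```python
import json
from urllib import request

# Hypothetical map function for a design document: it emits only the
# docs that have not been migrated yet, so a migrated doc drops out.
PENDING_MAP = """
function (doc) {
  if (!doc.schema_version || doc.schema_version < 2) {
    emit(doc._id, null);
  }
}
"""


def http_json(method, url, body=None):
    """Send one JSON request to CouchDB and return the decoded response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = request.Request(url, data=data, method=method,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def migrate_until_empty(db_url, view_path, upgrade, batch_size=500,
                        send=http_json):
    """Drain the 'pending' view in batches; return how many docs were saved.

    `upgrade` takes an old doc (including _id and _rev) and returns the
    new form, which must no longer match the view's map function.
    """
    migrated = 0
    while True:
        result = send(
            "GET",
            f"{db_url}/{view_path}?include_docs=true&limit={batch_size}")
        rows = result["rows"]
        if not rows:
            return migrated  # view is empty: the migration is done
        docs = [upgrade(row["doc"]) for row in rows]
        results = send("POST", f"{db_url}/_bulk_docs", {"docs": docs})
        saved = [r for r in results if "error" not in r]
        if not saved:
            # Nothing was written, so the next query would return the
            # same rows again; stop instead of looping forever.
            raise RuntimeError("no document in the batch could be saved")
        migrated += len(saved)
```

Because finished documents fall out of the view, an interrupted run can simply be restarted and will pick up where it stopped. Note that each view query also has to index the documents written by the previous batch.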
