couchdb-user mailing list archives

From Andrey Kuprianov <andrey.koupria...@gmail.com>
Subject Re: Mass updates
Date Thu, 09 May 2013 12:16:42 GMT
Regarding CPU usage limiting: I've just tried cpulimit and it works great.

http://superuser.com/questions/442970/limit-a-processes-cpu-usage-methods
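
For anyone who wants to script this, here is a minimal sketch of building a cpulimit invocation from Python. It assumes cpulimit is installed and relies on its -p (target PID) and -l (CPU percentage) options. The function name and the PID in the usage note are hypothetical, and finding the PID of the CouchDB process is left to the reader.

```python
from typing import List


def cpulimit_command(pid: int, percent: int) -> List[str]:
    """Build a cpulimit command line that caps an already-running process.

    Only the argument list is built here; running it is left to the caller,
    so the sketch works even where cpulimit is not installed.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    if percent <= 0:
        raise ValueError("percent must be a positive integer")
    # -p selects the target process by PID, -l sets the allowed CPU percentage.
    return ["cpulimit", "-p", str(pid), "-l", str(percent)]
```

The resulting list can be passed to subprocess.Popen, for example subprocess.Popen(cpulimit_command(12345, 50)), where 12345 stands in for the real PID of the process you want to throttle.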


On Thu, May 9, 2013 at 7:18 PM, Robert Newson <rnewson@apache.org> wrote:

>
> http://wiki.apache.org/couchdb/How_to_deploy_view_changes_in_a_live_environment
>
>
> On 9 May 2013 12:16, Andrey Kuprianov <andrey.kouprianov@gmail.com> wrote:
> > Rebuilding the views mentioned by James is hell! And the more docs and
> > views you have, the longer your views will take to catch up with the
> > updates. We don't have the best of servers, but ours (dedicated) took
> > several hours to rebuild our views (and there aren't many of them) after
> > we inserted ~150k documents (we use full-text search with Lucene as well,
> > so it also contributed to the overall server slowdown).
> >
> > So my suggestion is:
> >
> > 1. Once you want to migrate your stuff, make a copy of your db.
> > 2. Do the migration on the copy.
> > 3. Allow the views to rebuild (you need to query a single view in each
> > design document once to trigger the views to start catching up with the
> > updates). You'd probably ask whether it is possible to limit Couch's
> > resource usage while views are rebuilding, but I don't have an answer to
> > that question. Maybe someone else can help here...
> > 4. Switch the database pointer from one DB to the other.
> >
> >
> >
> >
> > On Thu, May 9, 2013 at 1:41 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> >
> >> On Wed, May 8, 2013 at 10:24 PM, Charles S. Koppelman-Milstein
> >> <ckoppel@alumni.gwu.edu> wrote:
> >> > I am trying to understand whether Couch is the way to go to meet some
> >> > of my organization's needs.  It seems pretty terrific.
> >> > The main concern I have is maintaining a consistent state across code
> >> > releases.  Presumably, our data model will change over the course of
> >> > time, and when it does, we need to make the several million old
> >> > documents conform to the new model.
> >> >
> >> > Although I would love to pipe a view through an update handler and
> >> > call it a day, I don't believe that option exists.  The two ways I
> >> > understand to do this are:
> >> >
> >> > 1. Query all documents, update each doc client-side, and POST those
> >> > changes to the _bulk_docs API (presumably this should be done in
> >> > batches)
> >> > 2. Query the ids for all docs, and one at a time, PUT them through an
> >> > update handler
> >> >
> >>
> >> You are correct that there's no server-side way to do a migration like
> >> you're asking for.
> >>
> >> The general pattern for these things is to write a view that only
> >> includes the documents that need to be changed and then write
> >> something that goes through and processes each doc in the view to the
> >> desired form (which removes it from the view). This way you can easily
> >> know when you're done working. It's also definitely possible to write
> >> something that stores state and/or just brute-forces a db scan each
> >> time you run the migration.
> >>
> >> Performance-wise, your first suggestion would probably be the fastest.
> >> Depending on document sizes and latencies it may be possible to get
> >> better numbers using an update handler, but I doubt it unless you have
> >> huge docs and a super slow connection with high latencies.
> >>
> >> > Are these options reasonably performant?  If we have to do a
> >> > mass-update once per deployment, it's not terrible if it's not
> >> > lightning-speed, but it shouldn't take terribly long.  Also, I have
> >> > read that update handlers have indexes built against them.  If this is
> >> > a fire-once option, is that worthwhile?
> >> >
> >>
> >> I'm not sure what you mean that update handlers have indexes built
> >> against them. That doesn't match anything that currently exists in
> >> CouchDB.
> >>
> >> > Which option is better?  Is there an even better way?
> >> >
> >>
> >> There's nothing better than the general ideas you listed.
> >>
> >> > Thanks,
> >> > Charles
> >>
>
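
The pattern Paul describes in the quoted text (a view that lists only the documents still needing migration, drained in batches through _bulk_docs) could be sketched roughly as follows. This is only an illustration under stated assumptions: the design document name "migration", the view name "pending", the "schema_version" field and the migrate() transformation are all hypothetical, and error handling (conflicts, retries, authentication) is left out.

```python
import json
from typing import Dict, Iterable, List
from urllib.request import Request, urlopen

# Hypothetical design document. The map function (JavaScript, stored as a
# string) emits only documents that still lack the assumed "schema_version"
# field, so migrated documents drop out of the view on their own.
DESIGN_DOC = {
    "_id": "_design/migration",
    "views": {
        "pending": {
            "map": "function (doc) { if (!doc.schema_version) { emit(doc._id, null); } }"
        }
    },
}


def migrate(doc: Dict) -> Dict:
    """Return a copy of doc in the new shape (placeholder transformation)."""
    updated = dict(doc)  # keeps _id and _rev, which _bulk_docs needs for updates
    updated["schema_version"] = 2
    return updated


def bulk_payload(docs: Iterable[Dict]) -> bytes:
    """Encode migrated documents as a _bulk_docs request body."""
    return json.dumps({"docs": [migrate(d) for d in docs]}).encode("utf-8")


def post_json(url: str, body: bytes) -> List[Dict]:
    """POST a JSON body and return the decoded JSON response."""
    request = Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)


def run_migration(db_url: str, batch_size: int = 500) -> None:
    """Drain the pending view batch by batch until it is empty.

    Assumes DESIGN_DOC has already been saved to the database at db_url.
    If migrate() does not remove a document from the view, this loops forever.
    """
    view_url = (
        f"{db_url}/_design/migration/_view/pending"
        f"?include_docs=true&limit={batch_size}"
    )
    while True:
        with urlopen(view_url) as response:
            rows = json.load(response)["rows"]
        if not rows:
            break  # nothing left to migrate
        post_json(f"{db_url}/_bulk_docs", bulk_payload(row["doc"] for row in rows))
```

Run against a copy of the database, as Andrey suggests in the quoted text, this touches only the copy. The batch size of 500 is an arbitrary starting point and not a recommendation from the thread.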
