From: Robert Newson
To: "user@couchdb.apache.org"
Date: Thu, 9 May 2013 12:18:54 +0100
Subject: Re: Mass updates

http://wiki.apache.org/couchdb/How_to_deploy_view_changes_in_a_live_environment

On 9 May 2013 12:16, Andrey Kuprianov wrote:
> Rebuilding the views mentioned by James is hell! And the more docs and
> views you have, the longer your views will take to catch up with the
> updates. We don't have the best of servers, but our dedicated one took
> several hours to rebuild our views (not too many of them, either) after
> we inserted ~150k documents (we use full-text search with Lucene as
> well, so it also contributed to the overall server slowdown).
>
> So my suggestion is:
>
> 1. Once you want to migrate your stuff, make a copy of your db.
> 2. Do the migration on the copy.
> 3. Allow the views to rebuild (you need to query a single view in each
> design document once to trigger the views to start catching up with the
> updates). You'd probably ask whether it is possible to limit CouchDB's
> resource usage while views are rebuilding, but I don't have an answer
> to that question. Maybe someone else can help here...
> 4. Switch the database pointer from one DB to the other.
>
>
> On Thu, May 9, 2013 at 1:41 PM, Paul Davis wrote:
>
>> On Wed, May 8, 2013 at 10:24 PM, Charles S. Koppelman-Milstein wrote:
>> > I am trying to understand whether Couch is the way to go to meet some of
>> > my organization's needs. It seems pretty terrific.
>> > The main concern I have is maintaining a consistent state across code
>> > releases. Presumably, our data model will change over the course of
>> > time, and when it does, we need to make the several million old
>> > documents conform to the new model.
>> >
>> > Although I would love to pipe a view through an update handler and call
>> > it a day, I don't believe that option exists.
>> > The two ways I
>> > understand to do this are:
>> >
>> > 1. Query all documents, update each doc client-side, and PUT the
>> > changes via the _bulk_docs API (presumably this should be done in batches).
>> > 2. Query the ids of all docs and, one at a time, PUT them through an
>> > update handler.
>> >
>>
>> You are correct that there's no server-side way to do a migration like
>> the one you're asking for.
>>
>> The general pattern for these things is to write a view that only
>> includes the documents that need to be changed, and then write
>> something that goes through and processes each doc in the view into the
>> desired form (which removes it from the view). This way you can easily
>> know when you're done working. It's definitely possible to write
>> something that stores state and/or just brute-forces a db scan each
>> time you run the migration.
>>
>> Performance-wise, your first suggestion would probably be the most
>> performant, although depending on document sizes and latencies it may
>> be possible to get better numbers using an update handler. I doubt it,
>> though, unless you have huge docs and a super slow connection with
>> high latencies.
>>
>> > Are these options reasonably performant? If we have to do a mass update
>> > once per deployment, it's not terrible if it's not lightning-speed, but it
>> > shouldn't take terribly long. Also, I have read that update handlers
>> > have indexes built against them. If this is a fire-once option, is that
>> > worthwhile?
>> >
>>
>> I'm not sure what you mean by update handlers having indexes built
>> against them. That doesn't match anything that currently exists in
>> CouchDB.
>>
>> > Which option is better? Is there an even better way?
>> >
>>
>> There's nothing better than the general ideas you listed.
>>
>> > Thanks,
>> > Charles
>>
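
The migration pattern Paul describes (a view that emits only unmigrated docs, drained in batches through _bulk_docs) can be sketched roughly as below. This is a minimal illustration, not a drop-in tool: the database name `mydb`, the design document `_design/migrations`, the view `needs_migration`, and the `schema_version` field are all hypothetical names chosen for the example; substitute your own model change in `migrate_doc`.

```python
import json
import urllib.request

COUCH = "http://localhost:5984/mydb"  # hypothetical database URL
VIEW = COUCH + "/_design/migrations/_view/needs_migration"
BATCH_SIZE = 500

# Hypothetical map function for the needs_migration view, kept in
# _design/migrations -- it emits only docs the migration hasn't touched:
#   function (doc) { if (!doc.schema_version) { emit(doc._id, null); } }

def migrate_doc(doc):
    """Example transformation: stamp the doc with the new schema version.
    Replace with your real model change; once applied, the doc must no
    longer appear in the needs_migration view."""
    doc = dict(doc)
    doc["schema_version"] = 2
    return doc

def chunks(seq, n):
    """Yield successive n-sized batches from seq."""
    for i in range(0, len(seq), n):
        yield seq[i:i + n]

def fetch_view_docs():
    """Fetch every doc still needing migration (include_docs=true)."""
    with urllib.request.urlopen(VIEW + "?include_docs=true") as resp:
        rows = json.load(resp)["rows"]
    return [row["doc"] for row in rows]

def post_bulk(docs):
    """POST one batch of updated docs to _bulk_docs."""
    body = json.dumps({"docs": docs}).encode()
    req = urllib.request.Request(
        COUCH + "/_bulk_docs", data=body,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def run_migration():
    # Re-querying the view after each pass is the "know when you're
    # done" property from the thread: migrated docs drop out of it.
    docs = fetch_view_docs()
    while docs:
        for batch in chunks([migrate_doc(d) for d in docs], BATCH_SIZE):
            post_bulk(batch)
        docs = fetch_view_docs()
```

Calling `run_migration()` against a live server loops until the view is empty; docs whose update conflicts (another writer bumped `_rev`) simply reappear in the view on the next pass and are retried.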