incubator-couchdb-user mailing list archives

From Tim Tisdall <tisd...@gmail.com>
Subject Re: dropping revision records
Date Thu, 06 Sep 2012 16:07:28 GMT
I could use purge to remove a document and then rewrite it back into
the db with the same id.  However, I don't like the idea of it
causing the views to be regenerated from scratch (each one I have
takes almost an hour to build).  Since I already have enough room for
a second copy of the DB (I need that headroom for compaction anyway),
it makes more sense to create a new db while leaving the old one up
and running.  Then, once everything is in the new db, I can drop the
old one and replace it with the new one.
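
Roughly what I have in mind, as an untested sketch (assuming Python
with the requests library; the db names are placeholders):

import json
import requests

SOURCE = "http://127.0.0.1:5984/olddb"   # placeholder names
TARGET = "http://127.0.0.1:5984/newdb"

requests.put(TARGET)  # create the target db (409 if it already exists)

# Page through _all_docs so 15 million rows never sit in memory at once.
BATCH = 1000
startkey = None
while True:
    params = {"include_docs": "true", "limit": BATCH + 1}
    if startkey is not None:
        params["startkey"] = json.dumps(startkey)
    rows = requests.get(SOURCE + "/_all_docs", params=params).json()["rows"]
    page, rest = rows[:BATCH], rows[BATCH:]
    docs = []
    for row in page:
        doc = row["doc"]
        doc.pop("_rev", None)  # drop the old revision; the copy restarts at 1-
        docs.append(doc)
    if docs:
        requests.post(TARGET + "/_bulk_docs",
                      data=json.dumps({"docs": docs}),
                      headers={"Content-Type": "application/json"})
    if not rest:
        break
    startkey = rest[0]["id"]  # resume from the first row of the next page

Dropping _rev before the _bulk_docs POST is what resets every copy to
a "1-" revision while keeping the original _id.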

My site isn't live at the moment, but once it is, the db needs to
stay up and running for as long as the site does (so, as close to
always as possible).  I also need to preserve the document ids.
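
The catch-up pass I described in my earlier message (quoted below)
might look something like this; again untested, same assumptions as
above:

import json
import requests

SOURCE = "http://127.0.0.1:5984/olddb"
TARGET = "http://127.0.0.1:5984/newdb"

# Capture the sequence number BEFORE the bulk copy starts.
since = requests.get(SOURCE).json()["update_seq"]

# ... run the bulk copy from the sketch above ...

changes = requests.get(SOURCE + "/_changes",
                       params={"since": since, "include_docs": "true"}).json()
for change in changes["results"]:
    if change.get("deleted"):
        continue  # deletions would need their own handling
    doc = change["doc"]
    doc.pop("_rev", None)
    url = TARGET + "/" + doc["_id"]
    # If the copy already wrote this doc, reuse its current rev so the
    # PUT doesn't conflict.
    head = requests.head(url)
    if head.status_code == 200:
        doc["_rev"] = head.headers["ETag"].strip('"')
    requests.put(url, data=json.dumps(doc),
                 headers={"Content-Type": "application/json"})

Grabbing update_seq before the copy starts is the important part, so
anything that changes mid-copy gets picked up on the second pass.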

On Thu, Sep 6, 2012 at 11:26 AM, Nathan Vander Wilt
<nate-lists@calftrail.com> wrote:
>
> On Sep 6, 2012, at 7:18 AM, Tim Tisdall wrote:
>
>> I had a database of about 10.8 GB with almost 15 million records,
>> fully compacted.  I backed it up by dumping all the JSON and then
>> restored it by inserting everything back in.  After it was done and
>> I compacted it, the database was only 8.8 GB!  I shed 2 GB by
>> dropping the revision stubs that were still in the database, likely
>> because each record had about 6 revisions (so around 90 million
>> stubs).  All of this is understandable, but 2 GB isn't really
>> negligible when running on a virtualized instance with 35 GB of
>> disk.  The problem, though, is that the method I used to dump to
>> JSON and load it back into couchdb took almost 12 hrs!
>>
>> Is there a way to drop all of the revision stubs and reset the
>> documents' revision tags back to "1-" values?  I know this would
>> completely break any kind of replication, but in this instance I'm
>> not doing any.
>>
>> The best method I can think of is to insert each record into a new
>> DB (not through replication, though, because that carries the stubs
>> over with it), then go through _changes from the point I started
>> and recopy those documents to make sure everything is up to date.
>> This would save me from having things down for 12 hrs, but I have
>> no idea how long this process would take.
>>
>> Suggestions?
>
> You may find http://wiki.apache.org/couchdb/Purge_Documents
> interesting; however, since it can only purge from leaf nodes you
> may still need some creative application, and I'm not sure what
> you'd gain over a scripted copy to a different database.  What are
> your uptime/consistency needs?  Must the document ids be preserved?
>
> hth,
> -nvw
>
>
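
(For completeness, the _purge call on that wiki page is a POST mapping
each doc id to the list of leaf revisions to remove; a rough, untested
sketch in the same vein as above, with a placeholder id and rev:)

import json
import requests

DB = "http://127.0.0.1:5984/mydb"  # placeholder

resp = requests.post(DB + "/_purge",
                     data=json.dumps({"some_doc_id": ["6-abc123"]}),
                     headers={"Content-Type": "application/json"})
print(resp.json())  # e.g. {"purge_seq": ..., "purged": {"some_doc_id": [...]}}

As noted at the top, though, purging forces the views that referenced
the doc to be rebuilt, and mine take almost an hour each.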
