incubator-couchdb-user mailing list archives

From Tim Tisdall <tisd...@gmail.com>
Subject Re: Compaction Best Practices
Date Thu, 14 Jun 2012 13:08:27 GMT
I think he's suggesting avoiding compaction completely.  Just delete
the old DB when you've finished deleting all the records.
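
A minimal sketch of that rolling scheme, under stated assumptions: the
server address, the "txlogs" prefix and the one-database-per-day naming
are hypothetical, and the helper names are made up for illustration. It
only uses the plain CouchDB HTTP calls PUT /{db}, GET /{db} (which
reports doc_count) and DELETE /{db}.

```python
import json
import urllib.error
import urllib.request
from datetime import date

COUCH_URL = "http://localhost:5984"  # assumption: default local CouchDB address


def db_name_for(day, prefix="txlogs"):
    """Hypothetical naming scheme: one database per day, e.g. txlogs_2012_06_14."""
    return f"{prefix}_{day:%Y_%m_%d}"


def is_drained(db_info):
    """True when the GET /{db} info document reports doc_count: 0."""
    return db_info.get("doc_count") == 0


def couch(method, path):
    """Send a body-less request to CouchDB and decode the JSON reply."""
    req = urllib.request.Request(f"{COUCH_URL}/{path}", method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def roll(today, old_days):
    """Create today's database, then drop older databases that are empty."""
    try:
        couch("PUT", db_name_for(today))      # PUT /{db} creates the database
    except urllib.error.HTTPError as err:
        if err.code != 412:                   # 412 means it already exists
            raise
    for day in old_days:
        name = db_name_for(day)
        if is_drained(couch("GET", name)):    # GET /{db} returns doc_count
            couch("DELETE", name)             # DELETE /{db} removes the database
```

Something like roll(date.today(), previous_days) could run from a daily
cron job, with writers picking the target database via db_name_for.
Dropping a whole database this way avoids compacting it at all.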

On Thu, Jun 14, 2012 at 9:05 AM, Nicolas Peeters <nicolists@gmail.com> wrote:
> Interesting suggestion. However, this would perhaps have the same effect
> (deleting/compacting the old DB is what makes the system slower)...?
>
> On Thu, Jun 14, 2012 at 2:54 PM, Robert Newson <rnewson@apache.org> wrote:
>
>> Do you eventually delete every document you add?
>>
>> If so, consider using a rolling database scheme instead. At some
>> point, perhaps daily, start a new database and write new transaction
>> logs there. Continue deleting old logs from the previous database(s)
>> until they're empty (doc_count:0) and then delete the database.
>>
>> B.
>>
>> On 14 June 2012 13:44, Nicolas Peeters <nicolists@gmail.com> wrote:
>> > I'd like some advice from the community regarding compaction.
>> >
>> > *Scenario:*
>> >
>> > We have a large-ish CouchDB database that is being used for transactional
>> > logs (very write heavy). Once in a while, we delete some of the records in
>> > large batches, and we have scheduled compaction (not automatic (yet)) every
>> > 12 hours.
>> >
>> > From what I can see, the DB is being hammered significantly every 12 hours
>> > and the compaction is taking 4 hours (with a size of 50-100GB of log data).
>> >
>> > *The problem:*
>> >
>> > The problem is that compaction takes a very long time and reduces the
>> > performance of the stack. It seems that it's hard for the compaction
>> > process to "keep up" with the insertions, hence why it takes so long. Also,
>> > what I'm not sure about is how "incremental" the compaction is...
>> >
>> >   1. In this case, would it make sense to run the compaction more often
>> >   (every 10 minutes), since we're write-heavy?
>> >      1. Should we just run more often? (so hopefully it doesn't do
>> >      unnecessary work too often). Actually, in our case, we should probably
>> >      never have automatic compaction if there has been no "termination".
>> >      2. Or actually only once in a while? (bigger batch, but less
>> >      "useless" overhead)
>> >      3. Or should we just wait until a given size (which is the problem
>> >      really) is hit and use the auto compaction (in CouchDB 1.2.0) for this?
>> >   2. In CouchDB 1.2.0 there's a new feature: auto compaction
>> >   <http://wiki.apache.org/couchdb/Compaction#Automatic_Compaction>, which
>> >   may be useful for us. There's the "strict_window" feature to give a max
>> >   amount of time to compact and cancel the compaction after that (in order
>> >   not to have it running for 4h+…). I'm wondering what the impact of that is
>> >   in the long run. What if the compaction cannot be completed in that window?
>> >
>> > Thanks a lot!
>> >
>> > Nicolas
>>
