incubator-couchdb-user mailing list archives

From Nicolas Peeters <nicoli...@gmail.com>
Subject Compaction Best Practices
Date Thu, 14 Jun 2012 12:44:26 GMT
I'd like some advice from the community regarding compaction.

*Scenario:*

We have a large-ish CouchDB database that is used for transactional
logs (very write-heavy). Once in a while, we delete some of the records in
large batches, and we run a scheduled (not yet automatic) compaction every
12 hours.

From what I can see, the DB is hammered significantly every 12 hours,
and the compaction takes 4 hours (with 50-100 GB of log data).
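
For reference, a scheduled run like the one described can be started over CouchDB's HTTP API with `POST /{db}/_compact`, and the `compact_running` field of `GET /{db}` shows whether it is still in progress. The following is only a minimal sketch: the base URL, the database name `logs`, and the helper names are assumptions made for illustration, and authentication is left out.

```python
import json
import urllib.parse
import urllib.request


def build_compaction_request(base_url, db_name):
    """Build (but do not send) the request that starts database compaction."""
    db = urllib.parse.quote(db_name, safe="")
    url = f"{base_url.rstrip('/')}/{db}/_compact"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; CouchDB still expects the JSON content type
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def trigger_compaction(base_url, db_name):
    """Send the compaction request and return CouchDB's decoded JSON reply."""
    with urllib.request.urlopen(build_compaction_request(base_url, db_name)) as resp:
        return json.load(resp)


def compaction_running(base_url, db_name):
    """Read the database info document and report its compact_running flag."""
    db = urllib.parse.quote(db_name, safe="")
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/{db}") as resp:
        return bool(json.load(resp).get("compact_running", False))
```

A cron job (or similar) could call `trigger_compaction("http://localhost:5984", "logs")` and poll `compaction_running` to measure how long each run really takes.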

*The problem:*

Compaction takes a very long time and degrades the performance of the
stack. It seems hard for the compaction process to "keep up" with the
insertions, which would explain why it takes so long. I'm also not sure
how "incremental" the compaction is...

   1. In this case, would it make sense to run the compaction more often
   (every 10 minutes), since we're write-heavy?
      1. Should we just run it more often? (so hopefully it doesn't do
      unnecessary work too often). Actually, in our case, we should probably
      never have automatic compaction if there has been no "termination".
      2. Or actually only once in a while? (bigger batch, but less
      "useless" overhead)
      3. Or should we just wait until a given size (which is really the
      problem) is hit and use the auto compaction (in CouchDB 1.2.0) for this?
   2. In CouchDB 1.2.0 there's a new feature, auto compaction
   <http://wiki.apache.org/couchdb/Compaction#Automatic_Compaction>, which
   may be useful for us. There's the "strict_window" option to give a max
   amount of time to compact and cancel the compaction after that (in order
   not to have it running for 4h+...). I'm wondering what the impact of that
   is in the long run. What if the compaction cannot be completed in that
   window?
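
To make the "wait until a given size is hit" option concrete, here is a rough sketch of the kind of check involved. It assumes the database info document exposes `disk_size` and `data_size` (as CouchDB 1.2.0 does, to my knowledge) and uses the fragmentation ratio described on the wiki page, (file size - data size) / file size * 100. The 70% threshold, the window times, and the function names are made up for the example; this is not the compaction daemon's actual code.

```python
from datetime import time


def fragmentation_percent(disk_size, data_size):
    """Share of the database file (in percent) not occupied by live data."""
    if disk_size <= 0:
        return 0.0
    return (disk_size - data_size) * 100.0 / disk_size


def should_compact(db_info, threshold_percent=70.0):
    """True once fragmentation reaches the (assumed) threshold."""
    frag = fragmentation_percent(db_info["disk_size"], db_info["data_size"])
    return frag >= threshold_percent


def in_window(now, start, end):
    """True if `now` falls in [start, end), allowing windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


# Hypothetical numbers: an 80 GB file holding 20 GB of live data.
info = {"disk_size": 80 * 1024**3, "data_size": 20 * 1024**3}
print(fragmentation_percent(info["disk_size"], info["data_size"]))  # 75.0
print(should_compact(info))                                         # True
print(in_window(time(23, 30), time(23, 0), time(4, 0)))             # True
```

With a check like this, a run would only start when fragmentation is high and the clock is inside the allowed window, which is roughly the trade-off the questions above are about.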

Thanks a lot!

Nicolas
