incubator-couchdb-user mailing list archives

From John Merrells <j...@merrells.com>
Subject Re: how to tell whether a couchdb database is worth compacting?
Date Sun, 16 May 2010 04:12:12 GMT

On May 15, 2010, at 12:40 PM, Rachel Willmer wrote:

> Last night, I watched a _compact command take 1 hour 40 minutes to
> turn a 38GB couchdb file into a 36GB couchdb file.
> 
> Is there any way of determining whether a compaction is worth
> starting, before you issue the command to do so?
> 
> Also, is there any way of determining what the last time was at which
> a compaction occurred?


I've had to deal with the same problem. Three things I've done to mitigate it:

1) Sharded each database into 64 buckets, so that I only ever have to
compact small databases.

2) Use an external process to keep track of compaction. It kicks off
compaction of one database, waits until it finishes, then kicks off
the next.

3) I keep track of the state that comes back from the database URL and
then estimate how much bloat is in there, based on the update count
and average document size. It's a rough estimate, but it seems to be
working.
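
Point 2 could look roughly like the sketch below. It is not the actual script, just an illustration that assumes the standard CouchDB HTTP API (POST /{db}/_compact to start, then polling "compact_running" on the database URL). BASE_URL, the function names and the polling interval are all made up for the example.

```python
import json
import time
import urllib.request

# Assumption: adjust host and port for your server
# (CouchDB listens on 5984 by default).
BASE_URL = "http://localhost:5984"


def get_info(db):
    """Fetch the database info document (GET /{db})."""
    with urllib.request.urlopen(f"{BASE_URL}/{db}") as resp:
        return json.load(resp)


def start_compaction(db):
    """Ask CouchDB to start compacting one database (POST /{db}/_compact)."""
    req = urllib.request.Request(
        f"{BASE_URL}/{db}/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def compact_in_sequence(dbs, info_fn=get_info, start_fn=start_compaction,
                        poll_seconds=5.0, sleep=time.sleep):
    """Compact databases one at a time, waiting for each to finish."""
    for db in dbs:
        start_fn(db)
        # Poll until the server reports that compaction is no longer running.
        while info_fn(db).get("compact_running"):
            sleep(poll_seconds)
```

Calling compact_in_sequence(["my_db_00", "my_db_01"]) (hypothetical shard names) would then work through the shards without ever running two compactions at once.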

For example, this is the state that comes back from the database URL:
curl -X GET http://localhost/my_db
#=> {"db_name":"my_db", "doc_count":1, "doc_del_count":1, "update_seq":4, "purge_seq":0,
"compact_running":false, "disk_size":12377, "instance_start_time":"1267612389906234", "disk_format_version":5}
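
One way to turn those fields into a number, as a rough sketch and not the exact calculation from point 3: treat every update that is not the current revision of a live or deleted document as stale, and assume all revisions are about the same size. The function names and the 0.3 threshold are invented for illustration, and "update_seq" is assumed to be an integer as in the output shown here (later CouchDB releases report it as an opaque string).

```python
import json


def estimate_bloat(info):
    """Guess the fraction of the file taken up by superseded revisions."""
    update_seq = int(info["update_seq"])
    if update_seq <= 0:
        return 0.0
    # Current revisions that compaction would keep (live docs + tombstones).
    live = info["doc_count"] + info["doc_del_count"]
    stale = max(update_seq - live, 0)
    return stale / update_seq


def worth_compacting(info, min_fraction=0.3):
    """Hypothetical rule: compact when enough of the file looks stale."""
    if info.get("compact_running"):
        return False  # a compaction is already in progress
    return estimate_bloat(info) >= min_fraction


# The response shown above, used as sample input.
sample = json.loads(
    '{"db_name":"my_db", "doc_count":1, "doc_del_count":1, "update_seq":4,'
    ' "purge_seq":0, "compact_running":false, "disk_size":12377,'
    ' "instance_start_time":"1267612389906234", "disk_format_version":5}'
)
print(estimate_bloat(sample), worth_compacting(sample))
```

For the sample above, 2 of the 4 updates are current, so the estimate is that about half the file (roughly 6 KB of the 12377 bytes) could be reclaimed.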

John

-- 
John Merrells
http://johnmerrells.com
+1.415.244.5808






