couchdb-dev mailing list archives

From: Nick North
Subject: Re: Replicated database size
Date: Wed, 16 May 2012 14:49:09 GMT

Following up on my own email, this seems to be an issue with snappy on
Windows Server 2008. When I changed the file_compression setting to
deflate_6, the "large" databases went down from 7GB to 1GB after
compaction. I'm not entirely sure if this counts as a bug so I won't raise
an issue on it.
By the way: kudos to whoever wrote the code to deal with file_compression.
When I changed file_compression to deflate_6, the system happily worked
with the existing, supposedly snappy-compressed databases, and converted
format on the next compaction. That could have gone wrong in several ways,
but didn't, so thank you.
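
For anyone wanting to try the same change without editing the ini file by hand, here is a minimal Python sketch of the two HTTP requests involved. It assumes the CouchDB 1.x /_config API, a server at http://localhost:5984 (an assumption, not something from my setup notes), and the database name "hydra" from the stats quoted below. The helper names config_request and compact_request are hypothetical, and admin credentials would be needed on a server that has admins defined.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # assumed server address


def config_request(section, key, value, base_url=BASE_URL):
    """Build a PUT /_config/<section>/<key> request (CouchDB 1.x config API)."""
    return urllib.request.Request(
        f"{base_url}/_config/{section}/{key}",
        data=json.dumps(value).encode("utf-8"),  # the value is sent as a JSON string
        method="PUT",
    )


def compact_request(db, base_url=BASE_URL):
    """Build a POST /<db>/_compact request with a JSON content type."""
    return urllib.request.Request(
        f"{base_url}/{db}/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Build (but do not send) the two requests and show what they would do.
for req in (config_request("couchdb", "file_compression", "deflate_6"),
            compact_request("hydra")):
    print(req.get_method(), req.full_url)
```

Passing each request to urllib.request.urlopen would actually send it. The sketch only builds and prints the requests so that it can be read and checked without a running server.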

On 15 May 2012 13:55, Nick North wrote:

> I'm curious about the size of replicated CouchDB databases in comparison
> to each other. I have four databases, each with pull replications from the
> other three, but they report quite different data sizes. Two of them say:
> {"db_name":"hydra","doc_count":1489060,"doc_del_count":2754893,"update_seq":6998882,"purge_seq":0,"compact_running":false,"disk_size":3213656193,"data_size":1395943755,"instance_start_time":"1337067567481841","disk_format_version":6,"committed_update_seq":6998882}
> While the other two say this - note the difference in data_size:
> {"db_name":"hydra","doc_count":1489441,"doc_del_count":2755302,"update_seq":4375865,"purge_seq":0,"compact_running":false,"disk_size":7599413027,"data_size":7265993199,"instance_start_time":"1337014746154865","disk_format_version":6,"committed_update_seq":4375865}
> (There is some discrepancy in the doc_count because new documents are being posted continuously,
> and some were added in between fetching stats for the various instances.) Other possibly relevant
>    - All the replications appear to be in working order, so I don't believe there is a
>      backlog of documents waiting to be replicated.
>    - The database has just one design view, and whether or not it has been queried does
>      not seem to make any difference to whether the database is "large" or "small".
>    - Compaction makes little difference, in that the "large" instances always remain
>      much larger than the "small" ones.
>    - Everything is running CouchDB 1.2 on Windows: the "small" instances on Windows 7
>      and Windows Vista, and the "large" ones on Windows Server 2008.
>    - file_compression is set to "snappy" in all cases and there are no attachments anywhere.
> Can anyone suggest what might be going on here? My best guess is that it's to do with
> file compression on Windows Server, but that is a guess, so I'm intending to do some experimentation
> with the other file compression options. I'd be grateful for any thoughts, as I'm planning
> out disk requirements for a system with ten times the capacity of the current one, and would
> very much like to do that with some certainty about file sizes. Thanks in advance for any
> Nick North
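
The two stats documents quoted above can be put side by side with a few lines of Python. This is only a sketch for reading the numbers: the summarize helper and the live_fraction name are hypothetical, and only the disk_size and data_size values are copied from the quoted output.

```python
def summarize(info):
    """Return sizes in GB plus data_size as a fraction of disk_size."""
    gb = 1e9
    return {
        "disk_gb": round(info["disk_size"] / gb, 2),
        "data_gb": round(info["data_size"] / gb, 2),
        "live_fraction": round(info["data_size"] / info["disk_size"], 2),
    }


# disk_size and data_size copied from the two quoted responses
small = {"disk_size": 3213656193, "data_size": 1395943755}
large = {"disk_size": 7599413027, "data_size": 7265993199}

print("small:", summarize(small))
print("large:", summarize(large))
# How many times bigger the "large" data_size is than the "small" one
print("data_size ratio:", round(large["data_size"] / small["data_size"], 2))
```

On these figures the "large" instances report a data_size roughly five times that of the "small" ones, even though the document counts are almost identical.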
