couchdb-user mailing list archives

From Calle Arnesten <calle.arnes...@codekick.com>
Subject Re: Bug? Database compaction keeps re-running continuously on CouchDB 1.4
Date Mon, 07 Oct 2013 16:48:54 GMT
Thanks Paul! Then I will look forward to the compaction improvements in the Nebraska branch,
as well as the other BigCouch stuff.

/Calle

On Mon, Oct 7, 2013, at 13:19, Paul J Davis wrote:
> IIRC we're not exactly right on the free space calculation but more importantly we also
generate garbage while compacting. Specifically the id_tree updates cause a lot of fragmentation
when docs are updated in a random order.
> 
> The compactor on the Nebraska-merge branch was rewritten to avoid this and was a significant
improvement in many cases. 
> 
> > On Oct 5, 2013, at 9:33 AM, Calle Arnesten <calle.arnesten@codekick.com> wrote:
> > 
> > Robert, thanks for your reply. 
> > 
> > I wasn't aware of the database footers, and I can understand that an endless
compaction could happen if the value is set too low. But I get these endless loops even if
I raise it as high as 60%. To me that's not intuitive.
> > 
> > Before, I had it set to 70% and then I didn't get these endless compaction loops,
but then I generally consumed a lot more disk space than I do now. 
> > 
> > To me, at least, it would be more intuitive if the number stood for how much unnecessary
space was allowed before compaction takes place. So for example if I had a 10GB database
file and it was 20% fragmented, it would after compaction be 8GB and 0% fragmented. It might
(?) be harder to calculate the numbers that way, but it would be much easier to reason about
when configuring your database server.
> > 
> > /Calle
> > 
> >> On Sat, Oct 5, 2013, at 10:26, Robert Newson wrote:
> >> 
> >> It makes intuitive sense that setting that % too low will cause endless (and
pointless) compactions (the ratio of disk_size to data_size exceeding your % immediately after
compaction). I'm fairly sure, for example, that the data_size value does not include the space
consumed by the many database footers in the file.
> >> 
> >> B.
> >> 
> >>> On 5 Oct 2013, at 07:43, Calle Arnesten <calle.arnesten@codekick.com>
wrote:
> >>> 
> >>> I tried changing db_fragmentation to different levels. If I raise
it to 70% the compaction stops, but for 60% and lower it keeps running all the time. 
> >>> 
> >>> So there seems to be something weird with how CouchDB calculates the fragmentation
level. As I said, I have a large percentage of deleted documents in the database, so perhaps
it is not including them correctly in the calculation? It could definitely be near 70% of
the database size that is deleted documents.
> >>> 
> >>>> On Fri, Oct 4, 2013, at 10:17, Calle Arnesten wrote:
> >>>> Hi,
> >>>> 
> >>>> I recently upgraded from CouchDB 1.2 to 1.4. I have noticed that the
database compaction is running more or less all the time during the allowed compaction time.
Is there a known issue for this with 1.4?
> >>>> 
> >>>> The compaction is completed on each run and the reported database size
is smaller on the first run during the compaction time. But then it starts again for the same
database, and when completed, starts again, etc. It's like it thinks that the database is
still fragmented even if it's not.
> >>>> 
> >>>> The databases are quite large (~5GB), so it's not the case that many
documents have had time to change during the compaction time.
> >>>> 
> >>>> These are my settings:
> >>>> [{db_fragmentation, "20%"}, {view_fragmentation, "20%"}, {from, "03:00"},
{to, "11:00"}]
> >>>> 
> >>>> The hard drive is not full; it has about 70GB of free space. 
> >>>> 
> >>>> I have a large percentage of deleted documents, if that might be a reason
for the issue/bug. 
> >>>> 
> >>>> I don't have the same problem for view compaction.
> >>>> 
> >>>> Best regards
> >>>> Calle Arnesten
> >> 
> >> Email had 1 attachment:
> >> + signature.asc
> >>  1k (application/pgp-signature)
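
For readers following the thread, here is a rough sketch of the check being discussed. It assumes the fragmentation value is computed as (disk_size - data_size) / disk_size * 100 and compared against the configured db_fragmentation threshold. The function names and the byte counts in the second example are hypothetical, chosen only to show how an under-reported data_size could keep retriggering compaction at 60% but not at 70%; they are not measurements from the database in this thread.

```python
def fragmentation_percent(disk_size: int, data_size: int) -> float:
    """Share of the file (in percent) not accounted for by live data."""
    if disk_size <= 0:
        return 0.0
    return 100 * (disk_size - data_size) / disk_size


def needs_compaction(disk_size: int, data_size: int, threshold_percent: float) -> bool:
    """True when the fragmentation meets or exceeds the configured threshold."""
    return fragmentation_percent(disk_size, data_size) >= threshold_percent


GB = 1024 ** 3

# Calle's example: a 10GB file holding 8GB of live data is 20% fragmented.
example = fragmentation_percent(10 * GB, 8 * GB)

# Hypothetical post-compaction numbers: if data_size leaves out footers and
# other overhead, a freshly compacted file can still look heavily fragmented.
after_disk, after_data = 5_000_000_000, 1_800_000_000
still_fragmented = fragmentation_percent(after_disk, after_data)

print(example)                                          # 10GB / 8GB case
print(still_fragmented)                                 # hypothetical case
print(needs_compaction(after_disk, after_data, 60))     # retriggers at 60%
print(needs_compaction(after_disk, after_data, 70))     # stops at 70%
```

If the reported data_size stays well below the on-disk size even right after a compaction, any threshold under that residual ratio is satisfied again immediately, which would match the repeated runs described above.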
