jackrabbit-oak-issues mailing list archives

From "Francesco Mari (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-7914) Cleanup updates the gc.log after a failed compaction
Date Tue, 04 Dec 2018 10:12:00 GMT

     [ https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francesco Mari updated OAK-7914:
    Priority: Critical  (was: Major)

> Cleanup updates the gc.log after a failed compaction
> ----------------------------------------------------
>                 Key: OAK-7914
>                 URL: https://issues.apache.org/jira/browse/OAK-7914
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>            Reporter: Francesco Mari
>            Priority: Critical
>             Fix For: 1.10
>         Attachments: compaction.log
> The {{gc.log}} is always updated during the cleanup phase, regardless of the result of the compaction phase. This might cause a scenario similar to the following.
> - A repository of 100GB, of which 40GB is garbage, is compacted.
> - The estimation phase decides it's OK to compact.
> - Compaction produces a new head state, adding another 60GB.
> - Compaction fails, maybe because of too many concurrent commits.
> - Cleanup removes the 60GB generated during compaction.
> - Cleanup adds an entry to the {{gc.log}} recording the current size of the repository, 100GB.
> Now, let's imagine that compaction is run shortly after that. The amount of content added to the repository is negligible. For the sake of simplicity, let's say that the size of the repository hasn't changed. The following happens.
> - The repository is 100GB, of which 40GB is the same garbage that wasn't removed above.
> - The estimation phase decides it's not OK to compact, because the {{gc.log}} reports that the latest known size of the repository is 100GB, and there is not enough new content to justify compaction.
> This is in fact a bug, because there are 40GB worth of garbage in the repository, but estimation is not able to see that anymore. The solution seems to be not to update the {{gc.log}} if compaction fails. In other words, {{gc.log}} should contain the size of the *compacted* repository over time, and no more.
> Thanks to [~rma61870@adobe.com] for reporting it.
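
The scenario above can be sketched as a small, self-contained simulation. This is only an illustration, not the segment-tar code: the class GcLogSketch, its methods, the in-memory list standing in for the gc.log, and the 10GB growth threshold are all assumptions made for the example. It models estimation as a comparison between the current size and the last size recorded in the log, and shows how skipping the log update after a failed compaction keeps the 40GB of garbage visible to the next estimation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the behaviour described in OAK-7914.
// None of these names come from the Oak code base.
class GcLogSketch {

    // Assumed threshold: compact only if the repository grew by at
    // least this much since the last size recorded in the log.
    static final long GROWTH_THRESHOLD_GB = 10;

    // In-memory stand-in for the gc.log: one recorded size (GB) per entry.
    final List<Long> gcLog = new ArrayList<>();

    // false: always log during cleanup (behaviour reported in the issue).
    // true: log only after a successful compaction (proposed fix).
    final boolean logOnlyAfterSuccess;

    GcLogSketch(boolean logOnlyAfterSuccess) {
        this.logOnlyAfterSuccess = logOnlyAfterSuccess;
    }

    // Estimation: OK to compact if nothing was logged yet, or if the
    // repository grew enough since the last logged size.
    boolean estimate(long currentSizeGb) {
        if (gcLog.isEmpty()) {
            return true;
        }
        long lastKnownSizeGb = gcLog.get(gcLog.size() - 1);
        return currentSizeGb - lastKnownSizeGb >= GROWTH_THRESHOLD_GB;
    }

    // One GC cycle: estimation, compaction, cleanup. Returns the size of
    // the repository after cleanup.
    long runGc(long sizeGb, long garbageGb, boolean compactionSucceeds) {
        if (!estimate(sizeGb)) {
            return sizeGb; // estimation skipped compaction
        }
        // After a failed compaction, cleanup only removes what compaction
        // itself wrote, so the original garbage is still there.
        long sizeAfterCleanup = compactionSucceeds ? sizeGb - garbageGb : sizeGb;
        if (compactionSucceeds || !logOnlyAfterSuccess) {
            gcLog.add(sizeAfterCleanup);
        }
        return sizeAfterCleanup;
    }

    public static void main(String[] args) {
        GcLogSketch reported = new GcLogSketch(false);
        reported.runGc(100, 40, false); // compaction fails, 100GB is logged anyway
        System.out.println("always log, next estimation: " + reported.estimate(100));

        GcLogSketch proposed = new GcLogSketch(true);
        proposed.runGc(100, 40, false); // compaction fails, nothing is logged
        System.out.println("log on success only, next estimation: " + proposed.estimate(100));
    }
}
```

With the always-log variant the second estimation sees no growth relative to the logged 100GB and declines to compact. With the log-on-success variant the failed cycle leaves no entry behind, so the next estimation still allows compaction.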

This message was sent by Atlassian JIRA
