couchdb-dev mailing list archives

From "Adam Kocoloski (JIRA)" <j...@apache.org>
Subject [jira] Updated: (COUCHDB-888) out of memory crash when compacting document with lots of edit branches
Date Fri, 17 Sep 2010 14:23:33 GMT

     [ https://issues.apache.org/jira/browse/COUCHDB-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Kocoloski updated COUCHDB-888:
-----------------------------------

    Attachment: 0001-fix-OOME-when-compacting-a-DB-with-many-edit-conflic.patch

This patch against trunk should fix the problem.

I realized the map traversal itself was not a major issue, since the revision tree is mapped
every time a document is read or written.  I figured the problem must be the specific map
function used in a traversal in the compactor code.  I looked at copy_rev_tree_attachments
and realized that the compactor loaded the document bodies for every leaf of 1000 documents
into memory simultaneously.  When there are no edit conflicts this is fine, but if each
document has ~100 conflicts we are effectively loading 100k (1000 x 100) document bodies
into memory.
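
To make the arithmetic above concrete, here is a minimal Python sketch of the two loading strategies. It is not the patch (which is Erlang) and does not reproduce copy_rev_tree_attachments; the names copy_batch_eager, copy_batch_streaming, load_body and write_body are hypothetical, and a "document" is modelled simply as a list of leaf revision ids.

```python
from typing import Callable, List

# Hypothetical model: a document is just the list of its leaf revision ids.
Doc = List[str]


def copy_batch_eager(docs: List[Doc],
                     load_body: Callable[[str], bytes],
                     write_body: Callable[[bytes], None]) -> int:
    """Load every leaf body of the whole batch, then write them all.

    Returns the peak number of bodies held in memory at once.
    """
    bodies = [load_body(rev) for doc in docs for rev in doc]
    peak = len(bodies)  # every body of every leaf is alive here
    for body in bodies:
        write_body(body)
    return peak


def copy_batch_streaming(docs: List[Doc],
                         load_body: Callable[[str], bytes],
                         write_body: Callable[[bytes], None]) -> int:
    """Load and write one leaf body at a time.

    Returns the peak number of bodies held in memory at once.
    """
    peak = 0
    for doc in docs:
        for rev in doc:
            body = load_body(rev)  # only this one body is alive
            peak = max(peak, 1)
            write_body(body)
    return peak
```

With 1000 documents of 100 leaves each, the eager version reports a peak of 100000 bodies and the streaming version a peak of 1, while both write the same bodies in the same order.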

A BigCouch version of this patch was able to compact our problem database with no appreciable
memory usage.  `make check` and the compact portion of the Futon suite pass.

There should be no difference in indexing performance for databases without attachments. 
I haven't tested the effect of eliminating the "contiguous document bodies" optimization on
the indexing time of a database with lots of attachments.  If it turns out to be a big regression
we could consider tracking the total number of document bodies (including conflicts) in the
accumulator that determines when to flush to disk.  However, I think this version is quite
a bit simpler, so I'd want to see a benchmark that proves we really have something to gain
there.
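
For reference, the alternative mentioned above (counting bodies, conflicts included, in the flush accumulator) might look roughly like the following Python sketch. This is an illustration under assumptions, not code from CouchDB: BodyCountingBuffer, max_bodies and the flush callback are hypothetical names, and a document is again modelled as a list of leaf revision ids.

```python
from typing import Callable, List

Doc = List[str]  # hypothetical: a document is its list of leaf revision ids


class BodyCountingBuffer:
    """Buffer documents and flush once the total number of bodies
    (one per leaf, so conflicts count) reaches max_bodies."""

    def __init__(self, max_bodies: int,
                 flush: Callable[[List[Doc]], None]) -> None:
        self.max_bodies = max_bodies
        self._flush = flush
        self.pending: List[Doc] = []
        self.body_count = 0

    def add(self, doc: Doc) -> None:
        self.pending.append(doc)
        self.body_count += len(doc)  # count leaves, not documents
        if self.body_count >= self.max_bodies:
            self.flush_now()

    def flush_now(self) -> None:
        """Hand the buffered documents to the flush callback and reset."""
        if self.pending:
            self._flush(self.pending)
            self.pending = []
            self.body_count = 0
```

With max_bodies=1000, documents without conflicts are flushed in groups of 1000, while documents with 100 leaves each are flushed in groups of 10, so the number of bodies buffered at once stays bounded either way.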


> out of memory crash when compacting document with lots of edit branches
> -----------------------------------------------------------------------
>
>                 Key: COUCHDB-888
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-888
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>            Reporter: Adam Kocoloski
>         Attachments: 0001-fix-OOME-when-compacting-a-DB-with-many-edit-conflic.patch,
>                      key_tree_backtrace.txt.gz
>
>
> I have a database which will crash CouchDB if I try to compact it.  It causes beam.smp
> to use all the memory on the server.  I caught it in the act one time and sorted the Erlang
> processes by memory usage.  The process spawned to do the compaction turned out to be the
> culprit.  I took a backtrace of the process and found that it was mapping a very large revision
> tree.  I have reason to believe that the document has a large number (thousands) of edit
> conflicts.
> I think part of the problem may be that the recursion in couch_key_tree:map_simple requires
> stack space for every iteration.  I'm not sure if it's possible to rewrite the algorithm
> in a more memory-friendly way given the current tree structure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

