jackrabbit-oak-issues mailing list archives

From "Thomas Mueller (Jira)" <j...@apache.org>
Subject [jira] [Commented] (OAK-9052) Reindexing using --doc-traversal-mode may need a lot of memory
Date Wed, 06 May 2020 14:03:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100829#comment-17100829 ]

Thomas Mueller commented on OAK-9052:
-------------------------------------

https://github.com/oak-indexing/jackrabbit-oak/pull/154

With the memory setting "0" (the default value), a temporary file is created for the linked
list, so that heap memory usage is constant (around 30 MB I guess). Internally, a persistent
key-value store, the H2 MVStore, is used (the same one as used by the MongoMK for the persistent
cache). Every minute, the file is compacted (configurable using the "oak.indexer.linkedList.compactMillis"
system property).
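
To illustrate why keeping the list in a temporary file keeps heap usage constant, here is a minimal sketch. It is not the code from the pull request and it does not use the H2 MVStore. It swaps in a plain java.io.RandomAccessFile and omits compaction entirely. The class name FileBackedQueue and its methods are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.NoSuchElementException;

// Hypothetical illustration only: a FIFO list whose entries live in a
// temporary file. This is NOT Oak's implementation (which uses the H2
// MVStore) and it never compacts the file.
class FileBackedQueue implements AutoCloseable {
    private final File file;
    private final RandomAccessFile raf;
    private long readPos;   // file offset of the oldest entry
    private long writePos;  // file offset where the next entry is appended
    private int size;

    FileBackedQueue() throws IOException {
        file = File.createTempFile("linkedList", ".tmp");
        file.deleteOnExit();
        raf = new RandomAccessFile(file, "rw");
    }

    // Append one entry as a length-prefixed UTF-8 record.
    void add(String value) throws IOException {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        raf.seek(writePos);
        raf.writeInt(data.length);
        raf.write(data);
        writePos = raf.getFilePointer();
        size++;
    }

    // Read and logically remove the oldest entry.
    String remove() throws IOException {
        if (size == 0) {
            throw new NoSuchElementException();
        }
        raf.seek(readPos);
        byte[] data = new byte[raf.readInt()];
        raf.readFully(data);
        readPos = raf.getFilePointer();
        size--;
        return new String(data, StandardCharsets.UTF_8);
    }

    int size() {
        return size;
    }

    @Override
    public void close() throws IOException {
        raf.close();
        file.delete();
    }

    public static void main(String[] args) throws IOException {
        try (FileBackedQueue q = new FileBackedQueue()) {
            q.add("/content/a");
            q.add("/content/a/b");
            System.out.println(q.remove() + ", remaining: " + q.size());
        }
    }
}
```

Between calls, only two offsets and a counter stay on the heap, no matter how many entries have been added. The file itself keeps growing, because removed entries are never reclaimed, which is the problem periodic compaction addresses.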

It's possible to use the old behavior by setting the system property "oak.indexer.memLimitInMB"
to 100.
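
As a rough sketch of how the two system properties could be read, assuming standard JVM system-property lookup: the class IndexerSettings and its members are hypothetical and are not taken from the Oak code base. The 60000 ms default is inferred from "every minute" above, and the millisecond unit from the property name.

```java
// Hypothetical helper, not part of Oak: reads the two properties named
// in this comment, falling back to the defaults described there.
class IndexerSettings {
    static final String MEM_LIMIT_PROP = "oak.indexer.memLimitInMB";
    static final String COMPACT_PROP = "oak.indexer.linkedList.compactMillis";

    final int memLimitInMB;
    final long compactMillis;

    IndexerSettings() {
        // "0" is the default: use the temporary file instead of the heap.
        memLimitInMB = Integer.getInteger(MEM_LIMIT_PROP, 0);
        // Assumed default of one minute, expressed in milliseconds.
        compactMillis = Long.getLong(COMPACT_PROP, 60_000L);
    }

    // True when the file-backed list should be used (memory setting 0).
    boolean useTemporaryFile() {
        return memLimitInMB == 0;
    }

    public static void main(String[] args) {
        IndexerSettings s = new IndexerSettings();
        System.out.println("memLimitInMB=" + s.memLimitInMB
                + ", compactMillis=" + s.compactMillis
                + ", temporaryFile=" + s.useTemporaryFile());
    }
}
```

Properties like these are normally passed to the JVM with -D, for example -Doak.indexer.memLimitInMB=100 placed before -jar on the java command line that starts oak-run.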

> Reindexing using --doc-traversal-mode may need a lot of memory
> --------------------------------------------------------------
>
>                 Key: OAK-9052
>                 URL: https://issues.apache.org/jira/browse/OAK-9052
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: indexing, mongomk
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Major
>
> Indexing using oak-run and --doc-traversal-mode uses the FlatFileStore. For aggregation,
> there is a limit on memory usage, by default around 100 MB. Depending on the content structure,
> this limit can be exceeded.
> It would be good to find a way to avoid a memory limit, for example using a temporary
> storage (a file, or a persistent key/value store).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)