jackrabbit-oak-issues mailing list archives

From "Marcel Reutegger (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (OAK-5186) ChangeSetFilterImpl: support many includePaths by filtering for 1st path name
Date Wed, 14 Dec 2016 16:55:58 GMT

     [ https://issues.apache.org/jira/browse/OAK-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger resolved OAK-5186.
-----------------------------------
    Resolution: Fixed

Applied the patch and added a test to assess the performance impact. The time to run the test
loop went down from a couple of seconds to some tens of milliseconds on my machine.

In trunk: http://svn.apache.org/r1774292

> ChangeSetFilterImpl: support many includePaths by filtering for 1st path name
> -----------------------------------------------------------------------------
>
>                 Key: OAK-5186
>                 URL: https://issues.apache.org/jira/browse/OAK-5186
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.5.14
>            Reporter: Stefan Egli
>            Assignee: Marcel Reutegger
>            Priority: Critical
>             Fix For: 1.5.16, 1.6
>
>         Attachments: OAK-5186.patch
>
>
> When the ChangeSetFilterImpl has a large number of include paths and is combined with a
> large-ish ChangeSet (many paths), the comparison becomes expensive: there is a loop over
> each ChangeSet path, and inside it a loop over each include path. Basically an {{O(n*m)}}.
> A probably ideal solution would be to implement a tree whose items are the path
> elements, and to have two such trees: the filter one and the ChangeSet one.
> A simpler and perhaps 'good enough' solution could be to just look at the first-level
> name of both the filter include paths and the ChangeSet paths: if a ChangeSet path's
> first-level name is not in the set of first-level names of the include paths, then it
> can't be included. That would allow skipping the pattern comparison (which is
> slower even though it is a compiled {{Pattern}}).
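
The 'good enough' idea from the description can be sketched roughly as follows. This is not the code from OAK-5186.patch or the actual ChangeSetFilterImpl; the class name FirstSegmentPrefilter, its methods, and the simplified glob handling ('*' matches within one path element, '**' matches across elements) are all assumptions made for illustration. The sketch falls back to plain pattern matching when an include path has no literal first-level name.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical illustration only; not Oak's ChangeSetFilterImpl.
final class FirstSegmentPrefilter {
    private final Set<String> firstNames = new HashSet<>();
    private final List<Pattern> patterns = new ArrayList<>();
    // Becomes false when some include path has no literal first-level name,
    // e.g. "/**"; in that case the cheap check cannot rule anything out.
    private boolean prefilterUsable = true;

    FirstSegmentPrefilter(Collection<String> includeGlobs) {
        for (String glob : includeGlobs) {
            String first = firstSegment(glob);
            if (first.isEmpty() || first.contains("*")) {
                prefilterUsable = false;
            } else {
                firstNames.add(first);
            }
            patterns.add(Pattern.compile(globToRegex(glob)));
        }
    }

    // Returns the first path element, e.g. "content" for "/content/a/b".
    static String firstSegment(String path) {
        int start = path.startsWith("/") ? 1 : 0;
        int end = path.indexOf('/', start);
        return end < 0 ? path.substring(start) : path.substring(start, end);
    }

    // Simplified glob translation (an assumption, not Oak's GlobbingPathHelper).
    static String globToRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*') {
                boolean doubleStar = i + 1 < glob.length() && glob.charAt(i + 1) == '*';
                sb.append(doubleStar ? ".*" : "[^/]*");
                if (doubleStar) {
                    i++; // consume the second '*'
                }
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    boolean includes(String changedPath) {
        // Cheap set lookup first: skips the O(m) pattern loop for most paths.
        if (prefilterUsable && !firstNames.contains(firstSegment(changedPath))) {
            return false;
        }
        for (Pattern p : patterns) {
            if (p.matcher(changedPath).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

With include paths such as "/content/**" and "/etc/*", a changed path under "/var" is rejected by the set lookup alone and never reaches the pattern loop. The worst case is still {{O(n*m)}} when all paths share the same first-level names, which is why the description calls the tree the probably ideal solution.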



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
