jackrabbit-dev mailing list archives

From "Alex Parvulescu (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905
Date Wed, 30 Nov 2011 16:09:39 GMT

    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160113#comment-13160113 ]

Alex Parvulescu commented on JCR-3162:
--------------------------------------

Looking into this problem showed that it occurs during a cluster sync operation involving an
instance that has been running for a really long time.

At this point I'm not sure what "a really long time" means exactly, but it appears that after
a while the journal revision resets to 0.
This causes the cluster slave to sync from a lower revision number and fetch journal records
it has already processed, which makes the repository index them again.
If the current index already corresponds to a higher revision number, re-indexing those
records produces duplicates in the index.

JCR-905 tried to address this by first deleting from the index all records that come from an
external source (the cluster sync) before adding them. That extra delete is the overhead this
issue is about.
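
To make the trade-off concrete, here is a minimal sketch of plain replay versus delete-before-add
replay. It is a toy model under stated assumptions, not Jackrabbit code: ReplaySketch, ToyIndex,
applyNaive and applyDeleteFirst are hypothetical names, and each journal record is reduced to a
single node id.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model, not Jackrabbit code.
final class ReplaySketch {

    /** Append-only index: add() never checks whether the node is already indexed. */
    static final class ToyIndex {
        private final List<String> docs = new ArrayList<>();

        void add(String nodeId) {
            docs.add(nodeId);
        }

        void delete(String nodeId) {
            docs.removeIf(nodeId::equals);
        }

        int count(String nodeId) {
            int n = 0;
            for (String doc : docs) {
                if (doc.equals(nodeId)) {
                    n++;
                }
            }
            return n;
        }
    }

    /** Plain replay: a record that is fetched twice is indexed twice. */
    static void applyNaive(ToyIndex index, List<String> journalRecords) {
        for (String nodeId : journalRecords) {
            index.add(nodeId);
        }
    }

    /** Delete-before-add replay: safe to repeat, but pays for a delete on every record. */
    static void applyDeleteFirst(ToyIndex index, List<String> journalRecords) {
        for (String nodeId : journalRecords) {
            index.delete(nodeId);
            index.add(nodeId);
        }
    }
}
```

Replaying the same records twice with applyNaive leaves two entries per node, while
applyDeleteFirst leaves one, at the cost of a delete for every record even when nothing was
replayed.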

The proposed solution tries to determine on repository startup whether the index is stale and,
if so, forces a full reindex by deleting it.
The index is currently considered stale if the journal revision is 0 and index files are
already present in the repository.
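
The staleness rule described above can be sketched as follows. This is only an illustration
under stated assumptions: StaleIndexCheck, hasIndexFiles and isIndexStale are hypothetical
names, and "index files present" is approximated as "the index directory is not empty". The
actual patch may check this differently.

```java
import java.io.File;

// Hypothetical helper, not part of the Jackrabbit API.
final class StaleIndexCheck {

    /** True if the directory exists and contains at least one entry. */
    static boolean hasIndexFiles(File indexDir) {
        // File.list() returns null when the path is missing or is not a directory.
        String[] entries = indexDir.list();
        return entries != null && entries.length > 0;
    }

    /** Stale means: the journal starts over at revision 0 but an index is already on disk. */
    static boolean isIndexStale(long journalRevision, File indexDir) {
        return journalRevision == 0L && hasIndexFiles(indexDir);
    }
}
```

A caller would run this once at startup, before the first cluster sync, and delete the index
directory when it returns true so that a full reindex is triggered.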

Interestingly, this happens a lot during tests, where the index is preserved from one restart
to the next but the journal implementation is memory-based, so it is reset every time.

The solution has some issues because the SearchIndex is initialized asynchronously for
workspaces other than "default". This means that by the time the SearchIndex gets initialized,
the cluster node has already synced to a revision higher than 0, even if the revision was 0
when the repository was starting up.
This doesn't apply to the default workspace.
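
The timing problem can be reproduced in a small, deterministic simulation. Everything here is
hypothetical (LateInitSketch, looksStale, simulate), and the startup snapshot is shown only as
one possible way around the problem, not as something the patch does.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simulation of the ordering problem, not Jackrabbit code.
final class LateInitSketch {

    static boolean looksStale(long revision, boolean indexFilesPresent) {
        return revision == 0L && indexFilesPresent;
    }

    /**
     * Returns two results for a workspace whose index files are already on disk:
     * [0] the check run at late initialization against the live revision,
     * [1] the same check run against a revision snapshot taken at startup.
     */
    static boolean[] simulate(long revisionAfterSync) {
        AtomicLong journalRevision = new AtomicLong(0L); // journal was reset to 0
        long startupSnapshot = journalRevision.get();    // captured at startup

        // The cluster sync runs before the non-default workspace index is initialized.
        journalRevision.set(revisionAfterSync);

        boolean liveCheck = looksStale(journalRevision.get(), true);
        boolean snapshotCheck = looksStale(startupSnapshot, true);
        return new boolean[] { liveCheck, snapshotCheck };
    }
}
```

With any revision above 0 after the sync, the live check no longer reports a stale index, while
the check against the startup snapshot still does.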

                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it
> processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
