jackrabbit-dev mailing list archives

From "Ard Schrijvers (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property
Date Mon, 27 Aug 2007 15:49:30 GMT

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523033 ]

Ard Schrijvers commented on JCR-1064:

Aaah, I am sorry for the System.out. I replaced a patch and left a System.out in it. Stupid, I'll
remove it! I'll fix the other 4 (-) as well.

About the parent handler: I knew that the system index can be in the old format, but AFAICS
this is never an issue. When I am searching the index for workspace X, I think it does not matter
whether the parent index is in the old format. (I am running the tests with the parent index
in the old format and the workspace index in the new format, and this is no problem.)

As I see your example:

A user may do the following:
- Upgrade a pre 1.4 repository (-> all indexes are V1)
- Re-index a workspace (-> workspace index will be V2)
- Execute a query on the workspace (-> will use V2 for queries) 

This will just run fine, as I tested it this way. You can have workspaces with old-style indexes
alongside workspaces with new-style indexes, and the system index can be in either format.

It is hard to make this nicely backwards compatible, because the MultiIndex creates the index
itself when no index exists yet.

For example, when in SearchIndex.doInit() the following line is executed

index = new MultiIndex(indexDir, this, context.getItemStateManager(),
                context.getRootId(), excludedIDs, nsMappings);

the system index is created. Because this is *before* the setIndexFormatVersion part in doInit(),
in NodeIndexer this part

if(indexFormatVersion == IndexFormatVersion.V2) {

will never be entered, since indexFormatVersion == null at that point. This means the system index
is always indexed without the PROPERTIES_SET, and therefore always in the old format.
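
To make the ordering problem concrete, here is a minimal, self-contained sketch. The Handler and Index classes are made-up stand-ins, not the real SearchIndex, MultiIndex or NodeIndexer; the only thing taken from the code above is the `indexFormatVersion == IndexFormatVersion.V2` check, and the sketch assumes the initial indexing happens inside the constructor.

```java
public class InitOrderSketch {
    enum IndexFormatVersion { V1, V2 }

    /** Hypothetical stand-in for the handler that owns the format version. */
    static class Handler {
        IndexFormatVersion indexFormatVersion; // stays null until explicitly set

        /** Same comparison as the check quoted above: null never equals V2. */
        boolean writesPropertiesSet() {
            return indexFormatVersion == IndexFormatVersion.V2;
        }
    }

    /** Hypothetical stand-in for an index that is populated when it is constructed. */
    static class Index {
        final boolean hasPropertiesSet;

        Index(Handler handler) {
            // Assumption: initial indexing runs here and sees the handler's current state.
            this.hasPropertiesSet = handler.writesPropertiesSet();
        }
    }

    /** Current order: create the index first, set the version afterwards. */
    static boolean createThenSet() {
        Handler handler = new Handler();
        Index index = new Index(handler);
        handler.indexFormatVersion = IndexFormatVersion.V2;
        return index.hasPropertiesSet;
    }

    /** Alternative order: set the version first, then create the index. */
    static boolean setThenCreate() {
        Handler handler = new Handler();
        handler.indexFormatVersion = IndexFormatVersion.V2;
        Index index = new Index(handler);
        return index.hasPropertiesSet;
    }

    public static void main(String[] args) {
        System.out.println("create, then set version: " + createThenSet());
        System.out.println("set version, then create: " + setThenCreate());
    }
}
```

In this model the first order yields an index without the extra field and the second yields one with it, which is the difference described above.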

Now, I just tested setting the default index format first, before the new MultiIndex call, like:

index = new MultiIndex(indexDir, this, context.getItemStateManager(),
               context.getRootId(), excludedIDs, nsMappings);

(the format might later in doInit() be set to V1)

so when a new index is created here, I get an index with the PROPERTIES_SET. But... I do not
know whether new MultiIndex(...) also indexes when the index already exists, in which case
it might write PROPERTIES_SET into an index that should stay in the old format. I hope I am
being reasonably clear about the problems? :-)

I'll re-add the patch with your first 4 (-) solved and wait to see whether you can comment on
my point about the parent handler,

thanks for reviewing :-) 

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery,
> that through the MatchAllWeight uses the MatchAllScorer. The calculateDocFilter() in MatchAllScorer
> does not scale and becomes slow for growing number of nodes.
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices built without this performance improvement should still work and
> fall back to the original implementation
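
A rough illustration of the idea in the issue description, in plain Java without Lucene. It is a simplified model under stated assumptions, not the MatchAllScorer algorithm or the patch itself: documents are reduced to sets of property names, scanAll() stands for the slow fallback path, and buildPostings()/lookup() stand for a term lookup on the PROPERTIES_SET field. All method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PropertiesSetSketch {
    static final String PROPERTIES_SET = "_:PROPERTIES_SET";

    /** Fallback model: visit every document and test for the property. */
    static List<Integer> scanAll(List<Set<String>> docs, String property) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            if (docs.get(id).contains(property)) {
                hits.add(id);
            }
        }
        return hits;
    }

    /** Index time: record each document's property names under PROPERTIES_SET. */
    static Map<String, List<Integer>> buildPostings(List<Set<String>> docs) {
        Map<String, List<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String name : docs.get(id)) {
                postings.computeIfAbsent(PROPERTIES_SET + ":" + name,
                        key -> new ArrayList<>()).add(id);
            }
        }
        return postings;
    }

    /** Query time: one map lookup instead of a pass over all documents. */
    static List<Integer> lookup(Map<String, List<Integer>> postings, String property) {
        return postings.getOrDefault(PROPERTIES_SET + ":" + property, new ArrayList<>());
    }

    /** Three sample documents; documents 0 and 2 carry the "mytext" property. */
    static List<Set<String>> sampleDocs() {
        List<Set<String>> docs = new ArrayList<>();
        docs.add(new HashSet<>(Arrays.asList("mytext", "title")));
        docs.add(new HashSet<>(Arrays.asList("title")));
        docs.add(new HashSet<>(Arrays.asList("mytext")));
        return docs;
    }

    public static void main(String[] args) {
        List<Set<String>> docs = sampleDocs();
        System.out.println("scan:   " + scanAll(docs, "mytext"));
        System.out.println("lookup: " + lookup(buildPostings(docs), "mytext"));
    }
}
```

Both paths return the same document ids; the difference is that the lookup cost does not grow with the number of documents that lack the property.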

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
