jackrabbit-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Parvulescu (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (JCR-2906) Multivalued property sorted by last/random value
Date Tue, 29 Nov 2011 00:33:40 GMT

     [ https://issues.apache.org/jira/browse/JCR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Alex Parvulescu updated JCR-2906:

    Attachment: JCR-2906-v3.patch

Attaching v3 of the patch.

I've taken into consideration the point about performance (creating and compacting the arrays)
and I've come up with a better offset-based version:

It is optimized for the standard case when there is only one term to create a single-element
array & save the term's position in the index, as opposed to creating an array of considerable
size. So no more shrink method.

Next, based on the following terms' index, it uses an internal offset to optimize the creation
of the array of terms.

This should have a considerably smaller impact on perf.

Thanks for the feedback so far.

> Multivalued property sorted by last/random value
> ------------------------------------------------
>                 Key: JCR-2906
>                 URL: https://issues.apache.org/jira/browse/JCR-2906
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 2.2
>         Environment: Windows 7, Sun JDK 1.6.0_23
>            Reporter: Paul Lysak
>              Labels: multivalued, sort
>         Attachments: JCR-2906-SharedFieldCache.patch, JCR-2906-v2.patch, JCR-2906-v3.patch,
> Sorting on multivalued property may produce incorrect result because sorting is performed
only by last value of multivalued property.
> Steps to reproduce:
> 1. Create multivalued field in repository. Example from nodetypes file:
> <propertyDefinition name="MyProperty" requiredType="String" autoCreated="false" mandatory="false"
>    onParentVersion="COPY" protected="false" multiple="false">
> 2. Create few records so that all records except one would contain single value for MyProperty
and one record would contain 
> first value which is greater then of any other record and the second value is somewhere
in the middle. Here is an example:
> 1st record: "aaaa"
> 2nd record: "cccc"
> 3rd record: "dddd", "bbbb"
> 3. Run some query which sorts Example of XPath query:
> //*[...here are some criteria...] order by @MyProperty ascending
> The query would return documents in such order:
> "aaaa"
> "dddd", "bbbb"
> "cccc"
> which is not expected order (expected same order as they were entered - as "aaaa" <
"cccc", "cccc" < "dddd")
> After some digging I found out that it happens because method 
> org.apache.jackrabbit.core.query.lucene.SharedFieldCache.getValueIndex
> (called from org.apache.jackrabbit.core.query.lucene.SharedFieldSortComparator.SimpleScoreDocComparator
> returns only last Comparable of the document. Here is overwrites previous value:
> retArray[termDocs.doc()] = getValue(value, type);
> I tried to concatenate comparables (just to check if it would work for my case):
> 	if(retArray[termDocs.doc()] == null) {
> 		retArray[termDocs.doc()] = getValue(value, type);
> 	} else {
> 		retArray[termDocs.doc()] =
> 				retArray[termDocs.doc()] + " " + getValue(value, type);
> 	}
> But it didn't worked well either - TermEnum returns terms not in the same order as JackRabbit
returns values of multivalued field
> (as an example ["qwer", "asdf"] may become ["asdf", "qwer"] ). So, simple concatenation
doesn't help.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message