Date: Tue, 29 Nov 2011 00:33:40 +0000 (UTC)
From: "Alex Parvulescu (Updated) (JIRA)"
To: dev@jackrabbit.apache.org
Reply-To: dev@jackrabbit.apache.org
Message-ID: <1747096339.20404.1322526820184.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <10520268.5224.1298993616946.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (JCR-2906) Multivalued property sorted by last/random value

     [ https://issues.apache.org/jira/browse/JCR-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-2906:
---------------------------------

    Attachment: JCR-2906-v3.patch

Attaching v3 of the patch.

I've taken into consideration the point about performance (creating and compacting the arrays) and I've come up with a better, offset-based version:

It is optimized for the standard case, when there is only one term: it creates a single-element array and saves the term's position in the index, as opposed to creating an array of considerable size. So no more shrink method.
Next, based on the following terms' index, it uses an internal offset to optimize the creation of the array of terms.

This should have a considerably smaller impact on performance.

Thanks for the feedback so far.

> Multivalued property sorted by last/random value
> ------------------------------------------------
>
>                 Key: JCR-2906
>                 URL: https://issues.apache.org/jira/browse/JCR-2906
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 2.2
>         Environment: Windows 7, Sun JDK 1.6.0_23
>            Reporter: Paul Lysak
>              Labels: multivalued, sort
>         Attachments: JCR-2906-SharedFieldCache.patch, JCR-2906-v2.patch, JCR-2906-v3.patch, JCR-2906.patch
>
>
> Sorting on a multivalued property may produce an incorrect result, because sorting is performed only by the last value of the multivalued property.
> Steps to reproduce:
> 1. Create a multivalued field in the repository. Example from the nodetypes file:
> onParentVersion="COPY" protected="false" multiple="false">
> 2. Create a few records so that all records except one contain a single value for MyProperty, and one record contains
> a first value which is greater than that of any other record and a second value which is somewhere in the middle.
> Here is an example:
> 1st record: "aaaa"
> 2nd record: "cccc"
> 3rd record: "dddd", "bbbb"
> 3. Run a query that sorts on this property. Example of an XPath query:
> //*[...here are some criteria...] order by @MyProperty ascending
> The query returns the documents in this order:
> "aaaa"
> "dddd", "bbbb"
> "cccc"
> which is not the expected order (expected is the order in which they were entered, since "aaaa" < "cccc" and "cccc" < "dddd").
> After some digging I found out that this happens because the method
> org.apache.jackrabbit.core.query.lucene.SharedFieldCache.getValueIndex
> (called from the org.apache.jackrabbit.core.query.lucene.SharedFieldSortComparator.SimpleScoreDocComparator constructor)
> returns only the last Comparable of the document. Here it overwrites the previous value:
> retArray[termDocs.doc()] = getValue(value, type);
> I tried to concatenate the comparables (just to check whether it would work for my case):
> if(retArray[termDocs.doc()] == null) {
>     retArray[termDocs.doc()] = getValue(value, type);
> } else {
>     retArray[termDocs.doc()] =
>         retArray[termDocs.doc()] + " " + getValue(value, type);
> }
> But it didn't work well either: TermEnum does not return terms in the same order as Jackrabbit returns the values of a multivalued field
> (for example, ["qwer", "asdf"] may become ["asdf", "qwer"]). So simple concatenation doesn't help.
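
For readers who want to see the reported effect outside the repository, here is a minimal sketch. It is not Jackrabbit code and not what any of the attached patches do: the class MultiValueSortSketch and its methods are hypothetical names, the documents are plain in-memory lists instead of a Lucene index, and the values are assumed to arrive in the order given in the report. It contrasts a sort key that is overwritten by each later value with a comparison over all values, element by element, which is only one possible ordering.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not Jackrabbit code and not the attached patch.
final class MultiValueSortSketch {

    // Mimics the reported behaviour: each value overwrites the previous one,
    // so only the last value of a document survives as its sort key.
    static String[] lastValueKeys(List<List<String>> docs) {
        String[] keys = new String[docs.size()];
        for (int doc = 0; doc < docs.size(); doc++) {
            for (String value : docs.get(doc)) {
                keys[doc] = value; // overwrites any earlier value
            }
        }
        return keys;
    }

    // Sorts document numbers by the single surviving key (documents without values first).
    static Integer[] orderByLastValue(List<List<String>> docs) {
        String[] keys = lastValueKeys(docs);
        Integer[] order = docNumbers(docs.size());
        Arrays.sort(order, Comparator.comparing(
                (Integer doc) -> keys[doc],
                Comparator.nullsFirst(Comparator.<String>naturalOrder())));
        return order;
    }

    // Keeps every value and sorts document numbers by comparing the value lists.
    static Integer[] orderByAllValues(List<List<String>> docs) {
        Integer[] order = docNumbers(docs.size());
        Arrays.sort(order, (a, b) -> compareLists(docs.get(a), docs.get(b)));
        return order;
    }

    // Element-by-element comparison; a list that is a prefix of another sorts first.
    static int compareLists(List<String> left, List<String> right) {
        int common = Math.min(left.size(), right.size());
        for (int i = 0; i < common; i++) {
            int c = left.get(i).compareTo(right.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(left.size(), right.size());
    }

    // Builds the array 0, 1, ..., count - 1 of document numbers.
    private static Integer[] docNumbers(int count) {
        Integer[] docs = new Integer[count];
        for (int i = 0; i < count; i++) {
            docs[i] = i;
        }
        return docs;
    }
}
```

With the three records from the report ("aaaa"; "cccc"; "dddd", "bbbb"), orderByLastValue puts the third record between the other two, as described in the issue, while orderByAllValues keeps the records in the order they were entered. As the report notes, a real index may not return the terms in stored order, so this only illustrates the overwrite, not a fix.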