lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-7537) Add multi valued field support to index sorting
Date Fri, 11 Nov 2016 18:30:59 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15657742#comment-15657742 ]

Michael McCandless commented on LUCENE-7537:
--------------------------------------------

Thanks [~jim.ferenczi], this looks cleaner!

In {{Lucene62SegmentInfoFormat.java}}, when we throw
{{IllegalArgumentException}}, can we change the message to include the
whole {{sortField}}, not just its {{.getType()}}?

I think you need to fix {{SimpleTextCodec}} too?  I hit this failure:

{noformat}
   [junit4] Suite: org.apache.lucene.codecs.simpletext.TestSimpleTextSegmentInfoFormat
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestSimpleTextSegmentInfoFormat
-Dtests.method=testSort -Dtests.seed=61D2298FBC9DEB3E -Dtests.locale=el-CY -Dtests.timezone=America/Indiana/Petersburg
-Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.01s J0 | TestSimpleTextSegmentInfoFormat.testSort <<<
   [junit4]    > Throwable #1: java.lang.IllegalStateException: Unexpected sort type: CUSTOM
   [junit4]    > 	at __randomizedtesting.SeedInfo.seed([61D2298FBC9DEB3E:303701A677D0F12E]:0)
   [junit4]    > 	at org.apache.lucene.codecs.simpletext.SimpleTextSegmentInfoFormat.write(SimpleTextSegmentInfoFormat.java:373)
   [junit4]    > 	at org.apache.lucene.index.BaseSegmentInfoFormatTestCase.testSort(BaseSegmentInfoFormatTestCase.java:268)
   [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> NOTE: leaving temporary files on disk at: /l/jim/lucene/build/codecs/test/J0/temp/lucene.codecs.simpletext.TestSimpleTextSegmentInfoFormat_61D22
{noformat}

I think the new {{CorruptIndexException}}s thrown in
{{Lucene62SegmentInfoFormat.java}} have the wrong message.
Shouldn't it be something like {{"invalid SortedSetSelector type: " + type}}?
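
For illustration, a minimal sketch of the kind of message suggested here. It is not the code from the patch: the {{SelectorTypeDecoder}} class and the byte-to-type mapping are hypothetical, and a plain {{IOException}} stands in for Lucene's {{CorruptIndexException}} so the snippet compiles without Lucene on the classpath.

```java
import java.io.IOException;

// Hypothetical stand-in for the reader side of a segment info format.
class SelectorTypeDecoder {
    // Same constant names as Lucene's SortedSetSelector.Type.
    enum SelectorType { MIN, MAX, MIDDLE_MIN, MIDDLE_MAX }

    // Assumed on-disk encoding: one small integer per selector type.
    static SelectorType decode(int type) throws IOException {
        switch (type) {
            case 0: return SelectorType.MIN;
            case 1: return SelectorType.MAX;
            case 2: return SelectorType.MIDDLE_MIN;
            case 3: return SelectorType.MIDDLE_MAX;
            default:
                // Name the thing that was actually invalid, and include its value.
                throw new IOException("invalid SortedSetSelector type: " + type);
        }
    }
}
```

The point is only that the message names the value that failed to decode, so a corrupt file can be told apart from an unsupported sort field.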

Can you bump the version value in {{Lucene62SegmentInfoFormat.java}},
and set {{VERSION_CURRENT}} to the new version?  We need to do this
whenever we make any index format change, so that if e.g. an old Lucene
version tries to read a newer index (written with this change) it
sees an {{IndexFormatTooNewException}} and not a
{{CorruptIndexException}}.
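
A rough sketch of why the bump matters, under stated assumptions: the class, the constant {{VERSION_MULTI_SORT}}, and both exception types below are hypothetical stand-ins, not Lucene's code. In Lucene itself I believe the equivalent range check happens when the file header is validated (e.g. via {{CodecUtil.checkIndexHeader}} with a min and max version).

```java
import java.io.IOException;

// Hypothetical stand-in for a format's header version check.
class VersionCheckSketch {
    static final int VERSION_START = 0;
    static final int VERSION_MULTI_SORT = 1;               // assumed name for the bumped version
    static final int VERSION_CURRENT = VERSION_MULTI_SORT; // writers always write this

    static class FormatTooNewException extends IOException {
        FormatTooNewException(int actual, int max) {
            super("format version " + actual + " is newer than supported maximum " + max);
        }
    }

    static class FormatTooOldException extends IOException {
        FormatTooOldException(int actual, int min) {
            super("format version " + actual + " is older than supported minimum " + min);
        }
    }

    // A reader accepts only versions in [min, max]; anything else fails
    // with a precise error before any field data is parsed.
    static int checkVersion(int actual, int min, int max) throws IOException {
        if (actual < min) throw new FormatTooOldException(actual, min);
        if (actual > max) throw new FormatTooNewException(actual, max);
        return actual;
    }
}
```

An old reader that only knows {{VERSION_START}} would call {{checkVersion(fileVersion, 0, 0)}}; given a file written at version 1 it fails with the "too new" error. Without the bump, the same reader would accept the header and then trip over the unknown sort-field bytes, which surfaces as corruption.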



> Add multi valued field support to index sorting
> -----------------------------------------------
>
>                 Key: LUCENE-7537
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7537
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Ferenczi Jim
>         Attachments: LUCENE-7537.patch, LUCENE-7537.patch
>
>
> Today index sorting can be done on single-valued fields through NumericDocValues (for
> numerics) and SortedDocValues (for strings).
> I'd like to add the ability to sort on multi-valued fields. Since index sorting does
> not accept custom comparators, we could just take the minimum value of each document for an
> ascending sort and the maximum value for a descending sort.
> This way we could handle all cases instead of throwing an exception during a merge when
> we encounter multi-valued DVs.
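
The selection rule described in the issue can be sketched as follows. This is a self-contained illustration, not the patch: {{MultiValuedSortSketch}} and its methods are hypothetical names, values are plain {{long}} arrays rather than doc values, and every document is assumed to have at least one value.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration of min-for-ascending / max-for-descending.
class MultiValuedSortSketch {
    // Collapse a document's values to the single value it is sorted by.
    static long sortValue(long[] values, boolean reverse) {
        if (values.length == 0) {
            throw new IllegalArgumentException("document has no values");
        }
        long best = values[0];
        for (long v : values) {
            best = reverse ? Math.max(best, v) : Math.min(best, v);
        }
        return best;
    }

    // Return doc ids in index-sort order; ties keep their original order
    // because Arrays.sort on object arrays is stable.
    static Integer[] sortedDocOrder(long[][] docValues, boolean reverse) {
        Integer[] order = new Integer[docValues.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Comparator<Integer> cmp =
            Comparator.comparingLong(doc -> sortValue(docValues[doc], reverse));
        Arrays.sort(order, reverse ? cmp.reversed() : cmp);
        return order;
    }
}
```

For documents with values {5, 1, 9}, {3, 4} and {2, 20}, an ascending sort compares the minimums 1, 3, 2 and a descending sort compares the maximums 9, 4, 20, so the two directions are not simply mirror images of each other.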



