lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Commented: (LUCENE-2554) preflex codec doesn't order terms correctly
Date Thu, 22 Jul 2010 22:21:50 GMT


Robert Muir commented on LUCENE-2554:

The perf issues here really come from our contrived tests. It's good to use _TestUtil.randomUnicodeString,
but it gives you the impression that there is something wrong with this dance, and there really
isn't.

I added _TestUtil.randomRealisticUnicodeString in r966878. You can swap it into some of
these slow tests and see that the contrived random strings are definitely the problem.
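
To make the contrast concrete, here is a rough sketch of two ways to build random test terms. It is not the code from r966878: the class and method names are hypothetical, and it assumes (the comment above does not spell this out) that the slowness comes from terms that mix in many supplementary characters, which real text rarely does.

```java
import java.util.Random;

// Hypothetical illustration only; not Lucene's _TestUtil implementation.
public class RandomTermSketch {

    // "Contrived" terms: every code point is drawn from the whole Unicode
    // range (surrogates skipped), so most of them are supplementary.
    static String anyCodePoints(Random r, int count) {
        StringBuilder sb = new StringBuilder();
        while (count > 0) {
            int cp = r.nextInt(Character.MAX_CODE_POINT + 1);
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                continue; // a lone surrogate is not a valid character
            }
            sb.appendCodePoint(cp);
            count--;
        }
        return sb.toString();
    }

    // "Realistic" terms: every code point comes from one contiguous range
    // [start, end], the way text in a single script would.
    static String oneRange(Random r, int count, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.appendCodePoint(start + r.nextInt(end - start + 1));
        }
        return sb.toString();
    }

    // Counts the code points that need a surrogate pair in UTF-16.
    static int supplementaryCount(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                n++;
            }
            i += Character.charCount(cp);
        }
        return n;
    }

    public static void main(String[] args) {
        Random r = new Random(42);
        String contrived = anyCodePoints(r, 20);
        String realistic = oneRange(r, 20, 0x0400, 0x04FF); // Cyrillic block
        System.out.println("contrived supplementary: " + supplementaryCount(contrived));
        System.out.println("realistic supplementary: " + supplementaryCount(realistic));
    }
}
```

Terms drawn from a single BMP range never contain surrogate pairs, so a test built on them exercises the common path rather than the worst case.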

> preflex codec doesn't order terms correctly
> -------------------------------------------
>                 Key: LUCENE-2554
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Test
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>         Attachments: LUCENE-2554.patch
> The surrogate dance in the preflex codec (which must dynamically remap terms from UTF16
> order to unicode code point order) is buggy.
> To better test it, I want to add a test-only codec, preflexrw, that is able to write
> indices in the pre-flex format.  Then we should also fix tests to randomly pick codecs (including
> preflexrw) so we better test all of our codecs.
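
For readers unfamiliar with why the two orders differ, the following is a minimal sketch, not the preflex codec's code. The class and method names are hypothetical. It relies only on the fact that String.compareTo compares UTF-16 code units, while supplementary characters are encoded with surrogates in the range D800 to DFFF, below BMP characters such as U+FF5E.

```java
// Hypothetical illustration only; not taken from the preflex codec.
public class TermOrderDemo {

    // Compares two strings by Unicode code point rather than by UTF-16 code unit.
    static int compareCodePoints(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";                                // U+FF5E, above the surrogate range
        String supp = new String(Character.toChars(0x10000)); // U+10000, stored as D800 DC00

        // UTF-16 code unit order: 0xFF5E > 0xD800, so bmp sorts after supp.
        System.out.println("UTF-16 order puts bmp after supp: " + (bmp.compareTo(supp) > 0));
        // Code point order: 0xFF5E < 0x10000, so bmp sorts before supp.
        System.out.println("code point order puts bmp before supp: " + (compareCodePoints(bmp, supp) < 0));
    }
}
```

The two comparisons disagree only when a supplementary character meets a BMP character in the range U+E000 to U+FFFF at the first position where the terms differ. Those are the terms whose relative order has to change when remapping from one order to the other.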

