lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (LUCENE-2074) Use a separate JFlex generated Unicode 4 by Java 5 compatible StandardTokenizer
Date Mon, 16 Nov 2009 22:03:39 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778582#action_12778582 ]

Uwe Schindler edited comment on LUCENE-2074 at 11/16/09 10:01 PM:
------------------------------------------------------------------

bq. Uwe, also, just checking, I don't know JavaCC at all, does it use Unicode properties?
We have a lot of query parsers out there...

I don't know either :)

The only query parser using JFlex is the new one, and the new one should normally not use any Unicode
properties. Can you check the JFlex file? All other query parsers use JavaCC.

*edit*

No query parser uses JFlex at all. But WikipediaTokenizer uses JFlex...

      was (Author: thetaphi):
    bq. Uwe, also, just checking, I don't know JavaCC at all, does it use Unicode properties?
We have a lot of query parsers out there...

I don't know either :)

The only query parser using JFlex is the new one, and the new one should normally not use any Unicode
properties. Can you check the JFlex file? All other query parsers use JavaCC.

> Use a separate JFlex generated Unicode 4 by Java 5 compatible StandardTokenizer
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-2074
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2074
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 3.0
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.0
>
>         Attachments: jflexwarning.patch, LUCENE-2074.patch, LUCENE-2074.patch
>
>
> The current trunk version of StandardTokenizerImpl was generated with Java 1.4 (according
> to the warning). In Lucene 3.0 we switch to Java 1.5, so we should regenerate the file.
> After regeneration the tokenizer behaves differently for some characters. Because of that,
> we should only use the new TokenizerImpl when Version.LUCENE_30 is used as matchVersion.
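
To illustrate the idea in the issue description, here is a minimal sketch of choosing a scanner implementation based on a match version. It is not Lucene's actual code: the enum, the interface, and both scanner classes are hypothetical stand-ins, and treating versions later than LUCENE_30 as also selecting the regenerated scanner is an assumption that the description does not state.

```java
// Sketch only: every name below is hypothetical and not part of Lucene's API.
public class VersionGatedTokenizerSketch {

    /** Stand-in for a compatibility constant such as Version.LUCENE_30. */
    enum MatchVersion { LUCENE_24, LUCENE_29, LUCENE_30 }

    /** Stand-in for a JFlex-generated scanner implementation. */
    interface ScannerImpl {
        String describe();
    }

    /** Represents the old scanner, kept so existing indexes tokenize as before. */
    static final class LegacyScanner implements ScannerImpl {
        public String describe() { return "legacy scanner (generated with Java 1.4)"; }
    }

    /** Represents the scanner regenerated under Java 5. */
    static final class RegeneratedScanner implements ScannerImpl {
        public String describe() { return "regenerated scanner (generated with Java 5)"; }
    }

    /**
     * Picks the scanner for the requested match version.
     * Assumption: LUCENE_30 and anything declared after it get the new scanner.
     */
    static ScannerImpl select(MatchVersion matchVersion) {
        if (matchVersion.compareTo(MatchVersion.LUCENE_30) >= 0) {
            return new RegeneratedScanner();
        }
        return new LegacyScanner();
    }

    public static void main(String[] args) {
        // Print which scanner each match version would receive.
        for (MatchVersion v : MatchVersion.values()) {
            System.out.println(v + " -> " + select(v).describe());
        }
    }
}
```

The point of the gate is backwards compatibility: callers that pass an older match version keep the old tokenization, so the changed handling of some characters only becomes visible to callers that opt in.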

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

