lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: QueryParser in 3.x
Date Fri, 17 Sep 2010 07:02:43 GMT
On Fri, Sep 17, 2010 at 1:06 AM, Scott Smith <ssmith@mainstreamdata.com> wrote:
> I recently upgraded to Lucene 3.0 and am seeing some new behavior that I don't understand.
 Perhaps someone can explain why.
>
>
>
> I have a custom analyzer.  Part of the analyzer uses the AsciiFoldingFilter.  If I
run a word with an umlaut through that analyzer using the AnalyzerDemo code in LIA2, as expected,
I get the same word except that the umlauted letter is now a simple ascii letter (no umlaut).
 That's what I would expect and want.
>
>
>
> If I create a Queryparser using the call "new QueryParser(LUCENE_30, "body", myAnalyzer)
and then call the parse() method passing the same word, I can see that the query parser has
not removed the umlaut.  The string it has is "+body: Europabörsen".
>
This seems to be an issue with your analyzer rather than with the
QueryParser. Since QueryParser didn't really change its behavior in
3.0 except of some default values. Can you provide more info what you
did with your analyzer? Did you try running the term with umlaut chars
through your Analyzer / Tokenstream directly? Something like that:

Analyzer a = new MyAnalyzer();
TokenStream stream = a.reusableTokenStream("body", new
StringReader("Europabörsen"));
TermAttribute attr = stream.addAttribute(TermAttribute.class);
while(stream.incrementToken())
  System.out.println(attr.term());

simon
>
>
> I know I had to make a number of changes to the analyzer and the tokenizer to upgrade
to 3.x.  Is there something very different from the 2.x version that I'm likely missing.
>
>
>
> Anyone have any thoughts?
>
>
>
>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message