My apologies - you are right.
With both ClassicAnalyzer and StandardAnalyzer, "drinks - water" comes
out as "drinks -water" whereas "drinks-water" comes out as "drinks
water", as I'd expected.
This is probably fixable in the JFlex grammar. Alternatively, I think
there is a replacing char filter or tokenizer somewhere that can map
character X to character Y, e.g. "-" to " ". Or pre-process your
text/queries with a regexp. Maybe someone else has better ideas.
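
A minimal sketch of the regexp pre-processing option, using only the
JDK. The class and method names (HyphenPreprocessor, stripHyphens) are
made up for illustration and are not part of Lucene. It assumes you
never want "-" to act as the prohibit operator, since it removes every
hyphen before the string reaches QueryParser:

```java
import java.util.regex.Pattern;

public class HyphenPreprocessor {
    // A hyphen plus any whitespace around it; hypothetical helper, not a Lucene API.
    private static final Pattern HYPHEN = Pattern.compile("\\s*-\\s*");

    // Replace each hyphen (and surrounding whitespace) with a single space,
    // so the query parser never sees a "-" it could read as "prohibit".
    public static String stripHyphens(String text) {
        return HYPHEN.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(stripHyphens("drinks - water"));
        System.out.println(stripHyphens("drinks-water"));
    }
}
```

You would then call parser.parse(stripHyphens(food)) instead of
parser.parse(food), and apply the same step to the text at index time
if the hyphens matter there. If you want to keep the other query syntax
out of play as well, the static QueryParser.escape(String) method may
also be worth a look.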
--
Ian.
On Mon, Jun 25, 2012 at 3:35 PM, <listas@alphamatrix.org> wrote:
> As I said, I've tried StandardAnalyzer (without changes) and others
> (WhitespaceAnalyzer, SimpleAnalyzer, StopAnalyzer).
> Now I've tried ClassicAnalyzer as well... same result.
>
> Code:
> ClassicAnalyzer analyzer = new ClassicAnalyzer(Version.LUCENE_36);
> QueryParser parser = new QueryParser(Version.LUCENE_36, "contents",
> analyzer);
> Query query = parser.parse(food);
> System.out.println("Query: " + query.toString("contents"));
> TopDocs results = searcher.search(query, 10);
>
> Thanks
> xpete
>
> On Monday, 25 June 2012 14:37:37, Ian Lea wrote:
>> I'm positive that StandardAnalyzer won't change "drinks - water" to
>> "drinks -water". So it must be something in your code. Which you
>> don't show us. Best guess is that the changes you've made to the
>> Flex file have caused the problem. If you created your tokenizer by
>> copying and modifying StandardTokenizer you could start again or do
>> a diff or something. Good luck.
>>
>>
>> --
>> Ian.
>>
>> On Mon, Jun 25, 2012 at 2:40 AM, <listas@alphamatrix.org> wrote:
>> > hi
>> >
>> > I have strings like "drinks - water" and I've read in "Lucene in
>> > Action" that the StandardAnalyzer and other analyzers remove the "-"
>> > from the string, but so far none of them have worked. All of them
>> > change my string to something like "drinks -water", so the "-" is
>> > used as a "prohibit operator", and this is a BIG problem for me.
>> >
>> > I'm using Lucene 3.6.
>> > I'm also using my own Analyzer, Filters and a Tokenizer based on
>> > StandardTokenizer, with changes to the Flex file to remove some
>> > other stuff.
>> >
>> > How can I remove the "-"?
>> >
>> > Thanks
>> > xpete
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-user-help@lucene.apache.org