lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: how to remove the dash
Date Mon, 25 Jun 2012 15:10:38 GMT
My apologies - you are right.

With both ClassicAnalyzer and StandardAnalyzer, "drinks - water" comes
out as "drinks -water" whereas "drinks-water" comes out as "drinks
water", as I'd expected.

I guess this is fixable in JFlex, or I think there is some replace
tokenizer somewhere that can replace character X with character Y e.g.
"-" with " ".  Or pre-process your text/queries with a regexp.  Maybe
someone else has better ideas.
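
A minimal sketch of the regexp pre-processing option, in plain Java with
no Lucene dependency. The class name DashPreprocessor and the method
stripDashes are made up for illustration, and the sketch assumes it is
acceptable to treat every dash as a word separator before the text
reaches the query parser or the indexer.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Lucene. */
final class DashPreprocessor {
    // One or more dashes, together with any whitespace around them.
    private static final Pattern DASHES = Pattern.compile("\\s*-+\\s*");

    /**
     * Replaces each run of dashes (and surrounding whitespace) with a
     * single space, then trims the ends. Note that this also splits
     * hyphenated words such as "drinks-water".
     */
    static String stripDashes(String text) {
        return DASHES.matcher(text).replaceAll(" ").trim();
    }
}
```

The cleaned string would then go to the parser, e.g.
parser.parse(DashPreprocessor.stripDashes(food)) in the code quoted
below. If dashes must survive in some queries, QueryParser.escape()
might be a better fit, though that is untested here.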


--
Ian.


On Mon, Jun 25, 2012 at 3:35 PM,  <listas@alphamatrix.org> wrote:
> As I said, I've tried with StandardAnalyzer (without changes) and
> others (WhitespaceAnalyzer, SimpleAnalyzer, StopAnalyzer).
> Now I've tried with ClassicAnalyzer as well... same result.
>
> Code:
>  ClassicAnalyzer analyzer = new ClassicAnalyzer(Version.LUCENE_36);
>  QueryParser parser = new QueryParser(Version.LUCENE_36, "contents",
>      analyzer);
>  Query query = parser.parse(food);
>  System.out.println("Query: " + query.toString("contents"));
>  TopDocs results = searcher.search(query, 10);
>
> Thanks
> xpete
>
> On Monday, 25 June 2012 14:37:37 Ian Lea wrote:
>> I'm positive that StandardAnalyzer won't change "drinks - water" to
>> "drinks -water".  So it must be something in your code.  Which you
>> don't show us.  Best guess is that the changes you've made to the
>> Flex file have caused the problem.  If you created your tokenizer by
>> copying and modifying StandardTokenizer you could start again or do
>> a diff or something.  Good luck.
>>
>>
>> --
>> Ian.
>>
>> On Mon, Jun 25, 2012 at 2:40 AM,  <listas@alphamatrix.org> wrote:
>> > hi
>> >
>> > I have strings like "drinks - water" and I've read in "Lucene in
>> > Action" that the StandardAnalyzer and other analyzers remove the
>> > "-" from the string, but so far none of them worked... All of them
>> > change my string to something like "drinks -water", so the "-" is
>> > used as a "prohibit operator" and this is a BIG problem for me.
>> >
>> > I'm using Lucene 3.6.
>> > I'm also using my own Analyzer, Filters and a Tokenizer based on
>> > StandardTokenizer, with changes to the Flex file to remove some
>> > other stuff.
>> >
>> > How can I remove the "-"?
>> >
>> > Thanks
>> > xpete
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-user-help@lucene.apache.org

