lucene-java-user mailing list archives

From Steven A Rowe <sar...@syr.edu>
Subject RE: how to remove the dash
Date Mon, 25 Jun 2012 19:28:06 GMT
I added the following to both TestStandardAnalyzer and TestClassicAnalyzer in branches/lucene_solr_3_6/,
and it passed in both cases:

  public void testWhitespaceHyphenWhitespace() throws Exception {
    BaseTokenStreamTestCase.assertAnalyzesTo
      (a, "drinks - water", new String[]{"drinks", "water"});
  }

So I'm not seeing the same behavior as you guys - the hyphen is not part of any emitted token.

Steve

-----Original Message-----
From: listas@alphamatrix.org [mailto:listas@alphamatrix.org] 
Sent: Monday, June 25, 2012 11:33 AM
To: java-user@lucene.apache.org
Subject: Re: how to remove the dash

On Monday, 25 June 2012 16:10:38, Ian Lea wrote:
> My apologies - you are right.
> 
> With both ClassicAnalyzer and StandardAnalyzer, "drinks - water" comes
> out as "drinks -water" whereas "drinks-water" comes out as "drinks
> water", as I'd expected.
> 
> I guess this is fixable in JFlex, or I think there is some replace
> tokenizer somewhere that can replace character X with character Y, e.g.
> "-" with " ".  Or pre-process your text/queries with a regexp.  Maybe
> someone else has better ideas.
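
The regexp pre-processing idea above could be sketched roughly as follows. This is only an illustration using plain java.util.regex, not Lucene code: the class name HyphenPreprocessor and the method stripStandaloneHyphens are hypothetical, and the sketch assumes that only hyphens surrounded by whitespace on both sides should be dropped, while hyphens inside or attached to a word are left for the analyzer to handle.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: cleans text before it reaches the analyzer.
final class HyphenPreprocessor {

    // One or more whitespace characters followed by one or more
    // "hyphen run + whitespace" groups, e.g. " - " or " - -- ".
    private static final Pattern STANDALONE_HYPHENS =
        Pattern.compile("\\s+(?:-+\\s+)+");

    private HyphenPreprocessor() {}

    // Replaces each whitespace-delimited hyphen run with a single space.
    // Hyphens that touch a word character ("drinks-water", "drinks -water")
    // are deliberately left unchanged.
    static String stripStandaloneHyphens(String text) {
        return STANDALONE_HYPHENS.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(stripStandaloneHyphens("drinks - water"));
        System.out.println(stripStandaloneHyphens("drinks-water"));
    }
}
```

The same call would have to be applied to both the indexed text and the query strings, otherwise the two sides could tokenize differently.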

I guess the same... I am already using my own Tokenizer (based on
StandardTokenizer) to mark some strings for replacement or removal, and I am using one filter
to replace them and another filter to remove them. I tried to do the same with the "-", but it didn't work:
I can't even mark the "-".
I am avoiding pre-processing.
I am hoping that somebody can tell me what I can change in the StandardTokenizer JFlex grammar to change
this behavior.

Thanks

> 
> 
> --
> Ian.




> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org


