lucene-java-user mailing list archives

From "Sarah Hunter" <corian...@gmail.com>
Subject Re: StandardAnalyzer Problem with Apostrophes
Date Tue, 14 Nov 2006 15:51:50 GMT
That was my first thought as well, but it looks like APOSTROPHE is
already the rule I want, as you can see from StandardAnalyzer.jj:

-------------------
TOKEN : {					  // token patterns

  // basic word: a sequence of digits & letters
  <ALPHANUM: (<LETTER>|<DIGIT>|<KOREAN>)+ >

  // internal apostrophes: O'Reilly, you're, O'Reilly's
  // use a post-filter to remove possessives
| <APOSTROPHE: <ALPHA> ("'" <ALPHA>)+ >
-------------------

It really looks like it should work for ' rather than `, but it does not.
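
One way to sanity-check the rule in isolation is to approximate it with a regular expression. The sketch below is not Lucene's tokenizer: ApostropheCheck and isApostropheToken are hypothetical names, and it assumes <ALPHA> means one or more letters, since that definition is not part of the excerpt above. It only shows which inputs the rule as written would accept as a single token.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: approximates the APOSTROPHE rule
//   <ALPHA> ("'" <ALPHA>)+
// with a regular expression.
final class ApostropheCheck {

    // Assumption: <ALPHA> is one or more letters; \p{L} stands in for <LETTER>.
    private static final String ALPHA = "\\p{L}+";

    // Only the ASCII apostrophe (U+0027) appears in the rule.
    private static final Pattern APOSTROPHE =
            Pattern.compile(ALPHA + "(?:'" + ALPHA + ")+");

    // True if the whole input would form one APOSTROPHE token under this approximation.
    static boolean isApostropheToken(String text) {
        return APOSTROPHE.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "O'Reilly", "you're", "O'Reilly's",  // ASCII apostrophe
            "O`Reilly",                          // backtick (U+0060)
            "O\u2019Reilly",                     // right single quotation mark (U+2019)
            "Reilly"                             // no apostrophe at all
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isApostropheToken(s));
        }
    }
}
```

If the rule accepts a word here but the analyzer still splits it, the cause is more likely a later step (a filter, or a different character in the input than expected) than the token pattern itself.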

Thanks for the reply! Hopefully you or someone else can point out
what's going on or where I'm going wrong.
Sarah

On 11/14/06, Karel Tejnora <karel@tejnora.cz> wrote:
> The apostrophe is recognized as part of a word; StandardAnalyzer is
> mostly English-oriented.
> One way around it is to swap the apostrophes: the "normal" one for an unusual one.
>
> StandardAnalyzer.java, lines 40-44:
>
> APOSTROPHE:
>       token = jj_consume_token(APOSTROPHE);
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org