lucene-java-user mailing list archives

From hans meiser <>
Subject Re: whats the correct way to do normalisation?
Date Mon, 06 Nov 2006 16:27:41 GMT
  > Did you take a look at IsoLatin1AccentFilter ?
  It does nearly what I need, but not perfectly.
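   public final Token next() throws java.io.IOException {
     // Pull the next token from the wrapped stream.
     final Token t = input.next();
     if (t == null)
       return null;
     // Return a token with the filtered term text but the original offsets.
     return new Token(removeAccents(t.termText()), t.startOffset(), t.endOffset(), t.type());
   }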
  Here, too, a new Token is created. My question is: why is the end offset not
  corrected for the newly created token? Sometimes the new term text is longer than the original.
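
  To make the question concrete, here is a minimal, self-contained sketch with no Lucene dependency. The class OffsetDemo and its removeAccents() are hypothetical stand-ins covering only a few characters, not the filter's actual code. It shows a filtered term that is longer than the span given by its offsets. If the offsets are meant to index the original text (for example for highlighting), which is my assumption here, then leaving them unchanged still selects the right characters in the source.

```java
class OffsetDemo {
    // Hypothetical stand-in for the filter's removeAccents(); handles only a few characters.
    static String removeAccents(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\u00C6': out.append("AE"); break; // one char becomes two
                case '\u00E6': out.append("ae"); break; // one char becomes two
                case '\u00DF': out.append("ss"); break; // one char becomes two
                case '\u00E9': out.append('e');  break; // length unchanged
                default:       out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "la Stra\u00DFe";   // original text as seen by the tokenizer
        int start = 3;                    // offset of 'S' in the original text
        int end = 9;                      // offset just past the final 'e'

        String original = text.substring(start, end);
        String term = removeAccents(original);

        // The filtered term is longer than the span described by the offsets.
        System.out.println("term length  = " + term.length());
        System.out.println("offset span  = " + (end - start));
        // The unchanged offsets still select the original word in the source text.
        System.out.println("source slice = " + original);
    }
}
```

  With "corrected" offsets (end = start + term.length()), the slice would run past the end of the original word, so the two lengths cannot both be kept in sync with the source text.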
  Complete code link:

