lucene-solr-user mailing list archives

From Walter Underwood <wun...@chegg.com>
Subject EdgeNgramTokenFilter and positions
Date Wed, 05 Sep 2012 17:51:06 GMT
On the analysis page, the n-grams produced by EdgeNGramTokenFilter are at sequential positions.
This seems wrong, because every n-gram comes from one source token at one specific position.
It also really messes up phrase matches, since grams from adjacent source tokens no longer end up at adjacent positions.

With the source text "fleen", these positions and tokens are generated:

1,fl
2,fle
3,flee
4,fleen
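
To make the difference concrete, here is a small sketch in plain Python. It is not Lucene code, only a model of the two position schemes. The function names, the min gram size of 2 (guessed from the output above starting at "fl"), the max gram size, and the second word "glorp" are all made up for illustration.

```python
def edge_ngrams(token, position, min_gram=2, max_gram=15, stacked=True):
    """Return (position, gram) pairs for one source token.

    stacked=True  : every gram keeps the source token's position.
    stacked=False : each gram advances the position by one, as reported above.
    min_gram and max_gram are assumed values, not taken from any real config.
    """
    grams = [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]
    if stacked:
        return [(position, g) for g in grams]
    return [(position + i, g) for i, g in enumerate(grams)]


def analyze(text, stacked):
    """Whitespace-tokenize text and emit edge n-grams with positions from 1."""
    out = []
    pos = 1
    for word in text.split():
        grams = edge_ngrams(word, pos, stacked=stacked)
        out.extend(grams)
        # The next source token starts one past the last position used.
        pos = grams[-1][0] + 1 if grams else pos + 1
    return out


def phrase_adjacent(tokens, first, second):
    """True if `first` occurs at some position p and `second` at p + 1."""
    first_positions = {p for p, g in tokens if g == first}
    return any(g == second and p - 1 in first_positions for p, g in tokens)


sequential = analyze("fleen glorp", stacked=False)
stacked = analyze("fleen glorp", stacked=True)

print(sequential)
print(stacked)
print(phrase_adjacent(sequential, "fleen", "glorp"))
print(phrase_adjacent(stacked, "fleen", "glorp"))
```

In this model the sequential scheme puts "fleen" at position 4 and "glorp" at position 8, so an exact phrase check for the two words fails. The stacked scheme puts them at positions 1 and 2, and the check passes.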

Is this a known bug? Fixed? I'm running 3.3.

wunder
--
Walter Underwood
Search Guy
wunder@chegg.com



