lucene-solr-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: EdgeNgramTokenFilter and positions
Date Fri, 07 Sep 2012 00:51:28 GMT
I don't know for sure, but yes, I remember something like this being a problem. Maybe https://issues.apache.org/jira/browse/LUCENE-3907?

Otis 
----
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm 



----- Original Message -----
> From: Walter Underwood <wunder@chegg.com>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Cc: 
> Sent: Wednesday, September 5, 2012 1:51 PM
> Subject: EdgeNgramTokenFilter and positions
> 
> In the analysis page, the n-grams produced by EdgeNgramTokenFilter are at 
> sequential positions. This seems wrong, because an n-gram is associated with a 
> source token at a specific position. It also really messes up phrase matches.
> 
> With the source text "fleen", these positions and tokens are generated:
> 
> 1,fl
> 2,fle
> 3,flee
> 4,fleen
> 
> Is this a known bug? Fixed? I'm running 3.3.
> 
> wunder
> --
> Walter Underwood
> Search Guy
> wunder@chegg.com
> 
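
To make the difference concrete, here is a minimal sketch in plain Python. It is a simulation of the two position schemes, not Lucene or Solr code. The function names are hypothetical, and the minimum gram size of 2 is an assumption inferred from the shortest token ("fl") in the output quoted above.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return the front-edge n-grams of a token, shortest first."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def sequential_positions(tokens, min_gram=2, max_gram=15):
    """Scheme reported in the message: every gram advances the position."""
    out = []
    pos = 0
    for token in tokens:
        for gram in edge_ngrams(token, min_gram, max_gram):
            pos += 1
            out.append((pos, gram))
    return out


def stacked_positions(tokens, min_gram=2, max_gram=15):
    """Scheme the message argues for: grams share their source token's position."""
    out = []
    for pos, token in enumerate(tokens, start=1):
        for gram in edge_ngrams(token, min_gram, max_gram):
            out.append((pos, gram))
    return out


if __name__ == "__main__":
    print(sequential_positions(["fleen"]))
    print(stacked_positions(["fleen"]))
```

With the sequential scheme, a second source token would start at position 5 rather than 2, so a phrase query expecting adjacent terms no longer lines up with the indexed positions. With the stacked scheme, every gram of "fleen" stays at position 1 and the next source token lands at position 2.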
