lucene-solr-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: InvalidTokenOffsetsException in conjunction with highlighting and ICU folding and edgeNgrams
Date Mon, 12 Dec 2011 12:21:23 GMT
On Mon, Dec 12, 2011 at 5:18 AM, Max <nash12@gmail.com> wrote:

> The end offset remains 11 even after folding and transforming "æ" to
> "ae", which seems wrong to me.

End offsets refer to the *original text* so this is correct.

What is wrong is EdgeNGramsFilter. See how it turns that 11 into a 12?
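
To make the point concrete, here is a minimal sketch in plain Python, not Lucene code. It assumes a hypothetical token "encyclopædia" (not the text from the original report), a folding step that rewrites the term but keeps the offsets of the original text, and a naive edge n-gram step that derives each end offset from the gram length. The function names are made up for illustration.

```python
def fold(token):
    """Fold 'æ' to 'ae' in the term; offsets still point into the original text."""
    text, start, end = token
    return (text.replace("æ", "ae"), start, end)


def naive_edge_ngrams(token, min_gram=1, max_gram=20):
    """Hypothetical filter that computes end offsets as start + gram length."""
    text, start, _ = token
    upper = min(max_gram, len(text))
    return [(text[:n], start, start + n) for n in range(min_gram, upper + 1)]


original = "encyclopædia"                 # 12 characters in the original text
token = (original, 0, len(original))      # (term, start offset, end offset)
folded = fold(token)                      # term grows to 13 chars, end offset stays 12
grams = naive_edge_ngrams(folded)

# Grams whose end offset points past the end of the original text.
bad = [g for g in grams if g[2] > len(original)]
print(folded)
print(bad)
```

The folded token keeps end offset 12, which is correct. The longest gram gets end offset 13, one past the end of the original text, and an offset like that is the kind of thing the highlighter rejects with InvalidTokenOffsetsException.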

>
> I also stumbled upon https://issues.apache.org/jira/browse/LUCENE-1500
> which seems like a similar issue.
>
> Is there a workaround for that problem or is the field configuration wrong?

For now, don't use EdgeNGrams.

-- 
lucidimagination.com
