lucy-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-user] Synonyms with Lucy
Date Thu, 04 Jul 2013 22:52:38 GMT
On Thu, Jul 4, 2013 at 3:09 PM, Nick Wellnhofer <wellnhofer@aevum.de> wrote:
> It's easy to implement user-specified synonyms with a custom Analyzer. All
> you have to do is to map tokens to a synonym with a hash table. You can find
> some information on how to implement your own Analyzer in the mailing list
> archives.

The (non-public) Token class's position increment is also designed to support
multiple terms at the same position.  It defaults to 1, but if you set it to
0, the next term gets put at the same position.
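To make that concrete, here is a minimal Python sketch of the idea, not Lucy code. The `Token` dataclass, `SYNONYMS` table, `expand_synonyms`, and `assign_positions` are all hypothetical names made up for illustration. The one behavior it takes from the description above is that an increment of 1 advances the position, while an increment of 0 leaves the next term at the same position.

```python
from dataclasses import dataclass


@dataclass
class Token:
    # Hypothetical stand-in for a token; not Lucy's Token class.
    text: str
    pos_inc: int = 1  # how far the position advances after this token


# Hypothetical user-supplied synonym table (one-directional).
SYNONYMS = {"car": ["automobile"]}


def expand_synonyms(tokens, synonyms):
    """Emit each token followed by its synonyms, stacked at one position."""
    out = []
    for tok in tokens:
        variants = [tok.text] + synonyms.get(tok.text, [])
        # Every variant except the last gets increment 0, so the variant
        # that follows it lands at the same position.
        for text in variants[:-1]:
            out.append(Token(text, 0))
        # The last variant keeps the original increment.
        out.append(Token(variants[-1], tok.pos_inc))
    return out


def assign_positions(tokens):
    """Give each token the running position, then advance by its increment."""
    pos = 0
    placed = []
    for tok in tokens:
        placed.append((tok.text, pos))
        pos += tok.pos_inc
    return placed


stream = [Token("red"), Token("car"), Token("parked")]
positions = assign_positions(expand_synonyms(stream, SYNONYMS))
print(positions)
# [('red', 0), ('car', 1), ('automobile', 1), ('parked', 2)]
```

Because "car" and "automobile" share position 1, a phrase match on "red automobile" would line up the same way as "red car".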

The advantage of handling the synonym expansion at index time is simpler
queries and less work at search time.
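A toy illustration of that trade-off, again in plain Python rather than Lucy: `build_index` and `search` are hypothetical helpers, and the whitespace tokenizer and synonym table are assumptions made for the example. With the expansion done while indexing, the search side is a single unmodified term lookup.

```python
from collections import defaultdict

# Hypothetical one-directional synonym table.
SYNONYMS = {"car": ["automobile"]}


def build_index(docs, synonyms=None):
    """Build a term -> set-of-doc-ids map, optionally adding synonyms."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
            # Index-time expansion: also file the doc under each synonym.
            for alt in (synonyms or {}).get(term, []):
                index[alt].add(doc_id)
    return index


def search(index, term):
    """Plain single-term lookup; no query rewriting needed."""
    return sorted(index.get(term, set()))


docs = {1: "red car parked", 2: "blue bicycle"}
expanded = build_index(docs, SYNONYMS)
plain = build_index(docs)

hits_expanded = search(expanded, "automobile")  # finds doc 1
hits_plain = search(plain, "automobile")        # finds nothing
print(hits_expanded, hits_plain)
```

The cost is a larger index and a reindex whenever the synonym table changes.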

> Lucy's SnowballStopFilter already supports custom stoplists and could be
> leveraged to map synonyms with just a few changes. What do the Lucy
> developers think about supporting synonyms in core?

I wish that we had completed compiled extension support by now.  This is the
kind of thing that it would be nice to see mature as a separately developed
extension under a different namespace, possibly going through multiple
iterations of API and implementation before taking on the backwards
compatibility burden that comes with putting something in core.

Since we're not there yet, I could see putting something under LucyX.

Marvin Humphrey
