lucy-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-user] C library:Suggester
Date Tue, 02 May 2017 23:22:19 GMT
On Mon, May 1, 2017 at 3:55 PM, Serkan Mulayim <serkanmulayim@gmail.com> wrote:

> I am using the C library. I would like to get the suggester or autocomplete
> functionality in my library. It needs to return {"hello", "hell", "hellx"}
> when your query is "hell". I feel like I need to be able to read all the
> tokens in the whole index, and return the results based on it. I looked at
> the indexReader for this, but I could not find any useful information. Do
> you think this is possible?

Autosuggestion functionality will need tuning, just like search results.  In
fact, autosuggestion is really a specialized form of search application.  It
could be implemented with a separate index or separate fields.

Say that we only wanted to offer suggestions derived from the `title` field.
Split each title into an array of words.  Then for each word, index every
prefix of at least some minimum length, say three letters.  For the title
`hello world`, you'd get the following tokens:

    hello -> hel hell hello
    world -> wor worl world
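
The token generation above can be sketched in plain C.  This is only an
illustration and does not use the Lucy C API: the function names
`append_prefixes` and `title_prefix_tokens`, the space-separated output
format, and the ASCII-only, split-on-spaces tokenization are all assumptions
made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append every prefix of `word` that is at least `min_len` bytes long to
 * `out`, separating tokens with single spaces.  `*used` tracks how many
 * bytes of `out` are filled (excluding the terminating NUL).
 * Returns 0 on success, -1 if `out` is too small.  (Hypothetical helper.) */
static int
append_prefixes(const char *word, size_t word_len, size_t min_len,
                char *out, size_t out_size, size_t *used)
{
    for (size_t len = min_len; len <= word_len; len++) {
        size_t need = len + (*used > 0 ? 1 : 0);  /* prefix plus separator */
        if (*used + need + 1 > out_size) { return -1; }
        if (*used > 0) { out[(*used)++] = ' '; }
        memcpy(out + *used, word, len);
        *used += len;
        out[*used] = '\0';
    }
    return 0;
}

/* Build the prefix tokens for a whole title, splitting on spaces.
 * Words shorter than `min_len` contribute no tokens.
 * Returns 0 on success, -1 if `out` is too small.  (Hypothetical helper.) */
int
title_prefix_tokens(const char *title, size_t min_len,
                    char *out, size_t out_size)
{
    assert(min_len > 0 && out_size > 0);
    size_t used = 0;
    out[0] = '\0';
    const char *p = title;
    while (*p != '\0') {
        while (*p == ' ') { p++; }            /* skip separators */
        size_t word_len = strcspn(p, " ");    /* length of the next word */
        if (append_prefixes(p, word_len, min_len, out, out_size, &used) != 0) {
            return -1;
        }
        p += word_len;
    }
    return 0;
}
```

Called with the title `hello world` and a minimum length of 3, this fills the
buffer with `hel hell hello wor worl world`, which you could then index in a
dedicated suggestion field.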

Then at search time, perform a search query with every keystroke.

    h -> (no result)
    he -> (no result)
    hel -> "hello world"
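
To make the per-keystroke behavior concrete, here is a small in-memory
stand-in for the lookup.  It is not a Lucy query: the function name
`title_matches_keystrokes` is hypothetical, and it assumes the same
space-separated, ASCII-only tokenization and the same minimum prefix length
that were used at index time.  The point it shows is that the typed string is
matched as an exact term against the stored prefix tokens, so no wildcard
search is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `query` equals one of the prefix tokens that would have
 * been indexed for `title`, i.e. it is at least `min_len` bytes long and is
 * a prefix of some space-separated word in the title. */
bool
title_matches_keystrokes(const char *title, const char *query, size_t min_len)
{
    size_t qlen = strlen(query);
    if (qlen == 0 || qlen < min_len) {
        return false;                         /* nothing this short was indexed */
    }
    const char *p = title;
    while (*p != '\0') {
        while (*p == ' ') { p++; }            /* skip separators */
        size_t word_len = strcspn(p, " ");    /* length of the next word */
        if (qlen <= word_len && memcmp(p, query, qlen) == 0) {
            return true;                      /* query is a prefix of this word */
        }
        p += word_len;
    }
    return false;
}
```

With a minimum length of 3, the queries `h` and `he` match nothing, while
`hel` matches the title `hello world`.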

Once you've got basic functionality running, experiment with minimum token
length, adding Soundex/Metaphone, performing character normalization, etc.

Marvin Humphrey
