jackrabbit-users mailing list archives

From Alexander Klimetschek <aklim...@adobe.com>
Subject Re: AutoCompelete
Date Thu, 25 Nov 2010 08:48:53 GMT
On 24.11.10 22:29, "Ard Schrijvers" <a.schrijvers@onehippo.com> wrote:

>On Wed, Nov 24, 2010 at 10:03 PM, Zhou Wu <zwu_ca@yahoo.com> wrote:
>> I'm trying to do something like
>> org.apache.jackrabbit.core.query.lucene.spell.SpellChecker for
>> autocomplete: when the user types in the search input box, a list of
>> words (phrases) pops up, like Google's suggestions. I searched the web
>> and found
>> http://stackoverflow.com/questions/120180/how-to-do-query-auto-completion-suggestions-in-lucene
>> which looks helpful, but I don't know how to get it to work with
>> Jackrabbit. Could anyone give some tips? Thanks,
>
>Afaiu, Spellchecker wouldn't fit auto completion. Auto completion is
>about suggesting existing terms in the index after you typed, say
>'jack'.

Exactly, spellcheck is about getting from "jeck" to "jack", but
autocompletion (in its hardest form) is about getting from typing a "j"
to a list like "jack, jupiter, jelly, january, ...".

Also, there are different use cases regarding what to show in
auto-completion (always showing all possibilities doesn't work ;-)), and
it is language- and region-dependent.

Since those few-letter inputs like "j" will be the most frequent ones, as
people are typing words one by one, you want to look up those terms from a
pre-built index as directly as possible. For this, you can have a node
structure like "j/ja/jac" in the repository. On each level there is a
multi-value property containing the auto-completions/suggestions you want
to show (10 is a good number, for example; it is what Google uses).
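
To make the "j/ja/jac" layout concrete, here is a minimal sketch in plain
Java. It only models the idea: the class name PrefixIndexSketch, the
in-memory map standing in for the repository, and the lowercasing of input
are assumptions made for illustration, not Jackrabbit API. In a real
repository, each list would presumably be written as a multi-value property
on the node at that path (for example with JCR's
Node.setProperty(String, String[])).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: an in-memory stand-in for the repository structure.
class PrefixIndexSketch {
    // Cap on suggestions per level; 10 is the number suggested in the text.
    static final int MAX_SUGGESTIONS = 10;

    // Maps a node path such as "j/ja/jac" to the suggestions stored there.
    private final Map<String, List<String>> suggestionsByPath = new TreeMap<>();

    // Turns a typed prefix into a nested path: "jac" -> "j/ja/jac".
    static String pathFor(String prefix) {
        StringBuilder path = new StringBuilder();
        for (int i = 1; i <= prefix.length(); i++) {
            if (i > 1) {
                path.append('/');
            }
            path.append(prefix, 0, i);
        }
        return path.toString();
    }

    // Stores at most MAX_SUGGESTIONS entries at the path for this prefix.
    void store(String prefix, List<String> suggestions) {
        int n = Math.min(MAX_SUGGESTIONS, suggestions.size());
        suggestionsByPath.put(pathFor(prefix),
                new ArrayList<>(suggestions.subList(0, n)));
    }

    // Direct lookup by path; lowercasing the input is an assumption.
    List<String> lookup(String typed) {
        String path = pathFor(typed.toLowerCase(Locale.ROOT));
        return suggestionsByPath.getOrDefault(path, List.of());
    }
}
```

The point of the layout is that a lookup is a single path read with no
query involved, which is what keeps the very frequent one- and two-letter
inputs cheap.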

How this index is built in the first place depends on the use case. For
example, Google search shows you terms that are currently popular, so
they probably update that index based on query statistics, maybe once or
twice a day. To start, you can use a dictionary, filter out stop words
like "the", "and" etc., and build that index automatically. Then you only
get single words, whereas Google also shows full searches, like
"jack wolfskin". And there are probably many other sources you can build
such an index from.
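
The dictionary-based starting point could be sketched roughly as follows,
again in plain Java. The class name DictionaryIndexBuilder, the parameters
maxPrefixLength and maxSuggestions, and the "first words in dictionary
order win" ranking are all assumptions for illustration; a popularity-based
index would need real query statistics to rank by.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: builds prefix -> suggestions from a word list.
class DictionaryIndexBuilder {
    static Map<String, List<String>> build(List<String> words,
                                           Set<String> stopWords,
                                           int maxPrefixLength,
                                           int maxSuggestions) {
        Map<String, List<String>> index = new TreeMap<>();
        for (String raw : words) {
            String word = raw.trim().toLowerCase(Locale.ROOT);
            // Skip blanks and stop words such as "the" and "and".
            if (word.isEmpty() || stopWords.contains(word)) {
                continue;
            }
            int limit = Math.min(maxPrefixLength, word.length());
            // Register the word under each of its prefixes: j, ja, jac, ...
            for (int len = 1; len <= limit; len++) {
                List<String> bucket = index.computeIfAbsent(
                        word.substring(0, len), k -> new ArrayList<>());
                // Keep only the first maxSuggestions distinct words.
                if (bucket.size() < maxSuggestions && !bucket.contains(word)) {
                    bucket.add(word);
                }
            }
        }
        return index;
    }
}
```

Each key ("j", "ja", "jac") corresponds to one level of the "j/ja/jac"
structure, so the resulting map could be written into the repository level
by level and rebuilt whenever the source data changes.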

Hope that helps,
Alex

-- 
Alexander Klimetschek
Developer // Adobe (Day) // Berlin - Basel




