lucene-java-user mailing list archives

From "Oliver Christ" <ochr...@ebscohost.com>
Subject WFST/Analyzing Suggesters: foreign keys, user-supplied filter, highlighting
Date Tue, 30 Oct 2012 13:40:16 GMT
Hi,

I'm currently researching the use of a WFST suggester on, e.g., book
titles. While our basic use cases are well covered, there seem to be at
least three which aren't:

* The ability to associate a "foreign key" with a string (or rather,
with its final node) in the WFST, in addition to the rank. For example,
I'd like to add "Lucene in Action" with key 1933988177 (the ISBN) and
some rank to the WFST. A completion would then return the completed
string together with the key associated with each entry; that is, final
nodes would get a "key" field (int), which is returned in the
LookupResult. That foreign key could also be used for fast de-duping
(no more string/byte array comparisons).

* When looking up completions, I'd like to be able to specify a filter
which further determines whether a completion should be considered or
not. Assume, for example, that I'm only interested in computer science
books, but can't maintain a separate WFST for each subject area. Given
a completion candidate (represented by its key), the filter would be
called with the key as a parameter to determine whether or not the
candidate should be added to the result queue.

* Highlighting of the completed portions (i.e. explicit markup of
user-provided vs. auto-completed portions of a completion).
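
To make the highlighting request concrete, here is a minimal sketch of the
easy case only, where the suggestion literally starts with the typed text
(ignoring case). The class CompletionHighlighter, the highlight() helper and
the <b> markup are hypothetical names chosen for illustration; none of them
is part of Lucene. With AnalyzingSuggester the match happens on the analyzed
form, so a literal prefix comparison like this one would not be sufficient
there.

```java
public class CompletionHighlighter {

    /**
     * Marks the auto-completed remainder of a suggestion with <b>...</b>
     * when the suggestion starts with the typed text (case-insensitive).
     * If it does not, the suggestion is returned without markup.
     */
    public static String highlight(String typed, String suggestion) {
        // regionMatches returns false if typed is longer than suggestion.
        if (suggestion.regionMatches(true, 0, typed, 0, typed.length())) {
            return suggestion.substring(0, typed.length())
                    + "<b>" + suggestion.substring(typed.length()) + "</b>";
        }
        return suggestion;
    }

    public static void main(String[] args) {
        // prints: Luc<b>ene in Action</b>
        System.out.println(highlight("luc", "Lucene in Action"));
    }
}
```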

 

What's your take on the above? What would be the best way to achieve
this? We want to use AnalyzingSuggester, so the above applies
particularly to it.

My current research indicates the following:

* There may be workarounds for the "foreign key" use case. It seems
that many data structures would be affected by storing a user-provided
key with final nodes, so that may not be a viable path. It may be
possible to encode the foreign key in the transducer's output instead.

* Adding a filter/predicate to the AnalyzingSuggester is simple, as
TopNSearcher<> already uses acceptResult() to test whether a completion
should be added; that can be overridden in a derived searcher class
which simply calls the predicate. Ideally the suggesters would obtain
the searcher from some kind of factory (instead of hardwiring it).

* I haven't found a simple solution for the highlighting yet,
particularly when using AnalyzingSuggester (where it's non-trivial).
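
The first two points could be sketched roughly as follows. This is a
self-contained illustration under stated assumptions, not Lucene code: it
assumes the transducer output can be treated as a non-negative long cost
where lower is better, and it packs the cost into the high 32 bits and the
foreign key into the low 32 bits so that ordering by the packed value still
orders by cost first. KeyedSuggestSketch, pack(), TopN and FilteredTopN are
hypothetical names; TopN is only a stand-in that mimics the idea of an
overridable acceptResult() hook and is not Lucene's TopNSearcher.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

public class KeyedSuggestSketch {

    /** Packs a non-negative cost (high bits) and a foreign key (low bits). */
    public static long pack(int cost, int key) {
        if (cost < 0) {
            throw new IllegalArgumentException("cost must be non-negative");
        }
        return ((long) cost << 32) | (key & 0xFFFFFFFFL);
    }

    public static int cost(long packed) {
        return (int) (packed >>> 32);
    }

    public static int key(long packed) {
        return (int) packed;
    }

    /** Stand-in searcher (assumption): returns the n cheapest accepted outputs. */
    public static class TopN {
        private final int n;

        public TopN(int n) {
            this.n = n;
        }

        /** Hook deciding whether a candidate may enter the result list. */
        protected boolean acceptResult(long packedOutput) {
            return true;
        }

        public List<Long> search(List<Long> candidates) {
            List<Long> sorted = new ArrayList<>(candidates);
            sorted.sort(Comparator.naturalOrder()); // cheapest first
            List<Long> results = new ArrayList<>();
            for (long candidate : sorted) {
                if (results.size() == n) {
                    break;
                }
                if (acceptResult(candidate)) {
                    results.add(candidate);
                }
            }
            return results;
        }
    }

    /** Derived searcher that delegates acceptance to a predicate on the key. */
    public static class FilteredTopN extends TopN {
        private final IntPredicate keyFilter;

        public FilteredTopN(int n, IntPredicate keyFilter) {
            super(n);
            this.keyFilter = keyFilter;
        }

        @Override
        protected boolean acceptResult(long packedOutput) {
            return keyFilter.test(key(packedOutput));
        }
    }

    public static void main(String[] args) {
        List<Long> candidates =
                Arrays.asList(pack(2, 1933988177), pack(1, 42), pack(3, 7));
        // Hypothetical filter: reject the entry whose key is 42.
        TopN searcher = new FilteredTopN(2, k -> k != 42);
        for (long out : searcher.search(candidates)) {
            System.out.println("cost=" + cost(out) + " key=" + key(out));
        }
    }
}
```

Running main prints the entries with keys 1933988177 and 7, in that order;
the cheapest candidate (key 42) is rejected by the predicate before it
reaches the result list. Whether packing the key into the output interacts
well with how the real FST shares outputs along paths would still need to be
checked.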

 

Thanks a lot!

Cheers, Oli