lucene-java-user mailing list archives

From karl wettin <karl.wet...@gmail.com>
Subject Re: autocomplete with multiple terms
Date Thu, 22 Feb 2007 10:14:58 GMT

On 22 Feb 2007, at 10:09, Martin Braun wrote:

> the only thing I have found in the list before concerning this subject
> is http://issues.apache.org/jira/browse/LUCENE-625, but I'm not sure if
> it does the things I want.


> I am not sure if we get enough queries for a search over an index based
> on the user queries.

If the content of your corpus is static enough, then time is the
friend that will enable you to gather enough user queries to build the
suggestion data set.

Otherwise you have to produce simulated user queries by reducing your
data set to the most common information. Markov chains, or the top n
paths of terms found with Dijkstra or similar, could be an easy way
out. You can also start looking at the documents people choose to
inspect, and use these as the base for phrase training.
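
A minimal sketch of the Markov chain idea, assuming the reduced data
set is a handful of short phrases (titles, for example). The class
and method names (BigramQuerySimulator, train, simulate) are
hypothetical and not part of Lucene; it only counts which term most
often follows another and walks those transitions greedily.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not a Lucene class: a first-order Markov chain
// (bigram counts) over the terms of the training phrases.
class BigramQuerySimulator {
    // term -> (following term -> number of times it followed)
    private final Map<String, Map<String, Integer>> transitions = new HashMap<>();

    // Count every adjacent pair of terms in the phrase.
    public void train(String phrase) {
        String[] terms = phrase.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int i = 0; i + 1 < terms.length; i++) {
            transitions.computeIfAbsent(terms[i], k -> new HashMap<>())
                       .merge(terms[i + 1], 1, Integer::sum);
        }
    }

    // Build a simulated query of at most maxTerms terms by always
    // following the most frequent transition from the current term.
    public String simulate(String start, int maxTerms) {
        List<String> out = new ArrayList<>();
        String current = start.toLowerCase(Locale.ROOT);
        out.add(current);
        while (out.size() < maxTerms) {
            Map<String, Integer> next = transitions.get(current);
            if (next == null) {
                break; // no known continuation
            }
            String best = null;
            int bestCount = 0;
            for (Map.Entry<String, Integer> e : next.entrySet()) {
                int count = e.getValue();
                // Ties are broken alphabetically to keep the output stable.
                if (count > bestCount
                        || (count == bestCount && e.getKey().compareTo(best) < 0)) {
                    best = e.getKey();
                    bestCount = count;
                }
            }
            if (out.contains(best)) {
                break; // stop instead of looping through a cycle
            }
            out.add(best);
            current = best;
        }
        return String.join(" ", out);
    }
}
```

Trained on phrases taken from the corpus, simulate("apache", 3) would
return the most common three-term path starting at "apache". Sampling
transitions by weight instead of always taking the top one would give
more varied simulated queries.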

I think you will get further by considering this from a behavioral
psychology angle rather than as a corpus access problem. Also,
navigating a reduced data set (such as the trie in LUCENE-625) instead
of the corpus it produces suggestions for will save you a lot of
system resources.
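
To illustrate what navigating a reduced data set can look like, here
is a rough sketch of a character trie over recorded queries. It is
not the LUCENE-625 code; the names (QueryTrie, add, suggest) are made
up for this example, and it assumes the whole query string, spaces
included, is the key, so multi-term prefixes work without touching
the index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not the LUCENE-625 implementation.
class QueryTrie {
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        int count; // times the query ending at this node was recorded
    }

    private static final class Hit {
        final String query;
        final int count;
        Hit(String query, int count) { this.query = query; this.count = count; }
    }

    private final Node root = new Node();

    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    // Record one occurrence of a (possibly multi-term) query.
    public void add(String query) {
        Node node = root;
        for (char c : normalize(query).toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.count++;
    }

    // Return up to limit recorded queries starting with prefix,
    // most frequent first, ties in alphabetical order.
    public List<String> suggest(String prefix, int limit) {
        String p = normalize(prefix);
        Node node = root;
        for (char c : p.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return new ArrayList<>(); // nothing recorded under this prefix
            }
        }
        List<Hit> found = new ArrayList<>();
        collect(node, new StringBuilder(p), found);
        found.sort((a, b) -> a.count != b.count
                ? Integer.compare(b.count, a.count)
                : a.query.compareTo(b.query));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < found.size() && i < limit; i++) {
            result.add(found.get(i).query);
        }
        return result;
    }

    // Depth-first walk collecting every recorded query below node.
    private static void collect(Node node, StringBuilder path, List<Hit> out) {
        if (node.count > 0) {
            out.add(new Hit(path.toString(), node.count));
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey().charValue());
            collect(e.getValue(), path, out);
            path.setLength(path.length() - 1);
        }
    }
}
```

A lookup only touches the nodes under the typed prefix, so its cost
depends on the number of recorded queries, not on the size of the
corpus.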

Hope this helps some.

-- 
karl




