lucene-java-user mailing list archives

From Dror Matalon
Subject Re: Query expansion
Date Thu, 18 Dec 2003 18:38:37 GMT
On Thu, Dec 18, 2003 at 05:59:34PM +0100, Viparthi, Kiran (AFIS) wrote:
> We want to provide "did you mean" search suggestions on our search results
> pages. Most of the "did you mean" searches will be derived from synonyms,
> translations and other information from our ontology(KAON). 

Just a comment; I'm not really answering the questions you asked.

It would seem that term frequency and, possibly, spelling play a big
role in Google's "did you mean" strategy.

So if I search for "lucen index java", it suggests "lucent index java"
rather than what I actually meant, "lucene index java", because lucent
shows up much more often than lucene.

It looks like for the most part this strategy works fine. I know that
it's not too hard to get a list of words and their frequencies, but I'm
not sure what the performance implications would be. Also, you'd need to
figure out a misspelling strategy, but I suspect that's not too hard.
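
To make the idea concrete, here is a rough sketch of a frequency-driven
suggester. Everything in it is an assumption for illustration: the class
and method names are hypothetical, the frequency numbers are made up, and
plain Levenshtein distance stands in for whatever misspelling strategy
you end up with. In a real setup the map would presumably be filled from
the index (walking IndexReader.terms() and reading docFreq() is one
possibility), and I have not measured what that would cost.

```java
import java.util.HashMap;
import java.util.Map;

public class DidYouMean {

    /** Levenshtein edit distance between two strings (two-row version). */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * If the word is already a known term, return it unchanged. Otherwise
     * return the most frequent known term within maxEdits of it, or the
     * word itself when nothing is close enough.
     */
    public static String suggest(String word, Map<String, Integer> freq, int maxEdits) {
        if (freq.containsKey(word)) {
            return word;
        }
        String best = word;
        int bestFreq = 0;
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            if (e.getValue() > bestFreq
                    && editDistance(word, e.getKey()) <= maxEdits) {
                best = e.getKey();
                bestFreq = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up frequencies, only to reproduce the lucen/lucent/lucene case.
        Map<String, Integer> freq = new HashMap<>();
        freq.put("lucent", 5000);
        freq.put("lucene", 300);
        freq.put("index", 9000);
        freq.put("java", 12000);

        StringBuilder out = new StringBuilder();
        for (String w : "lucen index java".split(" ")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(suggest(w, freq, 1));
        }
        System.out.println(out); // prints "lucent index java"
    }
}
```

Both "lucent" and "lucene" are one edit away from "lucen", so the more
frequent term wins, which is exactly the behavior described above. The
linear scan over every term is the part most likely to hurt on a large
index.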

>  1. It would be nice to be able to navigate the Query object created by the
> QueryParser.parse(String) and modify the Query expanding certain clauses
> prior to calling Query.toString() to create the "did you mean" searches.
> This would require accessor methods to navigate the query clauses and
> methods to actually change the Query. These do not appear to be present in
> the current API. To our minds the inferior alternative is to modify the
> QueryParser itself to do the expansion and build in a expand/nonexpand
> instruction into the QueryParser grammar. Does anyone have better ideas? 
>  2. A related issue is that we are basically happy with the standard Lucene
> QueryParser though we need to make some minor changes to the grammar. In
> this case it would be convenient to create an equivalent of the
> Query.toString() method to serialize conforming to new grammar outside of
> the Query class. The problem here is there don't appear to be enough
> accessor methods in the Query classes to write a new X.toString(Query). 
>  Richard and Kiran

Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
