lucene-solr-user mailing list archives

From "Yonik Seeley" <ysee...@gmail.com>
Subject Re: Splitting and matching words
Date Sun, 25 Jun 2006 16:48:22 GMT
On 6/25/06, Yonik Seeley <yseeley@gmail.com> wrote:
>   1) a new QueryParser smart enough to make a boolean query instead of
> a MultiPhraseQuery.   "Power Shot" OR "PowerShot"

Thinking about this option a bit more...
The problem is ambiguity.  Sometimes a MultiPhraseQuery is the correct
interpretation and sometimes a boolean query is needed.  The same
problem exists on the query side for multi-token synonyms.  The token
positions alone don't say what the "synonyms" actually are.

Take the case of lap/0 top/1 notebook/1  (where /0 and /1 are token positions).
There isn't enough info to understand if notebook is a synonym for
"top" or for "lap top".
Even if we added extra info (I recently committed a Lucene patch to
allow subclassing Token), it's not an easy problem.
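
To make the ambiguity concrete, here is a rough Python sketch (not Lucene code).  The helper to_token_stream and its stacking rule (a synonym is placed at the position of the last original token it covers) are assumptions made up for illustration; they are chosen only so that the output matches the lap/0 top/1 notebook/1 example above.

```python
def to_token_stream(segments):
    """Flatten (original_tokens, synonyms) segments into (term, position) pairs.

    Hypothetical convention: each synonym is stacked at the position of the
    last original token in its segment.
    """
    tokens = []
    pos = 0
    for original, synonyms in segments:
        for word in original:
            tokens.append((word, pos))
            pos += 1
        last = pos - 1
        for syn in synonyms:
            tokens.append((syn, last))
    return tokens

# Interpretation A: "notebook" is a synonym for "top" only.
syn_for_top = [(["lap"], []), (["top"], ["notebook"])]
# Interpretation B: "notebook" is a synonym for the two-token "lap top".
syn_for_lap_top = [(["lap", "top"], ["notebook"])]

stream_a = to_token_stream(syn_for_top)
stream_b = to_token_stream(syn_for_lap_top)

print(stream_a)              # [('lap', 0), ('top', 1), ('notebook', 1)]
print(stream_a == stream_b)  # True
```

Both interpretations flatten to the same stream, so anything that only sees terms and positions cannot tell them apart.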

Consider something like "my PowerShot lap-top" and trying to
represent it as a boolean query of phrase queries: you need every
combination of the alternatives.

"my Power Shot lap top"
"my PowerShot lap top"
"my Power Shot laptop"
"my PowerShot laptop"

Perhaps span queries could avoid generating all the possibilities...
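
For reference, the enumeration of phrase variants can be sketched in a few lines of Python (again not Lucene code).  The slots list and the expand helper are hypothetical names, and the OR-joined string is only a stand-in for a boolean query of phrase queries.  The number of phrases is the product of the number of alternatives per slot, which is why this gets expensive quickly.

```python
from itertools import product

# Alternatives for each slot of "my PowerShot lap-top" (from the list above).
slots = [
    ["my"],
    ["Power Shot", "PowerShot"],
    ["lap top", "laptop"],
]

def expand(slots):
    """Return every phrase formed by picking one alternative per slot."""
    return [" ".join(choice) for choice in product(*slots)]

phrases = expand(slots)

# Stand-in for a boolean query whose clauses are phrase queries.
query = " OR ".join('"%s"' % p for p in phrases)

print(len(phrases))  # 4
print(query)
```

With k slots of two alternatives each, this produces 2**k phrase clauses.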

-Yonik
