lucene-solr-user mailing list archives

From "Yonik Seeley" <yo...@apache.org>
Subject Re: Index & search questions; special cases
Date Mon, 13 Nov 2006 18:52:25 GMT
On 11/12/06, Michael Imbeault <michael.imbeault@sympatico.ca> wrote:
> - Somewhat related : Let's say I index "Polymyxin B". If I stopword
> single letters, would a phrase search ("Polymyxin B") still find the
> right documents (I don't think so, but still)? If not, I'll have to
> index single letters; how do I prevent the same problem as in the first
> question (i.e., a search on Polymyxin B yielding documents with
> Polymyxin and B, but not close to one another).

The general problem seems to be that you can't tell what should be in a
phrase search and what shouldn't.

You could try throwing everything in a sloppy phrase query, so at
least scores will go up when terms are closer together (in general).
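
As a rough sketch of the sloppy phrase idea, the helper below builds a
query string in the Lucene query parser's proximity syntax ("..."~N).
The function name and the slop value of 10 are assumptions made for
illustration only; a sensible slop depends on your documents.

```python
def sloppy_phrase(terms, slop):
    """Build a Lucene-syntax proximity query such as "a b c"~5."""
    if slop < 0:
        raise ValueError("slop must be non-negative")
    # Escape backslashes and double quotes so the phrase stays well-formed.
    escaped = [t.replace("\\", "\\\\").replace('"', '\\"') for t in terms]
    return '"{}"~{}'.format(" ".join(escaped), slop)


# Hypothetical usage: wrap the whole user query in one sloppy phrase.
print(sloppy_phrase(["HIV", "1", "hepatitis"], 10))
```

A larger slop matches more documents, while documents whose terms sit
closer together should generally still score higher.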

You could also try an exact phrase query, and if you don't get enough
results, follow it up with another strategy (like what you have
below).
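
The exact-then-fallback approach could look roughly like this. The
`search` argument is a hypothetical callable (query string in, list of
hits out) standing in for whatever client you use, and `min_hits` and
`slop` are made-up thresholds. It also assumes the terms contain no
double quotes.

```python
def search_with_fallback(search, terms, min_hits=10, slop=10):
    """Try an exact phrase first; fall back to a sloppy phrase if too few hits.

    `search` is a hypothetical callable: query string -> list of hits.
    Returns the query that was finally used together with its hits.
    """
    phrase = " ".join(terms)
    exact = '"{}"'.format(phrase)
    hits = search(exact)
    if len(hits) >= min_hits:
        return exact, hits
    # Not enough exact matches: retry with a proximity (sloppy) phrase.
    sloppy = '"{}"~{}'.format(phrase, slop)
    return sloppy, search(sloppy)
```

The fallback here is a sloppy phrase, but any second strategy (such as a
rewritten boolean query) could be plugged in at the same point.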

> My thought is to parse the user query and rephrase it to do phrase
> searches on nearby terms containing single letters / numbers. If a user
> searches for HIV 1 hepatitis, I'd rewrite it as ("HIV 1" AND hepatitis) OR
> ("1 hepatitis" AND hiv). Is it a sensible solution?

That might work.
Whatever general strategy you end up trying, you can probably boost
relevancy by injecting some domain-specific knowledge with something
like the SynonymFilter.
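
For the query rewrite you describe, a minimal sketch might look like the
following. It assumes whitespace tokenization and treats a token as
"short" if it is a single character or all digits; both rules, and the
function names, are assumptions for illustration, not anything Lucene or
Solr provides.

```python
def is_short(token):
    # Assumption: "short" means a single character or an all-digit token.
    return len(token) == 1 or token.isdigit()


def rewrite(query):
    """Pair each short token with its left and right neighbour as a phrase."""
    terms = query.split()
    alternatives = []
    for i, tok in enumerate(terms):
        if not is_short(tok):
            continue
        # start = i - 1 pairs with the left neighbour, start = i with the right.
        for start in (i - 1, i):
            if start < 0 or start + 1 >= len(terms):
                continue
            phrase = '"{} {}"'.format(terms[start], terms[start + 1])
            rest = terms[:start] + terms[start + 2:]
            clause = "({})".format(" AND ".join([phrase] + rest))
            if clause not in alternatives:
                alternatives.append(clause)
    if not alternatives:
        # No short tokens: plain conjunction of all terms.
        return " AND ".join(terms)
    return " OR ".join(alternatives)


print(rewrite("HIV 1 hepatitis"))
# prints: ("HIV 1" AND hepatitis) OR ("1 hepatitis" AND HIV)
```

With several short tokens the number of alternatives grows quickly, so
you may want to cap it or fall back to a single sloppy phrase.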

-Yonik
