lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Matching exact words
Date Thu, 26 Aug 2010 17:16:36 GMT
You'll have to change your index I'm afraid. The problem is
that all the index sees is the stemmed version (assuming
you're stemming at index time). There's no information in
the index about what the original version was, so it's impossible
to back this out.

One solution is to use copyField to make a copy of the
input that is NOT stemmed, and search against (or boost)
that field when you care about exact rather than stemmed matches.
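
As a rough sketch of what that could look like in schema.xml (the
names "body", "body_exact" and "text_exact" are made up for
illustration, and I'm assuming your existing field uses the stock
stemmed "text" type; adjust to your own schema):

```xml
<!-- schema.xml sketch; "body", "body_exact" and "text_exact" are hypothetical names -->
<types>
  <!-- Unstemmed type: tokenize and lowercase, with no stemming filter -->
  <fieldType name="text_exact" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</types>

<fields>
  <!-- Existing stemmed field, assumed to use the stock "text" type -->
  <field name="body" type="text" indexed="true" stored="true"/>
  <!-- Unstemmed copy; it is only searched, so it does not need to be stored -->
  <field name="body_exact" type="text_exact" indexed="true" stored="false"/>
</fields>

<!-- Feed the raw input of "body" into "body_exact" at index time -->
<copyField source="body" dest="body_exact"/>
```

After reindexing, a query such as body_exact:windows should match
"windows" but not "window". With dismax you could instead list both
fields in qf and boost the unstemmed one, e.g. qf=body body_exact^2,
so exact matches rank higher without dropping the stemmed ones.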

And a minor clarification. The "types" you refer to aren't
really a Solr entity. They are just convenient collections
of tokenizers and filters (stemmers among them) that are provided
in the schema file. You can freely create your own types by simply
mixing and matching these (you probably already know
this, but the phrasing of your question caused me to wonder).

See:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

Best
Erick

On Thu, Aug 26, 2010 at 7:24 AM, ahammad <ahmed.hammad@gmail.com> wrote:

>
> Hello,
>
> I have a case where if I search for the word "windows", I get results
> containing both "windows" and "window" (and probably other things like
> "windowing" etc.). Is there a way to find exact matches only?
>
> The field in which I am searching is a text field, which as I understand
> causes this behaviour. I cannot use a string field because it is very
> restricted, but what else can be done? I understand there are other types
> of text fields that are more strict than the standard field.
>
> Ideally I would like to keep my index the way it is, with the ability to
> force exact matches. For example, if I can search "windows -window" or
> something like that, that would be great. Or if I can wrap my query in a
> set of quotes to tell it to match exactly. I've seen that done before but I
> cannot get it to work.
>
> As a reference, here is my query:
>
> q={!boost b=$db v=$qq defType=$sh}&qq=windows&db=recip(ms(NOW,lastModifiedLong),3.16e-11,1,1)&sh=dismax
>
> To be quite frank, I am not very familiar with this syntax. I am just using
> whatever my old coworker left behind.
>
> Any tips on how to find exact matches or improve the above query will be
> greatly appreciated.
>
> Thanks
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Matching-exact-words-tp1353350p1353350.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
