lucene-java-user mailing list archives

From "renou oki" <occhi...@gmail.com>
Subject Re: Problem with search an exact word and stemming
Date Fri, 27 Jun 2008 07:33:02 GMT
Thanks for the reply.

I will try adding another data field.
I had thought about this solution but was not very sure about it. I thought
there would be an easier way to do that...


best regards
Renou

2008/6/26 Matthew Hall <mhall@informatics.jax.org>:

> You could also add another data field to the index, with an untokenized
> version of your data, and then use a multifield query to go against both the
> stemmed and exact match parts of your search at the same time.
>
> This is a technique I've used quite often on my project, with various
> different requirements for the second field.  Mind you, it makes the indexes
> bigger, but unless your dataset is large it's not really a huge problem.
>
> Matt
>
> Erick Erickson wrote:
>
>> The way I've solved this is to index the stemmed token *and* a special
>> token at the same position (see Synonym Analyzer). From your
>> example, say you're indexing progresser. You'd go ahead and index the
>> stemmed version, "progress", AND you'd also index "progresser$"
>> at the same offset. Now, when you want exact matches, search for
>> the token with the $ at the end.
>>
>> This does make your index a bit larger, but not as much as you'd expect.
>>
>> Best
>> Erick
>>
>> On Wed, Jun 25, 2008 at 4:21 AM, renou oki <occhipin@gmail.com> wrote:
>>
>>
>>
>>> Hello,
>>>
>>> I have a stemmed index, but I want to search for the exact form of a word.
>>> I use the French Analyzer, so for instance "progression" and "progresser"
>>> are indexed with the linguistic root "progress".
>>> But if I want to search for the word "progress" (and only this word), I get
>>> too many hits (because of "progression", "progresser"...).
>>> The field is indexed and tokenized, but not stored.
>>>
>>> Is there a way to do this, I mean to search for an exact word in a stemmed
>>> index?
>>> I suppose that I have to use the same analyzer for indexing and
>>> searching.
>>>
>>>
>>> I tried with a PhraseQuery, with quotes...
>>>
>>> PS: I use Lucene 1.9.1
>>>
>>> Thanks
>>> Renald
>>>
>>>
>>>
>>
>>
>>
>
> --
> Matthew Hall
> Software Engineer
> Mouse Genome Informatics
> mhall@informatics.jax.org
> (207) 288-6012
>
>
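
Erick's suggestion in the thread (index the stem and a "$"-marked exact form at the same position) can be illustrated with a small toy model. This is only a sketch in plain Java, not Lucene code: the class ExactMarkerIndex, its methods, and the two-suffix stem() helper are all hypothetical stand-ins, and a real setup would do this inside an analyzer or token filter.

```java
import java.util.*;

// Toy positional index; NOT the Lucene API. All names here are hypothetical.
class ExactMarkerIndex {
    // term -> (document id -> positions at which the term occurs)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    // Crude stand-in for a real stemmer: strips two suffixes only.
    static String stem(String word) {
        for (String suffix : new String[] {"ion", "er"}) {
            if (word.endsWith(suffix) && word.length() > suffix.length() + 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Index each word twice at the same position: once stemmed,
    // once as the original form with a trailing "$" marker.
    void add(int docId, String text) {
        String[] words = text.trim().toLowerCase().split("\\s+");
        for (int pos = 0; pos < words.length; pos++) {
            post(stem(words[pos]), docId, pos);
            post(words[pos] + "$", docId, pos);
        }
    }

    private void post(String term, int docId, int pos) {
        postings.computeIfAbsent(term, t -> new TreeMap<>())
                .computeIfAbsent(docId, d -> new ArrayList<>())
                .add(pos);
    }

    // Stemmed search: stem the query word, as the index-time analysis did.
    Set<Integer> searchStemmed(String word) {
        return docs(stem(word.toLowerCase()));
    }

    // Exact search: look up the "$"-marked token and skip stemming.
    Set<Integer> searchExact(String word) {
        return docs(word.toLowerCase() + "$");
    }

    // Positions of a raw term in one document (empty if absent).
    List<Integer> positions(String term, int docId) {
        Map<Integer, List<Integer>> byDoc = postings.get(term);
        if (byDoc == null || !byDoc.containsKey(docId)) {
            return Collections.emptyList();
        }
        return byDoc.get(docId);
    }

    private Set<Integer> docs(String term) {
        Map<Integer, List<Integer>> byDoc = postings.get(term);
        return byDoc == null ? Collections.<Integer>emptySet() : byDoc.keySet();
    }
}
```

With documents containing "progress", "progression" and "progresser", searchStemmed finds all three while searchExact("progress") finds only the first. Because both tokens share a position, phrase-style matching on positions would still line up across the two forms.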

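
Matt's suggestion (a second, untokenized field queried together with the stemmed one) can be sketched the same way. Again this is a toy model in plain Java under stated assumptions, not Lucene code: TwoFieldIndex, its field maps, and the stem() helper are hypothetical, and returning exact hits before stem-only hits is just one possible way to combine the two fields.

```java
import java.util.*;

// Toy two-field index; NOT the Lucene API. All names here are hypothetical.
class TwoFieldIndex {
    // "Stemmed" field: tokenized, one entry per stemmed word.
    private final Map<String, Set<Integer>> stemmedField = new HashMap<>();
    // "Exact" field: untokenized, the whole value is a single term.
    private final Map<String, Set<Integer>> exactField = new HashMap<>();

    // Crude stand-in for a real stemmer: strips two suffixes only.
    static String stem(String word) {
        for (String suffix : new String[] {"ion", "er"}) {
            if (word.endsWith(suffix) && word.length() > suffix.length() + 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Store the same value in both fields.
    void add(int docId, String value) {
        String normalized = value.trim().toLowerCase();
        for (String word : normalized.split("\\s+")) {
            stemmedField.computeIfAbsent(stem(word), k -> new TreeSet<>()).add(docId);
        }
        exactField.computeIfAbsent(normalized, k -> new TreeSet<>()).add(docId);
    }

    // Only documents whose whole value equals the query.
    Set<Integer> searchExactOnly(String query) {
        Set<Integer> hits = exactField.get(query.trim().toLowerCase());
        return hits == null ? Collections.<Integer>emptySet() : hits;
    }

    // Query both fields for a single word: exact hits first, then stem-only hits.
    List<Integer> search(String query) {
        String q = query.trim().toLowerCase();
        Set<Integer> ordered = new LinkedHashSet<>(searchExactOnly(q));
        Set<Integer> stemmedHits = stemmedField.get(stem(q));
        if (stemmedHits != null) {
            ordered.addAll(stemmedHits);
        }
        return new ArrayList<>(ordered);
    }
}
```

Note that the untokenized field matches only when the whole field value equals the query, so this fits short values such as single words or titles. The cost is the one Matt mentions: every value is stored in two fields, so the index grows.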