lucene-java-user mailing list archives

From Peter Carlson <carl...@bookandhammer.com>
Subject Re: Peculiar Behavior with Field queries
Date Mon, 17 Jun 2002 14:03:34 GMT
I don't think there is a workaround short of no longer eliminating stop words,
because stop words are dropped at index time and never make it into the index.
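
To make the effect concrete, here is a minimal sketch in plain Java. It is a simplified model of stop-word filtering, not Lucene's analyzer code; the class name, the method, and the four-word stop list are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified model of stop-word filtering; this is NOT Lucene's analyzer code.
final class StopWordDemo {
    // Hypothetical stop list; the real list depends on the analyzer in use.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("on", "the", "a", "of"));

    // Lowercase the text, split on non-word characters, optionally drop stop words.
    static List<String> analyze(String text, boolean removeStopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            if (removeStopWords && STOP_WORDS.contains(raw)) {
                continue; // the stop word never reaches the index or the query
            }
            tokens.add(raw);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // What ends up in the index for a headline.
        System.out.println(analyze("Safety on the job", true));
        // What is left of the phrase query after the same filtering.
        System.out.println(analyze("on the job", true));
    }
}
```

With the same filtering applied on both sides, the phrase "on the job" is reduced to the single term "job", which is why every headline containing 'job' comes back.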

I guess one option would be to create an Analyzer that does not eliminate
stop words and use it when creating the index, then change QueryParser.jj
to use this same analyzer when searching for phrases.
For all other queries you could use a different analyzer that would
eliminate the stop words.
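
A rough sketch of that two-analyzer idea, again in plain Java rather than against the Lucene API. Everything here (class name, `tokenize`, `matches`, the stop list, and treating non-phrase queries as "any term matches") is an assumption made for illustration; the actual change would live in an Analyzer and in QueryParser.jj.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of "index with stop words, filter them only for non-phrase queries".
final class PhraseRoutingDemo {
    // Hypothetical stop list for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("on", "the", "a", "of"));

    static List<String> tokenize(String text, boolean dropStopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (raw.isEmpty() || (dropStopWords && STOP_WORDS.contains(raw))) {
                continue;
            }
            tokens.add(raw);
        }
        return tokens;
    }

    // The headline is "indexed" with stop words kept.
    static boolean matches(String headline, String query) {
        List<String> indexed = tokenize(headline, false);
        boolean isPhrase = query.length() >= 2
                && query.startsWith("\"") && query.endsWith("\"");
        if (isPhrase) {
            // Phrase query: keep stop words and require the exact token sequence.
            List<String> phrase = tokenize(query, false);
            return !phrase.isEmpty()
                    && Collections.indexOfSubList(indexed, phrase) >= 0;
        }
        // Any other query: drop stop words, then match if any remaining term occurs.
        List<String> terms = tokenize(query, true);
        return !Collections.disjoint(indexed, terms);
    }

    public static void main(String[] args) {
        System.out.println(matches("Safety on the job", "\"on the job\""));
        System.out.println(matches("Job fair opens", "\"on the job\""));
        System.out.println(matches("Job fair opens", "on the job"));
    }
}
```

The trade-off is a larger index, since the stop words are now stored, in exchange for phrase queries that match literally.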

I don't find this a problem personally as long as you tell the person that
you have eliminated these terms from what they are searching for. As an
example, in Google they tell you which terms were just common words that
have been eliminated from your query string.
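
Reporting the dropped terms could look something like the sketch below. It is plain Java with a made-up stop list and hypothetical method names (`droppedTerms`, `notice`); it only shows the idea of comparing the raw query against the stop list before searching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Sketch: tell the user which query words were ignored as stop words.
final class DroppedTermsDemo {
    // Hypothetical stop list for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("on", "the", "a", "of"));

    // Collect each stop word that appears in the query, once, in query order.
    static List<String> droppedTerms(String query) {
        List<String> dropped = new ArrayList<>();
        for (String raw : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!raw.isEmpty() && STOP_WORDS.contains(raw) && !dropped.contains(raw)) {
                dropped.add(raw);
            }
        }
        return dropped;
    }

    // Build a message for the results page; empty when nothing was dropped.
    static String notice(String query) {
        List<String> dropped = droppedTerms(query);
        if (dropped.isEmpty()) {
            return "";
        }
        return "The following common words were ignored: " + String.join(", ", dropped);
    }

    public static void main(String[] args) {
        System.out.println(notice("on the job"));
    }
}
```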

--Peter

On 6/17/02 5:32 AM, "Terry Steichen" <terry@net-frame.com> wrote:

> This apparent inability of Lucene to find articles containing literal
> phrases (if the phrase contains stop words) can be, I think, a severe
> limitation.  I wonder if there is any workaround (short of eliminating stop
> words)?
> 
> Terry
> 
> ----- Original Message -----
> From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Sunday, June 16, 2002 11:04 PM
> Subject: Re: Peculiar Behavior with Field queries
> 
> 
>> I believe you are correct.
>> 
>> istO tisO sitO itSo Otsi Osit Otis
>> 
>> --- Terry Steichen <terry@net-frame.com> wrote:
>>> Oits,
>>> 
>>> You may be right that the stop words are at work here.  What I was expecting
>>> to do is be able to match a specific phrase ("on the job"), even if it
>>> includes stop words.  But I guess that may not be the way that Lucene works,
>>> right?
>>> 
>>> Regards,
>>> 
>>> Terry
>>> 
>>> ----- Original Message -----
>>> From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
>>> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>> Sent: Sunday, June 16, 2002 7:13 PM
>>> Subject: Re: Peculiar Behavior with Field queries
>>> 
>>> 
>>>> Hello,
>>>> 
>>>> I'm not sure what the problem is :)
>>>> What do you expect to get back?
>>>> Are you wondering why 'on the' part is not matched?
>>>> If so, it's probably because both 'on' and 'the' are in the list of
>>>> stop words, which are thrown out when/before indexing.
>>>> 
>>>> Otis
>>>> 
>>>> --- Terry Steichen <terry@net-frame.com> wrote:
>>>>> Hello,
>>>>> 
>>>>> I'm using Lucene (1.2RC5) and, when indexing, I include a field
>>>>> called "headline" using the following line of code in the document I
>>>>> create to use for indexing:
>>>>> 
>>>>>       addField("headline", root.elementText("headline"), true, true, true, doc);
>>>>> 
>>>>> When I search on headline:term1, it works just fine.  But I've
>>>>> noticed that if I query using, for example,
>>>>> 
>>>>>         headline:"on the job"
>>>>> 
>>>>> I will get returned all items that have the term 'job' in their
>>>>> headline.
>>>>> 
>>>>> I presume I've overlooked something and would appreciate any
>>>>> suggestions on what that might be.
>>>>> 
>>>>> Regards,
>>>>> 
>>>>> Terry Steichen
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> __________________________________________________
>>>> Do You Yahoo!?
>>>> Yahoo! - Official partner of 2002 FIFA World Cup
>>>> http://fifaworldcup.yahoo.com
>>>> 
>>>> --
>>>> To unsubscribe, e-mail:
>>> <mailto:lucene-user-unsubscribe@jakarta.apache.org>
>>>> For additional commands, e-mail:
>>> <mailto:lucene-user-help@jakarta.apache.org>
>>>> 



