lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Terry Steichen" <te...@net-frame.com>
Subject Re: Peculiar Behavior with Field queries
Date Mon, 17 Jun 2002 14:04:51 GMT
Could you handle this by including extra fields?  Let's say we're dealing
with a database of articles, and that we wanted to do regular as well as
literal phrase searches on the 'headline' field.  What would happen if,
during indexing, you created a second headline field which you defined as
searchable but *not* tokenized.  Could you then apply the phrase query
(which included stop words) successfully against the second headline field
(recognizing that you would have to match the second headline's text
exactly)?

Terry

----- Original Message -----
From: "Peter Carlson" <carlson@bookandhammer.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Monday, June 17, 2002 10:03 AM
Subject: Re: Peculiar Behavior with Field queries


> I don't think there is a work around without eliminating stop words,
because
> when you index, you will not include the stop words in the index.
>
> I guess one option would be to create an Analyzer to use when creating the
> index that would not eliminate the stop words, then a change the
> QueryParser.jj to use this analyzer when searching for phrases.
> For all other queries you could use a different analyzer that would
> eliminate the stop words.
>
> I don't find this a problem personally as long as you tell the person that
> you have eliminated these terms from what they are searching for. As an
> example, in Google they tell you which terms were just common words that
> have been eliminated from your query string.
>
> --Peter
>
> On 6/17/02 5:32 AM, "Terry Steichen" <terry@net-frame.com> wrote:
>
> > This apparent inability of Lucene to find articles containing literal
> > phrases (if the phrase contains stop words) can be, I think, a severe
> > limitation.  I wonder if there is any workaround (short of eliminating
stop
> > words)?
> >
> > Terry
> >
> > ----- Original Message -----
> > From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
> > To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> > Sent: Sunday, June 16, 2002 11:04 PM
> > Subject: Re: Peculiar Behavior with Field queries
> >
> >
> >> I believe you are correct.
> >>
> >> istO tisO sitO itSo Otsi Osit Otis
> >>
> >> --- Terry Steichen <terry@net-frame.com> wrote:
> >>> Oits,
> >>>
> >>> You may be right that the stop words are at work here.  What I was
> >>> expecting
> >>> to do is be able to match a specific phrase ("on the job"), even if
> >>> it
> >>> includes stop words.  But I guess that may not be the way that Lucene
> >>> works,
> >>> right?
> >>>
> >>> Regards,
> >>>
> >>> Terry
> >>>
> >>> ----- Original Message -----
> >>> From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
> >>> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> >>> Sent: Sunday, June 16, 2002 7:13 PM
> >>> Subject: Re: Peculiar Behavior with Field queries
> >>>
> >>>
> >>>> Hello,
> >>>>
> >>>> I'm not sure what the problem is :)
> >>>> What do you expect to get back?
> >>>> Are you wondering why 'on the' part is not matched?
> >>>> If so, it's probably because both 'on' and 'the' are in the list of
> >>>> stop words, which are thrown out when/before indexing.
> >>>>
> >>>> Otis
> >>>>
> >>>> --- Terry Steichen <terry@net-frame.com> wrote:
> >>>>> Hello,
> >>>>>
> >>>>> I'm using Lucene (1.2RC5) and, when indexing, I include a field
> >>>>> called "headline" using the following line of code in the
> >>> document I
> >>>>> create to use for indexing:
> >>>>>
> >>>>>       addField("headline", root.elementText("headline"), true,
> >>> true,
> >>>>> true, doc);
> >>>>>
> >>>>> When I search on headline:term1, it works just fine.  But I've
> >>>>> noticed that if I query using, for example,
> >>>>>
> >>>>>         headline:"on the job"
> >>>>>
> >>>>> I will get returned all items that have the term 'job' in their
> >>>>> headline.
> >>>>>
> >>>>> I presume I've overlooked something and would appreciate any
> >>>>> suggestions on what that might be.
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> Terry Steichen
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> __________________________________________________
> >>>> Do You Yahoo!?
> >>>> Yahoo! - Official partner of 2002 FIFA World Cup
> >>>> http://fifaworldcup.yahoo.com
> >>>>
> >>>> --
> >>>> To unsubscribe, e-mail:
> >>> <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> >>>> For additional commands, e-mail:
> >>> <mailto:lucene-user-help@jakarta.apache.org>
> >>>>
> >>>
> >>>
> >>> --
> >>> To unsubscribe, e-mail:
> >>> <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> >>> For additional commands, e-mail:
> >>> <mailto:lucene-user-help@jakarta.apache.org>
> >>>
> >>
> >>
> >> __________________________________________________
> >> Do You Yahoo!?
> >> Yahoo! - Official partner of 2002 FIFA World Cup
> >> http://fifaworldcup.yahoo.com
> >>
> >> --
> >> To unsubscribe, e-mail:
> > <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> >> For additional commands, e-mail:
> > <mailto:lucene-user-help@jakarta.apache.org>
> >>
> >
> >
> > --
> > To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>
> >
> >
>
>
> --
> To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>
>
>


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message