Return-Path: Delivered-To: apmail-jakarta-lucene-user-archive@apache.org Received: (qmail 85073 invoked from network); 19 Jun 2002 22:06:29 -0000 Received: from unknown (HELO nagoya.betaversion.org) (192.18.49.131) by daedalus.apache.org with SMTP; 19 Jun 2002 22:06:29 -0000 Received: (qmail 18926 invoked by uid 97); 19 Jun 2002 22:06:08 -0000 Delivered-To: qmlist-jakarta-archive-lucene-user@jakarta.apache.org Received: (qmail 18841 invoked by uid 97); 19 Jun 2002 22:06:07 -0000 Mailing-List: contact lucene-user-help@jakarta.apache.org; run by ezmlm Precedence: bulk List-Unsubscribe: List-Subscribe: List-Help: List-Post: List-Id: "Lucene Users List" Reply-To: "Lucene Users List" Delivered-To: mailing list lucene-user@jakarta.apache.org Received: (qmail 17181 invoked by uid 98); 19 Jun 2002 13:56:34 -0000 X-Antivirus: nagoya (v4198 created Apr 24 2002) To: "Lucene Users List" Date: Wed, 19 Jun 2002 06:56:15 -0700 From: "none none" Message-ID: Mime-Version: 1.0 X-Sent-Mail: off Reply-To: korfut@lycos.com X-Mailer: MailCity Service X-Priority: 3 Subject: Re: Peculiar Behavior with Field queries X-Sender-Ip: 64.187.36.2 Organization: Lycos Mail (http://www.mail.lycos.com:80) Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Transfer-Encoding: 7bit X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N may be the space are no good, try something like this: BooleanQuery main = new BooleanQuery(); Query queryHeadline = QueryParser.parse(yourHeadLinePhrase, "l_headline", analyzer); main.add(queryHeadline , true, false); search... should work. bye -- On Wed, 19 Jun 2002 08:20:25 Terry Steichen wrote: >Peter, > >I added a new field called 'l_headline' (for literal headline) which I set >so it was searchable and included in the index and not tokenized. But the >query (using a phrase that is an exact match for the headline, but which may >include stop words) still fails. Even when I apply this to an article whose >headline contains no stop words (so the headline:"phrase"' returns the >article), the 'l_headline' fails to produce anything. > >I can do a 'doc.get("l_headline")' and it shows the proper phrase has been >included. > >Any ideas why this won't let me do a literal match? Seems like it should >work fine. > >Regards, > >Terry > > >----- Original Message ----- >From: "Peter Carlson" >To: "Lucene Users List" >Sent: Monday, June 17, 2002 10:32 AM >Subject: Re: Peculiar Behavior with Field queries > > >> You could do this, but you would also have to match case exactly. >> >> --Peter >> >> >> On 6/17/02 7:04 AM, "Terry Steichen" wrote: >> >> > Could you handle this by including extra fields? Let's say we're >dealing >> > with a database of articles, and that we wanted to do regular as well as >> > literal phrase searches on the 'headline' field. What would happen if, >> > during indexing, you created a second headline field which you defined >as >> > searchable but *not* tokenized. Could you then apply the phrase query >> > (which included stop words) successfully against the second headline >field >> > (recognizing that you would have to match the second headline's text >> > exactly)? >> > >> > Terry >> > >> > ----- Original Message ----- >> > From: "Peter Carlson" >> > To: "Lucene Users List" >> > Sent: Monday, June 17, 2002 10:03 AM >> > Subject: Re: Peculiar Behavior with Field queries >> > >> > >> >> I don't think there is a work around without eliminating stop words, >> > because >> >> when you index, you will not include the stop words in the index. >> >> >> >> I guess one option would be to create an Analyzer to use when creating >the >> >> index that would not eliminate the stop words, then a change the >> >> QueryParser.jj to use this analyzer when searching for phrases. >> >> For all other queries you could use a different analyzer that would >> >> eliminate the stop words. >> >> >> >> I don't find this a problem personally as long as you tell the person >that >> >> you have eliminated these terms from what they are searching for. As an >> >> example, in Google they tell you which terms were just common words >that >> >> have been eliminated from your query string. >> >> >> >> --Peter >> >> >> >> On 6/17/02 5:32 AM, "Terry Steichen" wrote: >> >> >> >>> This apparent inability of Lucene to find articles containing literal >> >>> phrases (if the phrase contains stop words) can be, I think, a severe >> >>> limitation. I wonder if there is any workaround (short of eliminating >> > stop >> >>> words)? >> >>> >> >>> Terry >> >>> >> >>> ----- Original Message ----- >> >>> From: "Otis Gospodnetic" >> >>> To: "Lucene Users List" >> >>> Sent: Sunday, June 16, 2002 11:04 PM >> >>> Subject: Re: Peculiar Behavior with Field queries >> >>> >> >>> >> >>>> I believe you are correct. >> >>>> >> >>>> istO tisO sitO itSo Otsi Osit Otis >> >>>> >> >>>> --- Terry Steichen wrote: >> >>>>> Oits, >> >>>>> >> >>>>> You may be right that the stop words are at work here. What I was >> >>>>> expecting >> >>>>> to do is be able to match a specific phrase ("on the job"), even if >> >>>>> it >> >>>>> includes stop words. But I guess that may not be the way that >Lucene >> >>>>> works, >> >>>>> right? >> >>>>> >> >>>>> Regards, >> >>>>> >> >>>>> Terry >> >>>>> >> >>>>> ----- Original Message ----- >> >>>>> From: "Otis Gospodnetic" >> >>>>> To: "Lucene Users List" >> >>>>> Sent: Sunday, June 16, 2002 7:13 PM >> >>>>> Subject: Re: Peculiar Behavior with Field queries >> >>>>> >> >>>>> >> >>>>>> Hello, >> >>>>>> >> >>>>>> I'm not sure what the problem is :) >> >>>>>> What do you expect to get back? >> >>>>>> Are you wondering why 'on the' part is not matched? >> >>>>>> If so, it's probably because both 'on' and 'the' are in the list of >> >>>>>> stop words, which are thrown out when/before indexing. >> >>>>>> >> >>>>>> Otis >> >>>>>> >> >>>>>> --- Terry Steichen wrote: >> >>>>>>> Hello, >> >>>>>>> >> >>>>>>> I'm using Lucene (1.2RC5) and, when indexing, I include a field >> >>>>>>> called "headline" using the following line of code in the >> >>>>> document I >> >>>>>>> create to use for indexing: >> >>>>>>> >> >>>>>>> addField("headline", root.elementText("headline"), true, >> >>>>> true, >> >>>>>>> true, doc); >> >>>>>>> >> >>>>>>> When I search on headline:term1, it works just fine. But I've >> >>>>>>> noticed that if I query using, for example, >> >>>>>>> >> >>>>>>> headline:"on the job" >> >>>>>>> >> >>>>>>> I will get returned all items that have the term 'job' in their >> >>>>>>> headline. >> >>>>>>> >> >>>>>>> I presume I've overlooked something and would appreciate any >> >>>>>>> suggestions on what that might be. >> >>>>>>> >> >>>>>>> Regards, >> >>>>>>> >> >>>>>>> Terry Steichen >> >>>>>>> >> >>>>>>> >> >>>>>> >> >>>>>> >> >>>>>> __________________________________________________ >> >>>>>> Do You Yahoo!? >> >>>>>> Yahoo! - Official partner of 2002 FIFA World Cup >> >>>>>> http://fifaworldcup.yahoo.com >> >>>>>> >> >>>>>> -- >> >>>>>> To unsubscribe, e-mail: >> >>>>> >> >>>>>> For additional commands, e-mail: >> >>>>> >> >>>>>> >> >>>>> >> >>>>> >> >>>>> -- >> >>>>> To unsubscribe, e-mail: >> >>>>> >> >>>>> For additional commands, e-mail: >> >>>>> >> >>>>> >> >>>> >> >>>> >> >>>> __________________________________________________ >> >>>> Do You Yahoo!? >> >>>> Yahoo! - Official partner of 2002 FIFA World Cup >> >>>> http://fifaworldcup.yahoo.com >> >>>> >> >>>> -- >> >>>> To unsubscribe, e-mail: >> >>> >> >>>> For additional commands, e-mail: >> >>> >> >>>> >> >>> >> >>> >> >>> -- >> >>> To unsubscribe, e-mail: >> > >> >>> For additional commands, e-mail: >> > >> >>> >> >>> >> >> >> >> >> >> -- >> >> To unsubscribe, e-mail: >> > >> >> For additional commands, e-mail: >> > >> >> >> >> >> > >> > >> > -- >> > To unsubscribe, e-mail: > >> > For additional commands, e-mail: > >> > >> > >> >> >> -- >> To unsubscribe, e-mail: > >> For additional commands, e-mail: > >> >> > > >-- >To unsubscribe, e-mail: >For additional commands, e-mail: > > _________________________________________ Communicate with others using Lycos Mail for FREE! http://mail.lycos.com/ -- To unsubscribe, e-mail: For additional commands, e-mail: