Message-ID: <050101c2179d$7c0e2000$0201a8c0@netframe.com>
From: "Terry Steichen"
To: "Lucene Users Group"
Subject: Re: Peculiar Behavior with Field queries
Date: Wed, 19 Jun 2002 10:27:46 -0400

Peter,

Enclosed is an xml file which reflects the structure of the documents I index. Note that it has a 'headline' field. In my WPDocument class (used by the indexer), I parse this xml file into its components and insert them as Fields into the Document class. Specifically, I put the contents of the 'headline' xml field into a Field called "headline" and also into a Field called "l_headline". The former is stored, indexed and tokenized. The latter is stored, indexed and *not* tokenized.

Upon retrieval, I am able to readily display both the "headline" and "l_headline" fields. But I am able to search *only* on the headline field. (BTW, I realize that I must include the entire, literal headline to match "l_headline".)

As long as I'm mentioning problems/observations, I find that I am able to search on all fields (other than the 'l_headline' field) using the "*" wildcard - but *only* when the preceding letter is lower case. For example, I have another field called "category" and one such value is "NAT". I can match this with "category:NAT", "category:nat", or "category:n*". But I cannot match with "category:N*".

Also, while the "*" wildcard works fine (at the end and/or in the middle of a term), the '?' wildcard doesn't work at all.

Regards,

Terry

PS: I am using the StandardAnalyzer and QueryParser that comes with Lucene 1.2rc5.

------------ Example XML file that I index --------------------
The Knockout Paunch Peter Piper FAT 20020616 Post This Father's Day, let us praise Dad by celebrating that ever-expanding, much-maligned monument to the good life that he always carries close to his heart -- his paunch, his shelf, his spare tire, his front porch, his Buddha, his bay window, his beer gut, his potbelly.

]]>
This Father's Day, let us praise Dad by celebrating that ever-expanding, much-maligned monument to the good life that he always carries close to his heart -- his paunch, his shelf, his spare tire, his front porch, his Buddha, his bay window, his beer gut, his potbelly.

The potbelly is the essence of distilled Dadness. It's as much a part of the architecture of middle-aged masculinity as creaky knees or hairy ears or the bald spot that keeps growing, wiping out wilderness faster than the Sahara.

---Stuff snipped for brevity --

What does the perfect potbelly say?

"It says, 'God, that guy's got a great beer gut,' " Decaire declares. "I saw a guy with a great gut in the store today. He had on a Hawaiian shirt and white shorts. The Hawaiian shirt just gave great form to his gut, the way a good bra gives form to breasts. It was just perfect. It was holding itself up -- nothing was hanging over the belt. I said, 'Great gut.' He said, 'Thanks.'

"It was beautiful."

]]> A51288-2002Jun14
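
The wildcard behaviour reported in this message ("category:n*" matches, "category:N*" does not) would be consistent with an index whose terms were lowercased at indexing time while wildcard terms in the query are left as typed. That is an assumption about the cause, not something confirmed in this thread. The sketch below is a toy model in plain Java, not Lucene code; the class and method names (PrefixCaseSketch, analyze, prefixMatch) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy model only, not Lucene code: a lowercased index versus a raw prefix term. */
final class PrefixCaseSketch {

    /** Stand-in for an analyzer that lowercases text (assumed behaviour). */
    static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    /** Returns the indexed terms that start with the prefix exactly as given. */
    static List<String> prefixMatch(List<String> indexedTerms, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>();
        index.add(analyze("NAT")); // held in the toy index as "nat"

        // Plain term: analyzed on the query side as well, so case is normalized.
        System.out.println(index.contains(analyze("NAT")));    // true
        // Prefix used as typed: "N" is not a prefix of "nat", so nothing matches.
        System.out.println(prefixMatch(index, "N").isEmpty()); // true
        // Lowercasing the prefix by hand restores the match.
        System.out.println(prefixMatch(index, analyze("N")));  // [nat]
    }
}
```

If this model does describe the cause, lowercasing wildcard terms before they reach the query parser would be one workaround to try.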
----- Original Message -----
From: "Peter Carlson"
To: "Lucene Users List"
Sent: Wednesday, June 19, 2002 9:47 AM
Subject: Re: Peculiar Behavior with Field queries

> Terry,
>
> Please provide the exact example of the text so we can look at it and
> evaluate what's going on.
>
> -Peter
>
>
> On 6/19/02 5:20 AM, "Terry Steichen" wrote:
>
> > Peter,
> >
> > I added a new field called 'l_headline' (for literal headline) which I set
> > so it was searchable and included in the index and not tokenized. But the
> > query (using a phrase that is an exact match for the headline, but which may
> > include stop words) still fails. Even when I apply this to an article whose
> > headline contains no stop words (so the headline:"phrase"' returns the
> > article), the 'l_headline' fails to produce anything.
> >
> > I can do a 'doc.get("l_headline")' and it shows the proper phrase has been
> > included.
> >
> > Any ideas why this won't let me do a literal match? Seems like it should
> > work fine.
> >
> > Regards,
> >
> > Terry
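
One possible reading of the 'l_headline' failure quoted above is that the query text is run through a tokenizing, lowercasing analyzer while the untokenized field holds the headline as a single, unmodified term, so the two can never be equal. This is an assumption, not a diagnosis confirmed in the thread. The sketch below models that mismatch in plain Java without any Lucene classes; LiteralFieldSketch, its analyze method, its tiny stop list, and the sample headline string are all hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy model only, not Lucene code: analyzed query terms versus one literal term. */
final class LiteralFieldSketch {

    // Tiny illustrative stop list (an assumption, not a real analyzer's list).
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an", "of"));

    /** Stand-in for a tokenizing analyzer: lowercase, split on non-alphanumerics, drop stop words. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String headline = "The Knockout Paunch"; // sample value, assumed for illustration

        List<String> tokenizedField = analyze(headline); // [knockout, paunch]
        String literalField = headline;                  // untokenized: one term, original case

        List<String> queryTerms = analyze(headline);     // what an analyzed query would look for

        // Tokenized field and analyzed query agree term by term.
        System.out.println(tokenizedField.equals(queryTerms)); // true
        // No analyzed query term equals the single literal term.
        System.out.println(queryTerms.contains(literalField)); // false
        // Comparing the raw, unanalyzed string against the literal term does match.
        System.out.println(literalField.equals(headline));     // true
    }
}
```

Under this reading, a query built directly from the exact headline string and bypassing the analyzer (for instance a TermQuery on "l_headline") would be worth trying. Whether that resolves the problem reported here is untested.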