lucene-java-user mailing list archives

From DMGoodst...@lbl.gov
Subject Re: Peculiar Behavior with Field queries
Date Wed, 19 Jun 2002 15:03:11 GMT
I have also experienced the same wildcard behavior (failure of "*" 
preceded by uppercase letter, and failure of "?") with StandardAnalyzer 
and QueryParser, as described by Terry.
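
One possible explanation for the uppercase case, offered as an assumption and not confirmed anywhere in this thread: StandardAnalyzer lowercases terms when the document is indexed, while a wildcard term in the query may be compared against the index without going through the analyzer. If so, "N*" would be tested as-is against the indexed term "nat" and fail. The sketch below does not use Lucene at all; `indexTerm` and `wildcardMatch` are hypothetical stand-ins written only to illustrate that mismatch.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class WildcardCaseSketch {

    // Hypothetical stand-in for the lowercasing an analyzer applies at index time.
    static String indexTerm(String raw) {
        return raw.toLowerCase(Locale.ROOT);
    }

    // Case-sensitive wildcard match: '*' matches any run of characters,
    // '?' matches exactly one character, everything else is literal.
    static boolean wildcardMatch(String pattern, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return term.matches(regex.toString());
    }

    public static void main(String[] args) {
        String indexed = indexTerm("NAT"); // stored in the index as "nat"

        System.out.println(wildcardMatch("n*", indexed)); // true
        System.out.println(wildcardMatch("N*", indexed)); // false: "N" != "n"

        // Workaround under the same assumption: lowercase the pattern first.
        System.out.println(wildcardMatch("N*".toLowerCase(Locale.ROOT), indexed)); // true
    }
}
```

If that assumption holds for your setup, lowercasing wildcard terms before handing them to QueryParser should be enough to work around it. The sketch says nothing about why "?" fails.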
--David 

----- Original Message -----
From: "Terry Steichen" <terry@net-frame.com>
Date: Wednesday, June 19, 2002 7:27 am
Subject: Re: Peculiar Behavior with Field queries

> Peter,
> 
> Enclosed is an xml file which reflects the structure of the documents I
> index.  Note that it has a 'headline' field.  In my WPDocument class (used
> by the indexer), I parse this xml file into its components and insert them
> as Fields into the Document class.  Specifically, I put the contents of the
> 'headline' xml field into a Field called "headline" and also into a Field
> called "l_headline".  The former is stored, indexed and tokenized.  The
> latter is stored, indexed and *not* tokenized.
> 
> Upon retrieval, I am able to readily display both the "headline" and
> "l_headline" fields.  But I am able to search *only* on the headline field.
> (BTW, I realize that I must include the entire, literal headline to match
> "l_headline".)
> 
> As long as I'm mentioning problems/observations, I find that I am able to
> search on all fields (other than the 'l_headline' field) using the "*"
> wildcard - but *only* when the preceding letter is lower case.  For example,
> I have another field called "category" and one such value is "NAT".  I can
> match this with "category:NAT", "category:nat", or "category:n*".  But I
> cannot match with "category:N*".
> 
> Also, while the "*" wildcard works fine (at the end and/or in the middle of
> a term), the '?' wildcard doesn't work at all.
> 
> Regards,
> 
> Terry
> 
> PS: I am using the StandardAnalyzer and QueryParser that come with Lucene
> 1.2rc5.
> 
> ------------ Example XML file that I index --------------------
> <?xml version="1.0" encoding="iso-8859-1"?>
> 
> <article>
>  <headline>The Knockout Paunch</headline>
>  <author>Peter Piper</author>
>  <category>FAT</category>
>  <pub_date create_date="20020616" event_date="20020616" timestamp="22:23PM">20020616</pub_date>
>  <placement edition="EE" section="EZ" page="F01 " slug="POTBELLIES16"/>
>  <origin sourcenumber="6">Post</origin>
>  <webexec created="Mon Jun 17 23:15:33 EDT 2002" module="v_wp13"/>
>  <summary><![CDATA[<p>This Father's Day, let us praise Dad by celebrating
> that ever-expanding, much-maligned monument to the good life that he always
> carries close to his heart -- his paunch, his shelf, his spare tire, his
> front porch, his Buddha, his bay window, his beer gut, his
> potbelly.</p>]]></summary>
>  <body paras="74"><![CDATA[ <p>This Father's Day, let us praise Dad by
> celebrating that ever-expanding, much-maligned monument to the good life
> that he always carries close to his heart -- his paunch, his shelf, his
> spare tire, his front porch, his Buddha, his bay window, his beer gut, his
> potbelly.</p> <p>The potbelly is the essence of distilled Dadness. It's as
> much a part of the architecture of middle-aged masculinity as creaky knees
> or hairy ears or the bald spot that keeps growing, wiping out wilderness
> faster than the Sahara.</p>
> 
> ---Stuff snipped for brevity --
> 
> <p>What does the perfect potbelly say?</p> <p>"It says, 'God, that guy's got
> a great beer gut,' " Decaire declares. "I saw a guy with a great gut in the
> store today. He had on a Hawaiian shirt and white shorts. The Hawaiian shirt
> just gave great form to his gut, the way a good bra gives form to breasts.
> It was just perfect. It was holding itself up -- nothing was hanging over
> the belt. I said, 'Great gut.' He said, 'Thanks.'</p> <p>"It was
> beautiful."</p>]]></body>
>  <doc_name>A51288-2002Jun14</doc_name>
>  <references>
>    <ref_articles>
>      <ref_article/>
>    </ref_articles>
>    <urls>
>      <url/>
>    </urls>
>    <graphics>
>      <graphic/>
>    </graphics>
>  </references>
> 
> 
> 
> ----- Original Message -----
> From: "Peter Carlson" <carlson@bookandhammer.com>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Wednesday, June 19, 2002 9:47 AM
> Subject: Re: Peculiar Behavior with Field queries
> 
> 
> > Terry,
> >
> > Please provide the exact example of the text so we can look at it and
> > evaluate what's going on.
> >
> > -Peter
> >
> >
> > On 6/19/02 5:20 AM, "Terry Steichen" <terry@net-frame.com> wrote:
> >
> > > Peter,
> > >
> > > I added a new field called 'l_headline' (for literal headline) which I set
> > > so it was searchable and included in the index and not tokenized.  But the
> > > query (using a phrase that is an exact match for the headline, but which may
> > > include stop words) still fails.  Even when I apply this to an article whose
> > > headline contains no stop words (so the 'headline:"phrase"' returns the
> > > article), the 'l_headline' fails to produce anything.
> > >
> > > I can do a 'doc.get("l_headline")' and it shows the proper phrase has been
> > > included.
> > >
> > > Any ideas why this won't let me do a literal match?  Seems like it should
> > > work fine.
> > >
> > > Regards,
> > >
> > > Terry
> >
> >
> > --
> > To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
> >
> >
> 
> 
> 
> 



