lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Wildcard Searching
Date Thu, 18 Apr 2002 14:55:31 GMT
Does anyone know anything about this?

Thanks,
Otis

--- Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> Hello,
> 
> This was a thread on lucene-user initially, but I'm copying lucene-dev
> as well.  Sorry about duplicates.
> 
> --- Stefan Bergstrand <stefan.bergstrand@polopoly.com> wrote:
> > Doug Cutting <DCutting@grandcentral.com> writes:
> > 
> > Just noticed this problem in my program.
> > 
> > It seems that the analyzer passed to QueryParser.parse() is never
> > passed on to PrefixQuery (which is what my test case is parsed into).
> > 
> > A quick look in QueryParser.jj confirms this: 
> > 
> >  q = new PrefixQuery(new Term(field, term.image.substring
> >                                       (0, term.image.length()-1)));
> 
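A minimal sketch of why skipping the analyzer matters for the PrefixQuery
construction quoted above. It does not use Lucene itself: analyze() is a
hypothetical stand-in for an analyzer that lower-cases text at index time,
and prefixFromImage() only mirrors the substring call from the grammar
action. All class and method names here are assumptions for illustration.

```java
import java.util.Locale;

class PrefixSketch {
    // Hypothetical stand-in for an analyzer that lower-cases text at index time.
    static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    // Mirrors the quoted grammar action: drop the trailing "*" and use the
    // remaining text verbatim, without running it through any analyzer.
    static String prefixFromImage(String image) {
        return image.substring(0, image.length() - 1);
    }

    public static void main(String[] args) {
        String indexedTerm = analyze("Round");       // stored as "round"
        String rawPrefix = prefixFromImage("Rou*");  // stays "Rou"

        // The unanalyzed prefix does not match the analyzed index term.
        System.out.println(indexedTerm.startsWith(rawPrefix));          // false
        // Analyzing the prefix first would make it match.
        System.out.println(indexedTerm.startsWith(analyze(rawPrefix))); // true
    }
}
```

Under these assumptions, a prefix typed with different casing than the
indexed form would find nothing, which is consistent with the problem
Stefan describes.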
> I thought that queries such as 'rou?d' are treated as wildcard queries
> by QueryParser.jj, and not as prefix queries, no?
> In the default token definitions in QueryParser.jj I see this:
> 
> | <PREFIXTERM:  <_TERM_START_CHAR> (<_TERM_CHAR>)* "*" >
> | <WILDTERM:  <_TERM_START_CHAR> 
>               (<_TERM_CHAR> | ( [ "*", "?" ] ))* >
> 
> Then further down in QueryParser.jj we have this:
> 
>        if (wildcard)
>          q = new WildcardQuery(new Term(field, term.image));
> 
> So a WildcardQuery is being constructed, not a PrefixQuery, I think.
> 
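For reference, a small sketch of the matching behaviour one would expect
from the two wildcard characters, assuming the usual meaning ("*" matches
zero or more characters, "?" matches exactly one). It translates the
pattern to java.util.regex and is not WildcardQuery or WildcardTermEnum;
the class and method names are made up for this example.

```java
import java.util.regex.Pattern;

class WildcardSketch {
    // Translate a wildcard pattern to a regex:
    // "*" becomes ".*", "?" becomes ".", everything else is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    // True when the whole term matches the wildcard pattern.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("rou?d", "round")); // true
        System.out.println(matches("rou?d", "roud"));  // false: "?" needs one character
        System.out.println(matches("rou*d", "roud"));  // true: "*" may match nothing
    }
}
```

With that meaning, 'rou?d' should find "round" if the whole string reaches
the wildcard matcher unchanged.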
> What I don't understand is why the definition of _TERM_START_CHAR looks
> like this:
> 
> | <#_TERM_START_CHAR: ~[ " ", "\t", "+", "-", "!", "(", ")", ":", "^",
>                          "[", "]", "\"", "{", "}", "~", "*" ] >
> 
> Maybe the name is misleading, but it seems like _TERM_START_CHAR are
> the characters that a TERM can start with, because later in
> QueryParser.jj we have TERM defined as:
> 
> | <TERM:      <_TERM_START_CHAR> (<_TERM_CHAR>)*  >
> 
> and _TERM_CHAR has this definition:
> 
> | <#_TERM_CHAR: <_TERM_START_CHAR> >
> 
> So how can we have a "*" in _TERM_START_CHAR when terms are not allowed
> to start with a "*", and if we do have "*", how come we do not have "?"
> as well?
> 
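One way to explore the question above: in JavaCC, ~[ ... ] is a negated
character list, so _TERM_START_CHAR matches any character that is NOT
listed. The sketch below rebuilds the quoted token definitions as
java.util.regex patterns and classifies a few inputs. It is a
simplification: it tests whole strings rather than tokenizing, and it
assumes TERM wins a tie against WILDTERM (JavaCC prefers the earlier
declaration on equal-length matches, and the excerpts do not show the
order). The class name and classify() are invented for the example.

```java
import java.util.regex.Pattern;

class TokenSketch {
    // Regex equivalent of ~[ " ", "\t", "+", "-", "!", "(", ")", ":", "^",
    // "[", "]", "\"", "{", "}", "~", "*" ]: any character EXCEPT those listed.
    static final String START = "[^ \\t+\\-!():^\\[\\]\"{}~*]";

    // _TERM_CHAR is defined as _TERM_START_CHAR, so the same class is reused.
    static final Pattern TERM = Pattern.compile(START + "(" + START + ")*");
    static final Pattern PREFIXTERM = Pattern.compile(START + "(" + START + ")*\\*");
    static final Pattern WILDTERM = Pattern.compile(START + "(" + START + "|[*?])*");

    // Assumed tie-break order: TERM, then PREFIXTERM, then WILDTERM.
    static String classify(String text) {
        if (TERM.matcher(text).matches()) return "TERM";
        if (PREFIXTERM.matcher(text).matches()) return "PREFIXTERM";
        if (WILDTERM.matcher(text).matches()) return "WILDTERM";
        return "NONE";
    }

    public static void main(String[] args) {
        System.out.println(classify("rou?d")); // TERM: "?" is not in the excluded list
        System.out.println(classify("rou*d")); // WILDTERM: "*" is excluded from TERM
        System.out.println(classify("rou*"));  // PREFIXTERM
        System.out.println(classify("*oud"));  // NONE: no token may start with "*"
    }
}
```

If the ordering assumption holds, 'rou?d' would be handed to the analyzer
as an ordinary TERM, and an analyzer that splits on "?" would produce the
two words "rou" and "d". That would be consistent with the "rou d" output
reported on lucene-user, though the excerpts alone do not confirm it.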
> Can somebody correct me wherever I have made false statements,
> assumptions, or conclusions?
> 
> Thanks,
> Otis
> 
> > > > From: Howk, Michael [mailto:MHowk@FSC.Follett.com]
> > > > 
> > > > Also, Lucene returns the parsed version of each of our searches. When we
> > > > search by rou*d, Lucene parses it as rou*d (which is what we would expect).
> > > > But when we search by rou?d, Lucene parses it as "rou d". It seems to wrap
> > > > the term in quotes and replace the question mark with a space. Any ideas? Or
> > > > can someone give us an idea of how to understand WildcardQuery or
> > > > WildcardTermEnum?
> > > 
> > > It sounds like the problem is in the query parser.  Brian?
> > > 
> > > Doug
> > > 
> > > --
> > > To unsubscribe, e-mail:  
> > <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > > For additional commands, e-mail:
> > <mailto:lucene-user-help@jakarta.apache.org>
> > > 
> > > 
> > 
> > -- 
> > ---------------------------
> > Stefan Bergstrand
> > Polopoly - Cultivating the information garden
> > Ph:   +46 8 506 782 67
> > Cell: +46 704 47 82 67
> > Fax:  +46 8 506 782 51
> > stefan.bergstrand@polopoly.com, http://www.polopoly.com
> > 
> > 
> 
> 
> 
> 



--
To unsubscribe, e-mail:   <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>

