lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Cedric Ho" <cedric...@gmail.com>
Subject Re: WildcardQuery and SpanQuery
Date Wed, 18 Jul 2007 10:30:13 GMT
Thanks for the quick response Paul =)

However I am lost while looking at the surround package. Are you
suggesting I can solve my problem at hand using the surround package?


On 7/18/07, Paul Elschot <paul.elschot@xs4all.nl> wrote:
> On Wednesday 18 July 2007 05:58, Cedric Ho wrote:
> > Hi everybody,
> >
> > We recently need to support wildcard search terms "*", "?" together
> > with SpanQuery. It seems that there's no SpanWildcardQuery available.
> > After looking into the lucene source code for a while, I guess we can
> > either:
> >
> > 1. Use SpanRegexQuery, or
> >
> > 2. Write our own SpanWildcardQuery, and implements the
> > rewrite(IndexReader) method to rewrite the query into a SpanOrQuery
> > with some SpanTermQuery.
> >
> > Of the two approaches, Option 1 seems to be easier. But I am rather
> > concerned about the performance of using regular expression. On the
> > other hand, I am not sure if there are any other concerns I am not
> > aware of for option 2 (i.e. is there a reason why there's no
> > SpanWildcardQuery in the first place?)
> >
> > Any advices ?
>
> The basic problem you are facing is that in Lucene
> the expansion of the terms is tightly coupled to the generation
> of a combination query using the expanded terms.
>
> In contrib/surround the term expansion and query generation
> are decoupled using a visitor pattern for the terms. The code is here:
> http://svn.apache.org/viewvc/lucene/java/trunk/contrib/surround/src/java/org/apache/lucene/queryParser/surround/query
>
> In surround a wild card term can provide either an OR of
> normal term queries, or a SpanOrQuery of span term queries.
> This query generation is in class SimpleTerm, which has one method
> for a normal boolean OR query over the terms, and one for
> a span query for the terms.
>
> In both cases surround uses a regular expression to expand
> the matching terms, but that could be changed to use
> another wildcard expansion mechanisms than the ones in
> SrndPrefixQuery and SrndTruncQuery, which
> are subclasses of SimpleTerm.
>
> With the term expansion and the query combination split,
> it is also necessary to limit the maximum number of expanded
> terms in another way than Lucene does. In surround the
> classes BasicQueryFactory and TooManyBasicQueries are
> used for that.
>
> Regards,
> Paul Elschot
>
>
>
> >
> > Cedric
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message