lucene-dev mailing list archives

From Joe R <vinnyj...@yahoo.com>
Subject Re: Query.combine()
Date Tue, 30 May 2006 22:37:49 GMT

Oh, yeah -- wildcard and range queries.  That's an even better reason for this
method than the one I guessed (and the hunt for an optimization that nobody has
thought of continues on...)  If I understand correctly, I can still do the
combine() once on the JMS MultiSearcher if the query doesn't contain any
wildcard or range terms.
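
To make the distinction concrete, here is a toy model in plain Java. It is not
Lucene code: the class RewriteCombineSketch and the helpers expandPrefix() and
combine() are hypothetical names, and each subindex is assumed to be nothing
more than a sorted set of terms. It only illustrates why a prefix such as a*
has to be expanded against every subindex before the results are merged,
whereas a plain term would need no expansion at all.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class RewriteCombineSketch {

    // Stand-in for rewrite() on one subindex: collect the terms that
    // actually exist in that index and start with the given prefix.
    static SortedSet<String> expandPrefix(SortedSet<String> indexTerms, String prefix) {
        SortedSet<String> matches = new TreeSet<>();
        // In sorted order, all terms sharing a prefix are contiguous,
        // starting at the first term >= prefix.
        for (String term : indexTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    // Stand-in for combine(): the union of the per-index expansions, i.e.
    // one disjunction that is valid against any of the subindexes.
    static SortedSet<String> combine(List<SortedSet<String>> perIndexExpansions) {
        SortedSet<String> union = new TreeSet<>();
        for (SortedSet<String> expansion : perIndexExpansions) {
            union.addAll(expansion);
        }
        return union;
    }

    public static void main(String[] args) {
        SortedSet<String> indexA = new TreeSet<>(Arrays.asList("apple", "avocado", "banana"));
        SortedSet<String> indexB = new TreeSet<>(Arrays.asList("apple", "apricot", "cherry"));

        // Each expansion needs its own index, hence one hop per subindex.
        SortedSet<String> fromA = expandPrefix(indexA, "a");
        SortedSet<String> fromB = expandPrefix(indexB, "a");

        System.out.println(combine(Arrays.asList(fromA, fromB)));
        // prints [apple, apricot, avocado]
    }
}
```

Neither index alone yields the full set, which is why the expansion cannot be
done once on the central side unless every subindex holds exactly the same
terms.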

I, too, was wondering if caching df's on the MultiSearcher was an option. 
Thanks for confirming that it's worth a follow-up.
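
A minimal sketch of what such a cache might look like, again in plain Java
rather than against the Lucene API. Everything here is an assumption made for
illustration: the SubSearcher interface stands in for a remote subordinate
Searcher, DocFreqCacheSketch and its remoteCalls counter are hypothetical, and
the sketch is not thread-safe. Nobody in this thread has benchmarked the idea,
so it only shows the mechanics, not the benefit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DocFreqCacheSketch {

    // Hypothetical stand-in for a remote subordinate Searcher; every call
    // to docFreq() is assumed to cost one network hop.
    interface SubSearcher {
        int docFreq(String term);
    }

    private final List<SubSearcher> subSearchers;
    private final Map<String, Integer> summedDocFreqs = new HashMap<>();
    int remoteCalls = 0; // counts simulated hops, for illustration only

    DocFreqCacheSketch(List<SubSearcher> subSearchers) {
        this.subSearchers = subSearchers;
    }

    // Returns the df summed over all subindexes, asking the subordinate
    // searchers only the first time a term is seen.
    int docFreq(String term) {
        Integer cached = summedDocFreqs.get(term);
        if (cached != null) {
            return cached;
        }
        int sum = 0;
        for (SubSearcher searcher : subSearchers) {
            sum += searcher.docFreq(term);
            remoteCalls++;
        }
        summedDocFreqs.put(term, sum);
        return sum;
    }

    // Assumption: the cache must be dropped whenever a subindex changes,
    // otherwise the summed df's go stale.
    void invalidate() {
        summedDocFreqs.clear();
    }

    public static void main(String[] args) {
        List<SubSearcher> subs = new ArrayList<>();
        subs.add(term -> term.equals("lucene") ? 3 : 0);
        subs.add(term -> term.equals("lucene") ? 5 : 1);

        DocFreqCacheSketch central = new DocFreqCacheSketch(subs);
        System.out.println(central.docFreq("lucene")); // prints 8
        System.out.println(central.docFreq("lucene")); // prints 8, served from the cache
        System.out.println(central.remoteCalls);       // prints 2
    }
}
```

The open design question is invalidation: how the central side learns that a
subordinate index has changed is left out of this sketch entirely.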


-j


--- Chuck Williams <chuck@manawiz.com> wrote:

> If I understand what you are saying, unfortunately it will not work. 
> The issue is that rewrite() needs to access the index.  E.g., a* or [a
> TO d] rewrite to disjunctions of all terms that exist in the index that
> match, respectively, the prefix or range.  To determine this set of
> terms it is necessary to access each index.  rewrite() does the
> expansion for the separate subindexes, and then combine() builds the
> larger disjunction that captures all terms for a single query that will
> work against any index.  Unless all your subindexes have precisely the
> same terms, I don't see how you could avoid the hops to access each
> subindex for rewrite().
> 
> There is another place I would suggest looking, though. 
> ParallelMultiSearcher needs to access the df's for each term and sum
> them in order to compute the Weight for the combined query.  I don't
> believe anybody has benchmarked this, but I've always believed that
> caching the df's on the central server would be a significant benefit.
> 
> Chuck
> 
> 
> Joe R wrote on 05/30/2006 11:32 AM:
> > Hello,
> >
> > I'm trying to write a MultiSearcher/ParallelMultiSearcher variation that
> > uses JMS to talk to its subordinate Searchers.  While running through
> > MultiSearcher to see where I can save some cycles or network hops, I came
> > across Query.combine().  It's called from MultiSearcher.rewrite() (as you
> > know) but seems to be there only to allow for different Searcher
> > implementations in MultiSearcher's subordinate Searchers.  So, am I correct
> > in assuming that, if I use the same Searcher to query every subordinate
> > index, I can save myself a network hop by rewriting/combine()ing the Query
> > once, in the JMS MultiSearcher?
> >
> > Every other time I thought I'd found an optimization it turned out to be
> > written the way it was for a reason.  I'm wondering if that's going to be
> > the case here, too -- hence the question.
> >
> > Thanks for the help.
> >
> >
> > -joe
> >
> >
> > __________________________________________________
> > Do You Yahoo!?
> > Tired of spam?  Yahoo! Mail has the best spam protection around 
> > http://mail.yahoo.com 
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> >   
> 
> 
> 

