lucene-java-user mailing list archives

From puffm...@darksleep.com (Steven J. Owens)
Subject Re: Question on the FAQ list with filters
Date Thu, 28 Mar 2002 03:17:51 GMT
On Wed, Mar 27, 2002 at 09:26:29PM -0500, I wrote:
> On Wed, Mar 27, 2002 at 03:52:21PM -0600, Armbrust, Daniel C. wrote:
> > From the FAQ:
> > 16. What is filtering and how is it performed ?
> [...]
> > Is my assumption that it is faster to provide a filter to the search()
> > method, than to do a selective collation correct?  
> 
>      "It Depends."  That's more or less the point of the FAQ answer,
> though it could be more clearly expressed.  The gist of the FAQ seems
> to be that you can either do the filtering BEFORE you do the search,
> or AFTER you do the search.
> 
>      Obviously the question is, which is more expensive, filtering out
> inappropriate documents, or searching for the possible hits?  If
> filtering is cheaper, you do the filtering first, then do the search.
> If filtering is expensive, you do the search first, then do the
> filtering.  You should also factor in which is more restrictive - will
> either the filter or the search drop out a large number of the
> documents?  If you can arrange it so one is both cheaper and drops out
> the majority of the documents, you win.

     I meant to add here that many applications that do searching
and filtering display the hits only a chunk at a time (the typical
web search interface).  This is another situation where it makes a
lot more sense to filter after the search, since you only have to
filter the small portion of the hits needed to fill each page of
results the user asks for.  On top of that, the user may well find
what they were looking for in the first page or two of results, in
which case the rest of the hits never need to be filtered at all.
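
     To make the paging argument concrete, here is a rough sketch in
plain Java.  It deliberately does not use any Lucene classes: the
class name PagedPostFilter, the page() helper, and the "even ids
only" filter are all made up for illustration, and the hits are just
a ranked list of integers standing in for search results.  It also
assumes a Java version with generics and lambdas.  The point is only
that the filter runs on as many hits as it takes to fill the
requested page, not on the whole result set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class PagedPostFilter {

    /**
     * Returns page number pageIndex (0-based) of the hits that pass the
     * filter. Walks the ranked hits in order and stops as soon as the
     * requested page is full, so later hits are never filtered.
     */
    public static <T> List<T> page(List<T> rankedHits, Predicate<T> filter,
                                   int pageIndex, int pageSize) {
        int toSkip = pageIndex * pageSize;   // accepted hits on earlier pages
        List<T> page = new ArrayList<>();
        for (T hit : rankedHits) {
            if (!filter.test(hit)) {
                continue;                    // filtered out, costs one check
            }
            if (toSkip > 0) {
                toSkip--;                    // belongs to an earlier page
                continue;
            }
            page.add(hit);
            if (page.size() == pageSize) {
                break;                       // page is full, stop filtering
            }
        }
        return page;
    }

    public static void main(String[] args) {
        // Stand-in for a ranked result set of 1000 hits.
        List<Integer> hits = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            hits.add(i);
        }

        // Hypothetical filter that also counts how often it is invoked.
        int[] checks = {0};
        Predicate<Integer> evenOnly = id -> {
            checks[0]++;
            return id % 2 == 0;
        };

        List<Integer> firstPage = page(hits, evenOnly, 0, 5);
        // prints "[2, 4, 6, 8, 10] after 10 filter checks"
        System.out.println(firstPage + " after " + checks[0] + " filter checks");
    }
}
```

     Note that this sketch re-filters the earlier pages each time a
later page is requested; if the filter is expensive and users page
deep, you would want to remember where the previous page ended.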

Steven J. Owens
puff@darksleep.com
