lucenenet-user mailing list archives

From Omri Suissa <omri.sui...@diffdoof.com>
Subject Re: Lucene results filtering best practices
Date Sun, 25 Nov 2012 15:07:09 GMT
Hi,
Thanks.
Can you tell me why TermsFilter is better than filtering with a collector?

Omri


On Sun, Nov 25, 2012 at 10:54 AM, Simon Svensson <sisve@devhost.se> wrote:

>  Hi,
>
> Use a TermsFilter<http://lucene.apache.org/core/old_versioned_docs/versions/3_0_3/api/all/org/apache/lucene/search/TermsFilter.htm>.
>
>
> Constructs a filter for docs matching any of the terms added to this
> class. Unlike a RangeFilter this can be used for filtering on multiple
> terms that are not necessarily in a sequence. An example might be a
> collection of primary keys from a database query result or perhaps a choice
> of "category" labels picked by the end user. As a filter, this is much
> faster than the equivalent query (a BooleanQuery with many "should"
> TermQueries)
>
> Depending on the number of users, queries and magic domain information
> only known to you, check out the CachingWrapperFilter<http://lucene.apache.org/core/old_versioned_docs/versions/3_0_3/api/all/org/apache/lucene/search/TermsFilter.htm>
>
> Wraps another filter's result and caches it. The purpose is to allow
> filters to simply filter, and then wrap with this class to add caching.
>
> // Simon
>
> On 2012-11-25 09:32, Omri Suissa wrote:
>
>   Hi all,
>
> All the docs in my index have a field named "groupId" to enable filtering
> the search results by the user's groups. Each user has several groups
> (around 20-100 on average).
>
> Now I have 2 implementation options:
>
> 1)      Add to the query 20-100 terms (with OR) of each user group (for
> example: "content:cat AND (groupId:4 OR groupId:58 OR groupId:94 … OR
> groupId:N)")
>
> 2)      Search only the user's query and create a collector (I already have
> one) that filters the results before scoring (get the groupId of each doc,
> and score and add it only if that groupId exists in the user's group list).
>
> Regardless of the time and effort of the implementation, which is better
> (and why)?
>
>
>
> Thanks,
>
> Omri
>
>
>
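
To make the comparison in this thread concrete, here is a rough, library-free sketch of the two approaches. It is not Lucene or Lucene.NET code: ToyIndex, ToyFilterDemo and all their methods are hypothetical names, and java.util.BitSet merely stands in for the doc-id bit set a filter produces. Assuming a filter works roughly this way, its bit set depends only on the user's group list, so it can be built once and reused across queries (the caching idea mentioned above), whereas the collector repeats a per-document lookup on every search.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: NOT the Lucene / Lucene.NET API. All names are hypothetical.
class ToyIndex {
    // groupId of each document, indexed by doc id (what a collector looks up per hit).
    private final int[] groupIdByDoc;
    // Inverted index for the groupId field: groupId -> docs carrying that value.
    private final Map<Integer, BitSet> postings = new HashMap<>();

    ToyIndex(int[] groupIdByDoc) {
        this.groupIdByDoc = groupIdByDoc.clone();
        for (int doc = 0; doc < groupIdByDoc.length; doc++) {
            postings.computeIfAbsent(groupIdByDoc[doc], g -> new BitSet()).set(doc);
        }
    }

    // Filter style: OR the postings of every allowed group into one bit set.
    // The result depends only on the group list, so it can be cached and reused.
    BitSet buildGroupFilter(Set<Integer> allowedGroups) {
        BitSet allowed = new BitSet(groupIdByDoc.length);
        for (int g : allowedGroups) {
            BitSet docs = postings.get(g);
            if (docs != null) {
                allowed.or(docs);
            }
        }
        return allowed;
    }

    // Filter style, search time: intersect the query hits with the prebuilt bit set.
    BitSet searchWithFilter(BitSet queryHits, BitSet groupFilter) {
        BitSet result = (BitSet) queryHits.clone();
        result.and(groupFilter);
        return result;
    }

    // Option 2 from the thread: check each hit's groupId inside the collector.
    BitSet searchWithCollector(BitSet queryHits, Set<Integer> allowedGroups) {
        BitSet result = new BitSet(groupIdByDoc.length);
        for (int doc = queryHits.nextSetBit(0); doc >= 0; doc = queryHits.nextSetBit(doc + 1)) {
            if (allowedGroups.contains(groupIdByDoc[doc])) {
                result.set(doc);
            }
        }
        return result;
    }
}

class ToyFilterDemo {
    public static void main(String[] args) {
        // Six docs with groupIds 4, 58, 7, 94, 4, 12 (doc ids 0..5).
        ToyIndex index = new ToyIndex(new int[] {4, 58, 7, 94, 4, 12});
        Set<Integer> userGroups = Set.of(4, 58, 94);

        // Pretend the user's query ("content:cat") matched docs 0, 2, 3 and 5.
        BitSet hits = new BitSet();
        hits.set(0);
        hits.set(2);
        hits.set(3);
        hits.set(5);

        BitSet filter = index.buildGroupFilter(userGroups); // build once, reuse per user
        System.out.println(index.searchWithFilter(hits, filter));        // prints {0, 3}
        System.out.println(index.searchWithCollector(hits, userGroups)); // prints {0, 3}
    }
}
```

Both paths return the same documents; the difference is where the work happens. In a real index the trade-off also depends on how the collector obtains each document's groupId and on how often a user's group list changes, so measure before committing to either.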
