lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tim Sturge <tstu...@hi5.com>
Subject Re: Issue upgrading from lucene 2.3.2 to 2.4 (moving from bitset to docidset)
Date Wed, 10 Dec 2008 20:21:52 GMT
Mike, Mike,

I have an implementation of FieldCacheTermsFilter (which uses field cache to
filter for a predefined set of terms) around if either of you are
interested. It is faster than materializing the filter roughly when the
filter matches more than 1% of the documents.

So it's not better for a large set of small filters (which you can
materialize on the spot) but it is better for a small set (but more than 32)
large filters. 

Let me know if you're interested and I'll send it in.

Tim

On 12/10/08 3:34 AM, "Michael McCandless" <lucene@mikemccandless.com> wrote:

> 
> In your approach, roughly how many filters do you have cached?  It
> seems like it could be quite a few (one for each color, one for each
> type, etc)?
> 
> You might be able to modify the new (on Lucene trunk)
> FieldCacheRangeFilter to achieve this same filtering without actually
> having to materialize the full bitset for each.
> 
> Mike
> 
> Michael Stoppelman wrote:
> 
>> Yeah looks similar to what we've implemented for ourselves (although I
>> haven't looked at the implementation). We've got quite a custom
>> version of
>> lucene at this point. Using Solr at this point really isn't a viable
>> option,
>> but thanks for pointing this out.
>> 
>> M
>> 
>> On Tue, Dec 9, 2008 at 1:47 AM, Michael McCandless <
>> lucene@mikemccandless.com> wrote:
>> 
>>> 
>>> This use case sounds alot like faceted navigation, which Solr
>>> provides.
>>> 
>>> Mike
>>> 
>>> 
>>> Michael Stoppelman wrote:
>>> 
>>> Hi all,
>>>> 
>>>> I'm working on upgrading to Lucene 2.4.0 from 2.3.2 and was trying
>>>> to
>>>> integrate the new DodIdSet changes since
>>>> o.a.l.search.Filter#bits() method
>>>> is now depreciated. For our app we actually heavily rely on bits
>>>> from the
>>>> Filter to do post-query filtering (I explain why below).
>>>> 
>>>> For example, if someone searches for product: "ipod" and then
>>>> filters a
>>>> type: "nano" (e.g. mini/nano/regular) AND color: "red" (e.g.
>>>> red/yellow/blue). In our current model the results are gathered in
>>>> the
>>>> following way:
>>>> 
>>>> 1) "ipod" w/o attributes is run and the results are stored in a
>>>> hitcollector
>>>> 2) "ipod" results are now filtered for color="red" AND type="mini"
>>>> using
>>>> the
>>>> lucene Filters
>>>> 3) The filtered results are returned to the user.
>>>> 
>>>> The reason that the attributes are filtered post-query is so that
>>>> we can
>>>> return the other types and colors the user can filter by in the
>>>> future.
>>>> Meaning the UI would be able to show "blue", "green", "pink",
>>>> etc... if we
>>>> pre-filtered results by color and type before hand we wouldn't
>>>> know what
>>>> the
>>>> other filter options would be there for a broader result set.
>>>> 
>>>> Does anyone else have this use case? I'd imagine other folks are
>>>> probably
>>>> doing similar things to accomplish this.
>>>> 
>>>> M
>>>> 
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>> 
>>> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message