From: Tim Sturge
Date: Wed, 10 Dec 2008 12:21:52 -0800
Subject: Re: Issue upgrading from lucene 2.3.2 to 2.4 (moving from bitset to docidset)

Mike, Mike,

I have an implementation of FieldCacheTermsFilter (which uses the field
cache to filter for a predefined set of terms) around if either of you is
interested. It is faster than materializing the filter roughly when the
filter matches more than 1% of the documents.

So it's not better for a large set of small filters (which you can
materialize on the spot), but it is better for a small set (but more than
32) of large filters.

Let me know if you're interested and I'll send it in.

Tim

On 12/10/08 3:34 AM, "Michael McCandless" wrote:

>
> In your approach, roughly how many filters do you have cached? It
> seems like it could be quite a few (one for each color, one for each
> type, etc)?
>
> You might be able to modify the new (on Lucene trunk)
> FieldCacheRangeFilter to achieve this same filtering without actually
> having to materialize the full bitset for each.
>
> Mike
>
> Michael Stoppelman wrote:
>
>> Yeah looks similar to what we've implemented for ourselves (although I
>> haven't looked at the implementation). We've got quite a custom version
>> of lucene at this point. Using Solr at this point really isn't a viable
>> option, but thanks for pointing this out.
>>
>> M
>>
>> On Tue, Dec 9, 2008 at 1:47 AM, Michael McCandless <
>> lucene@mikemccandless.com> wrote:
>>
>>>
>>> This use case sounds a lot like faceted navigation, which Solr
>>> provides.
>>>
>>> Mike
>>>
>>>
>>> Michael Stoppelman wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'm working on upgrading to Lucene 2.4.0 from 2.3.2 and was trying to
>>>> integrate the new DocIdSet changes since the
>>>> o.a.l.search.Filter#bits() method is now deprecated. For our app we
>>>> actually rely heavily on bits from the Filter to do post-query
>>>> filtering (I explain why below).
>>>>
>>>> For example, say someone searches for product: "ipod" and then
>>>> filters by type: "nano" (e.g. mini/nano/regular) AND color: "red"
>>>> (e.g. red/yellow/blue). In our current model the results are gathered
>>>> in the following way:
>>>>
>>>> 1) "ipod" w/o attributes is run and the results are stored in a
>>>>    hitcollector
>>>> 2) "ipod" results are now filtered for color="red" AND type="nano"
>>>>    using the lucene Filters
>>>> 3) The filtered results are returned to the user.
>>>>
>>>> The reason that the attributes are filtered post-query is so that we
>>>> can return the other types and colors the user can filter by in the
>>>> future. Meaning the UI would be able to show "blue", "green", "pink",
>>>> etc. If we pre-filtered results by color and type beforehand, we
>>>> wouldn't know what the other filter options would be for a broader
>>>> result set.
>>>>
>>>> Does anyone else have this use case? I'd imagine other folks are
>>>> probably doing similar things to accomplish this.
>>>>
>>>> M
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
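
For readers skimming the thread: the idea Tim describes at the top (consulting the field cache for each document instead of materializing one bitset per filter) can be roughly sketched in plain Java. This is not his implementation and it uses no Lucene classes. The class name, the docToOrd array standing in for the field cache, and the sample data are all hypothetical, and the sketch assumes a single-valued field whose distinct terms are kept in sorted order.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical plain-Java sketch; none of these names are Lucene APIs.
final class FieldCacheTermsFilterSketch {
    // One entry per document: index of that document's term in sortedTerms.
    // This array stands in for the field cache and is shared by all filters.
    private final int[] docToOrd;
    // One bit per distinct term (not per document): set if the term is wanted.
    private final BitSet acceptedOrds;

    FieldCacheTermsFilterSketch(String[] sortedTerms, int[] docToOrd, Set<String> wanted) {
        this.docToOrd = docToOrd;
        this.acceptedOrds = new BitSet(sortedTerms.length);
        for (String term : wanted) {
            int ord = Arrays.binarySearch(sortedTerms, term);
            if (ord >= 0) {            // terms absent from the index are ignored
                acceptedOrds.set(ord);
            }
        }
    }

    // Per-document check: two array/bit lookups, no per-filter document bitset.
    boolean accept(int docId) {
        return acceptedOrds.get(docToOrd[docId]);
    }

    public static void main(String[] args) {
        String[] colors = {"blue", "red", "yellow"};  // sorted distinct values
        int[] docToOrd = {1, 0, 1, 2, 0};             // doc 0 is red, doc 1 is blue, ...
        Set<String> wanted = new HashSet<>(Arrays.asList("red", "yellow"));
        FieldCacheTermsFilterSketch filter =
                new FieldCacheTermsFilterSketch(colors, docToOrd, wanted);
        for (int doc = 0; doc < docToOrd.length; doc++) {
            System.out.println("doc " + doc + " accepted: " + filter.accept(doc));
        }
    }
}
```

The per-filter state here is proportional to the number of distinct terms rather than the number of documents, which is presumably why the approach suits many large filters; the 1% and 32 figures quoted above are Tim's measurements, not something this sketch demonstrates.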