From: Erik Hatcher
Subject: Re: Aggregating category hits
Date: Mon, 15 May 2006 18:45:01 -0400
To: java-user@lucene.apache.org
On May 15, 2006, at 5:07 PM, Marvin Humphrey wrote:

> If you needed to know not just the total number of hits, but the
> number of hits in each "category", how would you handle that?
>
> For instance, a search for "egg" would have to produce the 20 most
> relevant documents for "egg", but also a list like this:
>
>   Holiday & Seasonal / Easter   75
>   Books / Cooking               52
>   Miscellaneous                 44
>   Kitchen Collectibles          43
>   Hobbies / Crafts              17
>   [...]
>
> It seems to me that you'd have to retrieve each hit's stored fields
> and examine the contents of a "category" field. That's a lot of
> overhead. Is there another way?

My first implementation of faceted browsing uses BitSets that are pre-loaded for each category value (each unique term in a "category" field, for example). To intersect those with an actual Query, the Query is run through a QueryFilter to get its BitSet, which is then ANDed with each of the category BitSets. It sounds like a lot, but for my applications there are not tons of these BitSets and the performance has been outstanding.

Now that I'm doing more with Solr, I'm beginning to leverage its amazing caching infrastructure and replace BitSets with DocSets.

	Erik
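For anyone who wants to see the BitSet intersection spelled out, here is a minimal sketch using only java.util.BitSet. It is not the code from my application: the class name FacetCounts, the helpers count() and bits(), and the document ids are all made up for illustration. In a real index the query's BitSet would come from the QueryFilter (its bits(IndexReader) method, as far as I recall for Lucene of this vintage), and the category BitSets would be built once from the terms of the "category" field and cached.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Lucene or Solr.
class FacetCounts {

    /** For each category, counts the documents that match the query AND belong to the category. */
    static Map<String, Integer> count(BitSet queryBits, Map<String, BitSet> categoryBits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : categoryBits.entrySet()) {
            // Clone first: BitSet.and() mutates its receiver, and the cached sets must stay intact.
            BitSet both = (BitSet) queryBits.clone();
            both.and(entry.getValue());
            counts.put(entry.getKey(), both.cardinality());
        }
        return counts;
    }

    /** Builds a BitSet with the given (made-up) document ids set. */
    static BitSet bits(int... docIds) {
        BitSet set = new BitSet();
        for (int id : docIds) {
            set.set(id);
        }
        return set;
    }

    public static void main(String[] args) {
        // Stand-ins for the pre-loaded per-category BitSets.
        Map<String, BitSet> categories = new LinkedHashMap<>();
        categories.put("Holiday & Seasonal / Easter", bits(0, 1, 2, 5));
        categories.put("Books / Cooking", bits(3, 4, 5));
        categories.put("Miscellaneous", bits(6, 7));

        // Stand-in for the BitSet a QueryFilter would produce for the user's query.
        BitSet query = bits(1, 2, 3, 5, 7);

        // Prints {Holiday & Seasonal / Easter=3, Books / Cooking=2, Miscellaneous=1}
        System.out.println(count(query, categories));
    }
}
```

The cost per search is one clone-and-AND per category, which is why this holds up as long as the number of category BitSets stays modest. No stored fields are ever loaded.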