From: Michael McCandless
Date: Fri, 25 Jan 2013 08:37:54 -0500
Subject: Re: Faceted search in OR
To: java-user@lucene.apache.org

I think that was supposed to be A/1 and A/3 in the last sentence below?

But, anyway, I think the question (and it's a good one!) is how, after
having drilled down on one of these, e.g. A/1, would you then still show
the counts for the other A/N categories? I.e., the counts would show how
many hits the user would see if they changed the A/1 drill-down to A/N
instead.

I call this "drill sideways"...

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jan 25, 2013 at 7:29 AM, Shai Erera wrote:

> Ooops, I just realized that at some point java-user was removed from the CC
> :).
> Fixing that.
>
> Shai
>
>
> On Fri, Jan 25, 2013 at 2:27 PM, Shai Erera wrote:
>
>> Hi Nicola,
>>
>> Indeed, if it's a URL with parameters, it's not a UI trick :). I think
>> that you can do what you want with the package, but before I explain what I
>> think you should do, I'd like to use a concrete example, to better
>> understand:
>>
>> Suppose that you have facets A/1, A/2 ... A/6 associated with documents. A
>> document is associated with exactly one "A" facet, but the same facet may
>> be associated with many documents.
>> You query for X and it matches some documents that are collectively
>> associated with facets A/1, A/2, A/3 and A/4. So A/5 and A/6 are associated
>> with documents that do not match your query.
>> However, your FacetRequest sets its numResults (what we call top-K) to 2,
>> so you only get back A/1 and A/3, since they have the highest counts.
>>
>> So what we have now are:
>> * Facets A/1, A/3 are returned to the user, since they belong to the
>> result set and have the highest counts.
>> * Facets A/2, A/4 are not returned to the user: they belong to the
>> result set but did not make it to the top-K.
>> * Facets A/5, A/6 are not returned because they don't belong to the result
>> set at all.
>>
>> If this makes sense to you, and is similar to the scenario that you have,
>> which of these facets would you like to show in addition to A/1 and A/2?
>>
>> Shai
>>
>>
>> On Fri, Jan 25, 2013 at 11:39 AM, Nicola Buso wrote:
>>
>>> Hi Shai,
>>>
>>> thanks again, you are helping me a lot in introducing faceted search.
>>>
>>> I'm not sure it's a UI trick. Suppose you have a URL with query params
>>> that lead you to:
>>> - the electronic department
>>> - query on "hi-fi"
>>> - brand facet selection on "A"
>>>
>>> Which trick should the UI use? As a trick I could imagine:
>>> - don't filter on the facet with Lucene but do it in the UI (then it is
>>> tricky to do the facet counting without Lucene)
>>> - execute two queries, one filtered and one not; pick the selected facets
>>> from the filtered query and the others from the unfiltered one
>>> (filtered = filtered by facet selection; we can argue here)
>>>
>>> Note also that I have some services that should return the results
>>> together with the facets if needed.
>>>
>>>
>>>
>>> Nicola.
>>>
>>> On Thu, 2013-01-24 at 22:47 +0200, Shai Erera wrote:
>>> > That sounds more like a UI trick to me. When I do that, I don't
>>> > modify the brand facet (in the UI). I.e., continue to display it, with
>>> > the original counts and if the user now wants to filter by A + D, then
>>> > your UI somehow allows that (maybe checkboxes).
>>> > Or if the user wants
>>> > to quickly switch from brand A to D, he can do so with a single click,
>>> > without running the original query again.
>>> >
>>> >
>>> > Shai
>>> >
>>> >
>>> > On Thu, Jan 24, 2013 at 10:28 PM, Nicola Buso wrote:
>>> >         Hi Shai,
>>> >
>>> >         the use case is simple. Suppose you want to buy a hi-fi in an
>>> >         online shop. You go to the Electronics department of the
>>> >         website and type "hi-fi" in the search box; the interface
>>> >         returns lots of results and a facet on brands (10 brand
>>> >         values).
>>> >         You select brand A and the results are filtered accordingly.
>>> >         Suppose you now want to add brand D to the results: you can't,
>>> >         because the results filtered by A don't contain the value D
>>> >         for the brand facet.
>>> >
>>> >         Then how can I also retrieve the facets for the unfiltered
>>> >         results?
>>> >         I think it's a common use case when you let the user filter
>>> >         in OR by facets.
>>> >
>>> >
>>> >         Nicola.
>>> >
>>> >         On Thu, 2013-01-24 at 19:36 +0200, Shai Erera wrote:
>>> > > Hi Nicola,
>>> > >
>>> > >
>>> > > Regarding the OR drill-down, yes you can construct your own
>>> > > BooleanQuery, passing Occur.SHOULD instead of MUST. Currently
>>> > > DrillDown does not help you do that, so you can copy the code from
>>> > > DrillDown.query and change MUST to SHOULD. I opened LUCENE-4716 to
>>> > > add this support to DrillDown.
>>> > >
>>> > >
>>> > > Not sure that I understand your second question. If you want to
>>> > > retrieve counts for all descendants of A, then set your
>>> > > FR.setNumResults to Integer.MAX_VALUE. But note, it's going to be
>>> > > costly, i.e. you'd get a FacetResultNode per child of A, so
>>> > > depending on how "wide" A is, this may have some impact on RAM
>>> > > consumption.
>>> > >
>>> > > If that's not what you meant, could you please clarify?
>>> > >
>>> > >
>>> > > Shai
>>> > >
>>> > >
>>> > > On Thu, Jan 24, 2013 at 7:22 PM, Nicola Buso wrote:
>>> > >         Hi all,
>>> > >
>>> > >         I'm introducing Lucene faceted search in our project and I
>>> > >         need some hints to achieve some functionality:
>>> > >         - I want facet filtering in OR; how do I do that?
>>> > >         - Obtain facets for the filtered results but also for the
>>> > >         unfiltered ones. I.e., I have facet A with values A/V1,
>>> > >         A/V2, A/V3, and these values are mutually exclusive, so a
>>> > >         document whose field has value V1 can't also have value V2,
>>> > >         and so on. I would like to let the user select several of
>>> > >         these facet values in OR; how can I accumulate all the facet
>>> > >         values while also filtering by facet selection? Should it
>>> > >         work in a way similar to ComplementCountingAggregator?
>>> > >         - Can I use the DrillDown class to obtain the OR facet
>>> > >         filtering, or do I have to write a similar class using a
>>> > >         BooleanQuery in OR? It's not clear to me from this comment
>>> > >         in the API:
>>> > >         Wraps a given Query as a drill-down query over the given
>>> > >         categories, assuming all are required (e.g. AND). You can
>>> > >         construct a query with different modes (such as OR or AND of
>>> > >         ORs) by creating a BooleanQuery and call this method several
>>> > >         times. Make sure to wrap the query in that case by
>>> > >         ConstantScoreQuery and set the boost to 0.0f, so that it
>>> > >         doesn't affect scoring.
>>> > >
>>> > >
>>> > >         Do you have any examples doing this?
>>> > >
>>> > >         Regards
>>> > >
>>> > >         Nicola.
>>> > >
>>> > >         ---------------------------------------------------------------------
>>> > >         To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> > >         For additional commands, e-mail: java-user-help@lucene.apache.org
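
The "drill sideways" counting discussed in this thread can be sketched in plain Java, without any Lucene dependency. This is only an illustration under stated assumptions: the class DrillSidewaysSketch, its nested Doc class, the methods baseHits, drillDownOr and brandCounts, and the sample data are all hypothetical names made up for the example. They are not part of the Lucene facet API, and the substring match merely stands in for a real query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class DrillSidewaysSketch {

    // Hypothetical stand-in for an indexed document:
    // some text plus exactly one brand facet value.
    static final class Doc {
        final String text;
        final String brand;

        Doc(String text, String brand) {
            this.text = text;
            this.brand = brand;
        }
    }

    // Base query: a naive substring match standing in for the user's text query.
    static List<Doc> baseHits(List<Doc> index, String query) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : index) {
            if (d.text.contains(query)) {
                hits.add(d);
            }
        }
        return hits;
    }

    // OR drill-down: keep the hits whose brand is any one of the selected values.
    // An empty selection means "no filter".
    static List<Doc> drillDownOr(List<Doc> hits, Set<String> selectedBrands) {
        if (selectedBrands.isEmpty()) {
            return hits;
        }
        List<Doc> filtered = new ArrayList<>();
        for (Doc d : hits) {
            if (selectedBrands.contains(d.brand)) {
                filtered.add(d);
            }
        }
        return filtered;
    }

    // Facet counting: number of documents per brand, sorted by brand name.
    static Map<String, Integer> brandCounts(List<Doc> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Doc d : docs) {
            counts.merge(d.brand, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Doc> index = Arrays.asList(
            new Doc("hi-fi amplifier", "A"),
            new Doc("hi-fi speakers", "A"),
            new Doc("hi-fi turntable", "B"),
            new Doc("hi-fi receiver", "D"),
            new Doc("television", "D"));
        Set<String> selected = new HashSet<>(Arrays.asList("A"));

        List<Doc> base = baseHits(index, "hi-fi");
        List<Doc> shown = drillDownOr(base, selected);

        // Counts over the filtered hits only know about brand A.
        System.out.println("drill-down counts: " + brandCounts(shown));
        // Sideways counts come from the base hits, ignoring the brand selection,
        // so brands B and D stay visible and selectable.
        System.out.println("sideways counts:   " + brandCounts(base));
    }
}
```

This corresponds to the "two queries" idea from earlier in the thread: the result list comes from the filtered hits, while the brand facet is counted over the hits of the base query. With more than one facet dimension, the sideways counts for a dimension would be taken over the hits that satisfy the base query and the selections on every other dimension, leaving out only that dimension's own selection.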