lucenenet-user mailing list archives

From Marco Dissel <marco.dis...@gmail.com>
Subject Re: [Lucene.Net] Faceted Search in Lucene.NET without using the Bits-method
Date Thu, 26 May 2011 21:49:57 GMT
You're right. Is this the most efficient way of calculating multiple
facets?

I get these timings in my test:
seconds - number of facet fields
0,2760158 - 1
0,665038 - 2
1,7320991 - 3
6,0633468 - 4

on a ~300 MB index

Ideally I want to show a user interface with many more facets, but with
these numbers that's not possible.
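
To make the goal concrete, here is a minimal sketch of per-field facet counting. It is not how SimpleFacetedSearch is implemented and it uses no Lucene.Net API: the in-memory documents and the names FacetCountSketch, SampleDocs and CountFacets are hypothetical, and it assumes the field values of the matching documents are already available in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FacetCountSketch
{
    // Hypothetical in-memory stand-in for the documents matched by a query
    // (the f1/f2/f3 sample table quoted further down in this thread).
    public static List<Dictionary<string, string>> SampleDocs()
    {
        var rows = new[] { "A I 1", "A I 2", "A I 3", "A J 1", "A J 2", "A J 3", "B I 1" };
        return rows.Select(r => r.Split(' '))
                   .Select(v => new Dictionary<string, string>
                   {
                       { "f1", v[0] }, { "f2", v[1] }, { "f3", v[2] }
                   })
                   .ToList();
    }

    // Tallies hits per term for each facet field independently,
    // in a single pass over the matching documents.
    public static Dictionary<string, Dictionary<string, int>> CountFacets(
        IEnumerable<Dictionary<string, string>> hits, IEnumerable<string> facetFields)
    {
        var counts = facetFields.ToDictionary(f => f, f => new Dictionary<string, int>());
        foreach (var doc in hits)
        {
            foreach (var entry in counts)
            {
                string term;
                if (!doc.TryGetValue(entry.Key, out term)) continue; // doc lacks this field
                int n;
                entry.Value.TryGetValue(term, out n); // n stays 0 for a new term
                entry.Value[term] = n + 1;
            }
        }
        return counts;
    }

    public static void Main()
    {
        var counts = CountFacets(SampleDocs(), new[] { "f1", "f2", "f3" });
        foreach (var field in counts)
            foreach (var term in field.Value)
                Console.WriteLine(field.Key + " - " + term.Key + " - " + term.Value);
    }
}
```

In this sketch each field is tallied separately, so the work grows with the number of hits times the number of facet fields, not with the number of value combinations across fields.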

On Thu, May 26, 2011 at 8:10 AM, digy digy <digydigy@gmail.com> wrote:

> I still don't understand, since these numbers can easily be obtained from
> the current implementation.
> Just count the HitsPerGroup.Name[i] entries.
>
> DIGY
>
> On Thu, May 26, 2011 at 12:21 AM, Marco Dissel <marco.dissel@gmail.com> wrote:
>
> > Let's take the sample data from http://goo.gl/UQubj
> >
> >          f1     f2     f3
> >          --     --     --
> > doc1      A      I      1
> > doc2      A      I      2
> > doc3      A      I      3
> > doc4      A      J      1
> > doc5      A      J      2
> > doc6      A      J      3
> > doc7      B      I      1
> >
> > SimpleFacetedSearch sfs = new SimpleFacetedSearch(_Reader, new string[] {"f1", "f2", "f3"});
> > SimpleFacetedSearch.Hits hits = sfs.Search(query);
> >
> > foreach (Facet facet in hits.Facets) { // groupByFields from constructor
> >   foreach (FacetTerm facetTerm in facet.FacetTerms) {
> >     Console.WriteLine(facet.Field + " - " + facetTerm.Name + " - " + facetTerm.HitCount);
> >   }
> > }
> >
> > results in:
> > f1 - A - 6
> > f1 - B - 1
> > f2 - I - 4
> > f2 - J - 3
> > f3 - 1 - 3
> > f3 - 2 - 2
> > f3 - 3 - 2
> >
> > >> A *simple* example?
> >
> > I didn't say it's easy, quite the opposite. Again, thanks for your time.
> >
> > On Wed, May 25, 2011 at 10:55 PM, Digy <digydigy@gmail.com> wrote:
> >
> > >
> > > Sorry Marco,
> > > I can't see your point. A *simple* example?
> > >
> > > DIGY
> > >
> > > -----Original Message-----
> > > From: Marco Dissel [mailto:marco.dissel@gmail.com]
> > > Sent: Wednesday, May 25, 2011 11:45 PM
> > > To: lucene-net-user@lucene.apache.org
> > > > Subject: Re: [Lucene.Net] Faceted Search in Lucene.NET without using the Bits-method
> > >
> > > > Thanks digy!
> > > >
> > > > The current sample groups by the unique combinations of values across
> > > > multiple facet fields, but what if you want to create a facet interface
> > > > like the one on this amazon.com page http://goo.gl/tBhuj ? There, each
> > > > hit count is based on a single-field query (clicking "Boxed Set" shows
> > > > 1,148 hits).
> > >
> > > > Edition
> > > >
> > > >   - Boxed Set (1,148)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_format_browse-bi_0?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_format_browse-bin%3A388395011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=390118011>
> > > >   - Full Screen (875)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_format_browse-bi_1?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_format_browse-bin%3A390120011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=390118011>
> > > >   - Widescreen (606)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_format_browse-bi_2?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_format_browse-bin%3A390121011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=390118011>
> > > >
> > > > Region
> > > >
> > > >   - US & CA DVDs: Region 1 (1,873)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_feature_two_brow_0?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_feature_two_browse-bin%3A405391011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=405390011>
> > > >   - DVDs Playable Outside the US (1)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_feature_two_brow_1?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_feature_two_browse-bin%3A405393011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=405390011>
> > > >   - DVDs Playable in any Region (62)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_feature_two_brow_2?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_feature_two_browse-bin%3A405392011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=405390011>
> > >
> > >
> > > Thanks
> > >
> > > On Wed, May 25, 2011 at 12:08 AM, Digy <digydigy@gmail.com> wrote:
> > >
> > > > > In case you don't follow the dev list:
> > > > https://issues.apache.org/jira/browse/LUCENENET-415
> > > > SimleFacetedSearch2.cs + TestSimleFacetedSearch2.cs
> > > >
> > > > DIGY
> > > >
> > > > -----Original Message-----
> > > > From: Floyd Wu [mailto:floyd.wu@gmail.com]
> > > > Sent: Tuesday, May 24, 2011 6:40 AM
> > > > To: lucene-net-user@lucene.apache.org
> > > > > Subject: Re: [Lucene.Net] Faceted Search in Lucene.NET without using the Bits-method
> > > >
> > > > > Hi DIGY,
> > > > >
> > > > > How can I use your sample to do faceted search the way Solr provides it?
> > > > >
> > > > > For example:
> > > > > I would like to facet on 3 fields: "category" / "author" / "folder".
> > > > > Your example only facets on the field "cat". Should I run the same
> > > > > procedure three times to meet my requirement, or do you have a better way?
> > > >
> > > > Many thanks
> > > >
> > > > Floyd
> > > >
> > > >
> > > > 2011/5/20 digy digy <digydigy@gmail.com>
> > > >
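> > > > > > List<string> categories = new List<string>();
> > > > > >
> > > > > > // Terms(t) positions the enumerator at the first term >= t
> > > > > > TermEnum te = reader.Terms(new Term("cat", ""));
> > > > > > try
> > > > > > {
> > > > > >     do
> > > > > >     {
> > > > > >         Term t = te.Term();
> > > > > >         // stop when the terms run out or we leave the "cat" field
> > > > > >         if (t == null || t.Field() != "cat") break;
> > > > > >         categories.Add(t.Text());
> > > > > >     } while (te.Next());
> > > > > > }
> > > > > > finally
> > > > > > {
> > > > > >     te.Close();
> > > > > > }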
> > > > >
> > > > > DIGY
> > > > >
> > > > > > On Fri, May 20, 2011 at 2:28 PM, Marco Dissel <marco.dissel@gmail.com> wrote:
> > > > >
> > > > > > > Thanks digy! Can you extend the sample to show how you would get the
> > > > > > > results for a facet user interface, including the number of hits, when
> > > > > > > you don't know the facet terms in advance?
> > > > > > >
> > > > > > > Pseudocode:
> > > > > > >
> > > > > > > Query query = new QueryParser("text", new StandardAnalyzer()).Parse(searchText);
> > > > > > >
> > > > > > > // only return terms with at least numberOfMinimalHits hits
> > > > > > > IDictionary<string,long> categoryFacets = GetFacets("category", query, numberOfMinimalHits);
> > > > > > >
> > > > > > > IDictionary<string,long> anotherFieldFacets = GetFacets("anotherfacetfield", query, numberOfMinimalHits);
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > >
> > > > > > > On Fri, May 20, 2011 at 1:01 PM, digy digy <digydigy@gmail.com> wrote:
> > > > > >
> > > > > > > > I prepared a sample. Maybe this can help:
> > > > > > >
> > > > > > > http://people.apache.org/~digy/FacetedSearch.cs
> > > > > > >
> > > > > > > DIGY
> > > > > > >
> > > > > > >
> > > > > > > > On Fri, May 20, 2011 at 10:43 AM, K a r n a v <karunakerreddyv@gmail.com> wrote:
> > > > > > >
> > > > > > > > > I've tried OpenBitSetDISI, but it does not perform well. I was able
> > > > > > > > > to generate results fast enough with BitArray, though. Check the
> > > > > > > > > demo app:
> > > > > > > > >
> > > > > > > > > http://demo.wisestepp.com/jobs/JobSearchResults.aspx?am9ibmFtZSxqYXZh-yCfmbUTwmeY%3d
> > > > > > > > >
> > > > > > > > > If you have an example or sample code that works with
> > > > > > > > > OpenBitSetDISI, please forward it to me.
> > > > > > > >
> > > > > > > > Thank you in advance.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Karunaker Reddy V
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Fri, May 20, 2011 at 1:05 PM, digy digy <digydigy@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Take a look at OpenBitSetDISI in Lucene.Net.Util
> > > > > > > > >
> > > > > > > > > DIGY
> > > > > > > > >
> > > > > > > > > > On Fri, May 20, 2011 at 10:20 AM, lars aslin <larsaaslin@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > > I'm building a search function with Lucene.NET in which I want to
> > > > > > > > > > > support faceted searches. To do this efficiently, I thought a good
> > > > > > > > > > > idea would be to have a filter for each category and then AND each
> > > > > > > > > > > category filter's BitArray with the BitArray for the search result,
> > > > > > > > > > > thus getting the search result grouped by category with just the
> > > > > > > > > > > AND operation. Here is a blog post that describes it better:
> > > > > > > > > > >
> > > > > > > > > > > http://www.devatwork.nl/articles/lucenenet/faceted-search-and-drill-down-lucenenet/
> > > > > > > > > > >
> > > > > > > > > > > Now, I see that the Bits method is tagged as "obsolete" in the API,
> > > > > > > > > > > so I guess it will be removed sometime in the future. Is there
> > > > > > > > > > > another way of accomplishing faceted searches in Lucene in a similar
> > > > > > > > > > > way, or should I stick with the Bits method to get a BitArray anyway?