lucenenet-user mailing list archives

From digy digy <digyd...@gmail.com>
Subject Re: [Lucene.Net] Faceted Search in Lucene.NET without using the Bits-method
Date Fri, 27 May 2011 06:02:08 GMT
Indeed, it seems to be slow. With 30 facets, ~300K docs and a 1.4GB index I get
better results (first query: ~100-400 ms; if I repeat it, <10 ms).
Do you use the code from
https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/contrib/SimpleFacetedSearch/
?

DIGY


On Fri, May 27, 2011 at 12:49 AM, Marco Dissel <marco.dissel@gmail.com> wrote:

> You're right. Is this the optimal way of calculating multiple
> facets?
>
> I get this performance result in my test:
> seconds - number of facet fields
> 0,2760158 - 1
> 0,665038 - 2
> 1,7320991 - 3
> 6,0633468 - 4
>
> on a ~300mb index
>
> Ideally I want to show a user interface with many more facets, but with
> these numbers that's not possible.
>
> On Thu, May 26, 2011 at 8:10 AM, digy digy <digydigy@gmail.com> wrote:
>
> > I still don't understand, since these numbers can easily be obtained from
> > the current implementation.
> > Just count the HitsPerGroup.Name[i]'s.
> >
> > DIGY
> >
> > On Thu, May 26, 2011 at 12:21 AM, Marco Dissel <marco.dissel@gmail.com> wrote:
> >
> > > Let's take the sample data from http://goo.gl/UQubj
> > >
> > >          f1     f2     f3
> > >          --     --     --
> > > doc1      A      I      1
> > > doc2      A      I      2
> > > doc3      A      I      3
> > > doc4      A      J      1
> > > doc5      A      J      2
> > > doc6      A      J      3
> > > doc7      B      I      1
> > >
> > > // groupByFields are passed to the constructor
> > > SimpleFacetedSearch sfs = new SimpleFacetedSearch(_Reader, new string[] {"f1", "f2", "f3"});
> > > SimpleFacetedSearch.Hits hits = sfs.Search(query);
> > >
> > > // Wanted: one Facet per field, one FacetTerm per distinct value
> > > foreach (Facet facet in hits.Facets) {
> > >   foreach (FacetTerm facetTerm in facet.FacetTerms) {
> > >       WriteLine(facet.Field + " - " + facetTerm.Name + " - " + facetTerm.HitCount);
> > >   }
> > > }
> > >
> > > results in:
> > > f1 - A - 6
> > > f1 - B - 1
> > > f2 - I - 4
> > > f2 - J - 3
> > > f3 - 1 - 3
> > > f3 - 2 - 2
> > > f3 - 3 - 2
> > >
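For reference, the per-field counts in the listing above can be reproduced without Lucene at all. This is only a sketch under stated assumptions: PerFieldFacetCounts and its methods are hypothetical names and not part of SimpleFacetedSearch, every sample document is assumed to match the query, and each field is assumed to hold a single value per document.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of SimpleFacetedSearch: counts hits per term
// for each facet field independently, over an in-memory copy of the sample data.
public static class PerFieldFacetCounts
{
    // Each document is a map from field name to its single value.
    public static SortedDictionary<string, SortedDictionary<string, int>> Count(
        IEnumerable<IDictionary<string, string>> matchingDocs, string[] facetFields)
    {
        var counts = new SortedDictionary<string, SortedDictionary<string, int>>(StringComparer.Ordinal);
        foreach (string field in facetFields)
            counts[field] = new SortedDictionary<string, int>(StringComparer.Ordinal);

        foreach (IDictionary<string, string> doc in matchingDocs)
        {
            foreach (string field in facetFields)
            {
                string value;
                if (!doc.TryGetValue(field, out value)) continue; // doc lacks this field
                int current;
                counts[field].TryGetValue(value, out current);    // stays 0 when the term is new
                counts[field][value] = current + 1;
            }
        }
        return counts;
    }

    // The seven sample documents (doc1..doc7) with fields f1, f2, f3.
    public static List<IDictionary<string, string>> SampleDocs()
    {
        string[][] rows =
        {
            new[] {"A", "I", "1"}, new[] {"A", "I", "2"}, new[] {"A", "I", "3"},
            new[] {"A", "J", "1"}, new[] {"A", "J", "2"}, new[] {"A", "J", "3"},
            new[] {"B", "I", "1"}
        };
        var docs = new List<IDictionary<string, string>>();
        foreach (string[] r in rows)
            docs.Add(new Dictionary<string, string> { {"f1", r[0]}, {"f2", r[1]}, {"f3", r[2]} });
        return docs;
    }

    // Writes one "field - term - count" line per facet term.
    public static void PrintSample()
    {
        var counts = Count(SampleDocs(), new[] {"f1", "f2", "f3"});
        foreach (var field in counts)
            foreach (var term in field.Value)
                Console.WriteLine(field.Key + " - " + term.Key + " - " + term.Value);
    }
}
```

Each matching document is visited once and only its own field values are looked up, so the work in this sketch grows roughly linearly with the number of facet fields rather than with the number of value combinations.
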
> > > >> A *simple* example?
> > >
> > > I didn't say it's easy, quite the opposite. Again, thanks for your time.
> > >
> > > On Wed, May 25, 2011 at 10:55 PM, Digy <digydigy@gmail.com> wrote:
> > >
> > > >
> > > > Sorry Marco,
> > > > I can't see your point. A *simple* example?
> > > >
> > > > DIGY
> > > >
> > > > -----Original Message-----
> > > > From: Marco Dissel [mailto:marco.dissel@gmail.com]
> > > > Sent: Wednesday, May 25, 2011 11:45 PM
> > > > To: lucene-net-user@lucene.apache.org
> > > > Subject: Re: [Lucene.Net] Faceted Search in Lucene.NET without using the
> > > > Bits-method
> > > >
> > > > Thanks digy!
> > > >
> > > > The current sample groups the unique combinations of multiple facet
> > > > fields, but what if you want to create a facet interface like the one
> > > > on this amazon.com page http://goo.gl/tBhuj ? There, the number of hits
> > > > for each entry is based on a single-field query (clicking on "Boxed Set"
> > > > will show 1,148 hits).
> > > >
> > > > Edition
> > > >
> > > >   - Boxed Set (1,148)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_format_browse-bi_0?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_format_browse-bin%3A388395011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=390118011>
> > > >   - Full Screen (875)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_format_browse-bi_1?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_format_browse-bin%3A390120011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=390118011>
> > > >   - Widescreen (606)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_format_browse-bi_2?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_format_browse-bin%3A390121011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=390118011>
> > > >
> > > > Region
> > > >
> > > >   - US & CA DVDs: Region 1 (1,873)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_feature_two_brow_0?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_feature_two_browse-bin%3A405391011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=405390011>
> > > >   - DVDs Playable Outside the US (1)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_feature_two_brow_1?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_feature_two_browse-bin%3A405393011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=405390011>
> > > >   - DVDs Playable in any Region (62)
> > > >     <http://www.amazon.com/s/ref=amb_link_85221991_4?ie=UTF8&node=163450%2C672573011&bbn=672573011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=left-1&pf_rd_r=03F2Q81K79MYNMMCJEZZ&pf_rd_t=101&pf_rd_p=1299020482&pf_rd_i=130#/ref=sr_nr_p_n_feature_two_brow_2?rh=n%3A130%2Cn%3A%212334112011%2Cn%3A%212334174011%2Cn%3A12917411%2Cn%3A672573011%2Cn%3A163450%2Cp_n_feature_two_browse-bin%3A405392011&bbn=672573011&ie=UTF8&qid=1306356036&rnid=405390011>
> > > >
> > > >
> > > > Thanks
> > > >
> > > > On Wed, May 25, 2011 at 12:08 AM, Digy <digydigy@gmail.com> wrote:
> > > >
> > > > > In case you don't follow the dev list:
> > > > > https://issues.apache.org/jira/browse/LUCENENET-415
> > > > > SimleFacetedSearch2.cs + TestSimleFacetedSearch2.cs
> > > > >
> > > > > DIGY
> > > > >
> > > > > -----Original Message-----
> > > > > From: Floyd Wu [mailto:floyd.wu@gmail.com]
> > > > > Sent: Tuesday, May 24, 2011 6:40 AM
> > > > > To: lucene-net-user@lucene.apache.org
> > > > > Subject: Re: [Lucene.Net] Faceted Search in Lucene.NET without using the
> > > > > Bits-method
> > > > >
> > > > > Hi DIGY,
> > > > >
> > > > > How can I use your sample to accomplish faceted search like the one
> > > > > SOLR provides?
> > > > >
> > > > > For example:
> > > > > I would like to facet on 3 fields: "category" / "author" / "folder".
> > > > > Your example facets only on the field "cat". Should I run the same
> > > > > procedure three times to meet my requirement, or do you have a better way?
> > > > >
> > > > > Many thanks
> > > > >
> > > > > Floyd
> > > > >
> > > > >
> > > > > 2011/5/20 digy digy <digydigy@gmail.com>
> > > > >
> > > > > >      List<string> categories = new List<string>();
> > > > > >
> > > > > >      // Seek to the first term at or after ("cat", "").
> > > > > >      TermEnum te = reader.Terms(new Term("cat", ""));
> > > > > >      do
> > > > > >      {
> > > > > >          Term t = te.Term();
> > > > > >          // Stop when the terms run out or another field starts.
> > > > > >          if (t == null || t.Field() != "cat") break;
> > > > > >          categories.Add(t.Text());
> > > > > >      } while (te.Next());
> > > > > >      te.Close();
> > > > > >
> > > > > > DIGY
> > > > > >
> > > > > > On Fri, May 20, 2011 at 2:28 PM, Marco Dissel <marco.dissel@gmail.com> wrote:
> > > > > >
> > > > > > > Thanks digy! Can you extend the sample to show how you would get
> > > > > > > the results for a facet user interface, including the number of
> > > > > > > hits, when you don't know the facet terms in advance?
> > > > > > >
> > > > > > > PseudoCode:
> > > > > > >
> > > > > > > Query query = new QueryParser("text", new StandardAnalyzer()).Parse(searchText);
> > > > > > >
> > > > > > > // only return items with a minimal hit count
> > > > > > > IDictionary<string,long> categoryFacets = GetFacets("category", query, numberOfMinimalHits);
> > > > > > >
> > > > > > > IDictionary<string,long> anotherFieldFacets = GetFacets("anotherfacetfield", query, numberOfMinimalHits);
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > >
> > > > > > > On Fri, May 20, 2011 at 1:01 PM, digy digy <digydigy@gmail.com> wrote:
> > > > > > >
> > > > > > > > I prepared a sample. Maybe, this can help
> > > > > > > >
> > > > > > > > http://people.apache.org/~digy/FacetedSearch.cs
> > > > > > > >
> > > > > > > > DIGY
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, May 20, 2011 at 10:43 AM, K a r n a v <karunakerreddyv@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > I've tried OpenBitSetDISI. It's not performance-effective, but I am
> > > > > > > > > able to generate results fast enough with BitArray.
> > > > > > > > > Check the demo app:
> > > > > > > > >
> > > > > > > > > http://demo.wisestepp.com/jobs/JobSearchResults.aspx?am9ibmFtZSxqYXZh-yCfmbUTwmeY%3d
> > > > > > > > >
> > > > > > > > > If you have an example or sample code that works with
> > > > > > > > > OpenBitSetDISI, please forward it to me.
> > > > > > > > >
> > > > > > > > > Thank you in advance.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Karunaker Reddy V
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, May 20, 2011 at 1:05 PM, digy digy <digydigy@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Take a look at OpenBitSetDISI in Lucene.Net.Util
> > > > > > > > > >
> > > > > > > > > > DIGY
> > > > > > > > > >
> > > > > > > > > > On Fri, May 20, 2011 at 10:20 AM, lars aslin <larsaaslin@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > > I'm building a search function with Lucene.NET where I want
> > > > > > > > > > > faceted searches to be supported. To accomplish this
> > > > > > > > > > > efficiently, I thought a good idea would be to have a filter
> > > > > > > > > > > for each category and then AND each category filter's BitArray
> > > > > > > > > > > with the BitArray for the search result, thus getting the
> > > > > > > > > > > search result grouped by category using only the AND
> > > > > > > > > > > operation. Here's a blog post describing it better:
> > > > > > > > > > >
> > > > > > > > > > > http://www.devatwork.nl/articles/lucenenet/faceted-search-and-drill-down-lucenenet/
> > > > > > > > > > >
> > > > > > > > > > > Now, I see that the Bits method is tagged as "obsolete" in the
> > > > > > > > > > > API, so I guess it will be removed sometime in the future. Is
> > > > > > > > > > > there another way of accomplishing faceted searches in Lucene
> > > > > > > > > > > in a similar way, or should I stick to the Bits method to get
> > > > > > > > > > > a BitArray anyway?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> >
>
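
The BitArray idea from the original question at the bottom of the thread can be sketched with plain System.Collections.BitArray and no Lucene calls. This is a rough illustration under assumptions, not the code from the blog post or from SimpleFacetedSearch: the bit sets are built by hand (how to obtain them from filters and the query is not shown), and BitArrayFacetSketch and its method names are made up for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical sketch: bit i is set when document i belongs to the set.
public static class BitArrayFacetSketch
{
    // Counts the documents that are in both the query result and the category.
    public static int CountIntersection(BitArray queryBits, BitArray categoryBits)
    {
        // BitArray.And modifies the instance it is called on, so AND into a copy.
        // Both arrays must have the same length (one bit per document).
        BitArray both = new BitArray(queryBits).And(categoryBits);
        int count = 0;
        foreach (bool bit in both)
        {
            if (bit) count++;
        }
        return count;
    }

    // One hit count per category, computed against the same query result.
    public static Dictionary<string, int> CountFacets(
        BitArray queryBits, IDictionary<string, BitArray> categories)
    {
        var counts = new Dictionary<string, int>();
        foreach (KeyValuePair<string, BitArray> category in categories)
        {
            counts[category.Key] = CountIntersection(queryBits, category.Value);
        }
        return counts;
    }

    // Builds a BitArray of the given size with the listed document ids set.
    public static BitArray FromDocIds(int maxDoc, params int[] docIds)
    {
        var bits = new BitArray(maxDoc);
        foreach (int id in docIds) bits.Set(id, true);
        return bits;
    }
}
```

For example, with seven documents, a query matching docs 0, 1, 2 and 6, category "A" covering docs 0 to 5 and category "B" covering doc 6, CountFacets gives 3 for "A" and 1 for "B". The category bit sets do not depend on the query, so they can be built once and reused across searches.
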
