lucene-java-user mailing list archives

From Simona Russo <simoru...@gmail.com>
Subject Re: Lucene Facets performance problems (version 4.7.2)
Date Sun, 28 Feb 2016 07:25:43 GMT
Thanks for your quick feedback.

The problem occurs at the customer's site, and we are trying to reproduce it
in our environments in order to figure out which part of the query is slow.
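
A minimal sketch of how the phases of one request could be timed separately. PhaseTimer is a hypothetical helper, not part of Lucene, and the sleeps in main are only stand-ins for the real calls:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

/** Hypothetical helper: accumulates wall-clock time per named phase. */
class PhaseTimer {
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    /** Runs the task, adds its elapsed time to the named phase, and returns its result. */
    <T> T time(String phase, Callable<T> task) throws Exception {
        long start = System.nanoTime();
        try {
            return task.call();
        } finally {
            nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    Map<String, Long> nanosByPhase() {
        return nanosByPhase;
    }

    public static void main(String[] args) throws Exception {
        PhaseTimer timer = new PhaseTimer();
        // Stand-ins for the real work; replace with the actual search and facet calls.
        timer.time("search", () -> { Thread.sleep(5); return null; });
        timer.time("facetCounts", () -> { Thread.sleep(20); return null; });
        timer.nanosByPhase().forEach((phase, nanos) ->
            System.out.printf("%s: %d ms%n", phase, nanos / 1_000_000));
    }
}
```

Wrapping FacetsCollector.search, the FastTaxonomyFacetCounts constructor, and each getTopChildren call in its own time(...) call would show which of the three dominates.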

Here is some additional information:

   - the facet *dimension* is composed of *2 categories* (for example
   "/cat1/cat2"), and the *second* category ("cat2") is a *multivalue field*
   - the cardinality is:
      - "cat1": about 15 million unique values
      - "cat2": *every* unique *"cat1"* contains at most 100 documents, and
      *every document* contains an average of 30 values of the field "cat2"
      (*multivalue field*)
   - we use the following statements to obtain the "facets":

> FacetsCollector fc = new FacetsCollector ();
> FacetsCollector.search ( searcher, query, maxResults, fc );
> Facets facetsCount = new FastTaxonomyFacetCounts (indexFieldName, tReader,
>     facetsConfig, fc );
> ...


After that, we make a recursive call like this to walk the path of the
category:

> FacetResult facetResult = facetsCount.getTopChildren (topN, dimensionName,
> arrayCurrentPath );
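
The shape of that recursion can be sketched as follows. This is only an illustration: ChildLookup is a hypothetical stand-in for facetsCount.getTopChildren (topN, dimensionName, path), and the in-memory tree uses made-up labels so the sketch runs without an index:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FacetPathWalk {
    /** Hypothetical stand-in for getTopChildren: returns the child labels of a path. */
    interface ChildLookup {
        List<String> topChildren(int topN, String... path);
    }

    /** Depth-first walk: records the current path, then recurses into its top children. */
    static void walk(ChildLookup lookup, int topN, String[] path, List<String> visited) {
        visited.add(String.join("/", path));
        for (String label : lookup.topChildren(topN, path)) {
            String[] childPath = Arrays.copyOf(path, path.length + 1);
            childPath[path.length] = label;
            walk(lookup, topN, childPath, visited);
        }
    }

    /** Tiny in-memory tree (made-up labels) used only to exercise the recursion. */
    static ChildLookup sampleLookup() {
        Map<String, List<String>> children = new HashMap<>();
        children.put("", Arrays.asList("cat1-a", "cat1-b"));
        children.put("cat1-a", Arrays.asList("cat2-x", "cat2-y"));
        return (topN, path) -> {
            List<String> all =
                children.getOrDefault(String.join("/", path), Collections.emptyList());
            return all.subList(0, Math.min(topN, all.size()));
        };
    }

    public static void main(String[] args) {
        List<String> visited = new ArrayList<>();
        walk(sampleLookup(), 10, new String[0], visited);
        // The first entry is the empty root path of the dimension.
        visited.forEach(System.out::println);
    }
}
```

In this sketch every visited path costs one lookup, so the number of calls is bounded by topN and the depth of the path.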



From the first tests performed in our lab, the slowest part seems to be
the creation of the new *FastTaxonomyFacetCounts* instance.
I don't know whether the problem is related to the category with the
*multivalue field*.
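
One possible contributor, offered as an assumption rather than a confirmed diagnosis: the 4.x taxonomy counting classes appear to allocate one int counter per taxonomy ordinal for each new instance, and the taxonomy reader keeps parents/children/siblings arrays of the same length. A rough estimate of what that would mean for a taxonomy of 57 million entries:

```java
class FacetMemoryEstimate {
    /** Bytes needed for `arrays` int[] arrays with one slot per taxonomy ordinal. */
    static long intArrayBytes(long taxonomySize, int arrays) {
        return taxonomySize * Integer.BYTES * arrays;
    }

    public static void main(String[] args) {
        long taxonomySize = 57_000_000L; // taxonomy size reported in this thread

        // Assumption: one int counter per ordinal, allocated per counting instance.
        long counts = intArrayBytes(taxonomySize, 1);
        // Assumption: parents/children/siblings arrays, built once per taxonomy reader.
        long parallel = intArrayBytes(taxonomySize, 3);

        double mib = 1024.0 * 1024.0;
        System.out.printf("counts array:    %.1f MiB per instance%n", counts / mib);
        System.out.printf("parallel arrays: %.1f MiB per reader%n", parallel / mib);
    }
}
```

If these assumptions hold, allocating and zeroing the counter array would be a fixed cost of every request, independent of how many documents match the query.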

We are still investigating so I have no other information, but if you have
other questions feel free to contact me.

Thanks
Simona





2016-02-26 10:47 GMT+01:00 Rob Audenaerde <rob.audenaerde@gmail.com>:

> Hi Simona,
>
> In addition to Erick's questions:
>
> Are you talking about *search* time or facet-collection time? And how many
> results are in your result set?
>
> I have some experience with collecting facets from large result sets; these
> are typically slow (as they have to retrieve all the relevant facet fields
> for the faceted documents). In Lucene 4.8 the RandomSamplingFacetsCollector
> was reintroduced (as per https://issues.apache.org/jira/browse/LUCENE-5476).
>
> -Rob
>
> On Fri, Feb 26, 2016 at 6:01 AM, Simona Russo <simorusso@gmail.com> wrote:
>
> > Hi all,
> >
> > we use the Lucene *Facet* library, version *4.7.2*.
> >
> > We have an *index* with *45 million* documents (size about 15 GB) and
> > a *taxonomy* index with *57* million documents (size about 2 GB).
> >
> > The total *facet search* time reaches *15 seconds*!
> >
> > Is it possible to improve this time? Are there any tips to *configure* the
> > *taxonomy* index to avoid this waste of time?
> >
> >
> > Thanks in advance
> >
>
