lucene-solr-user mailing list archives

From Solr User <solr...@gmail.com>
Subject Re: Faceting and Grouping Performance Degradation in Solr 5
Date Mon, 26 Sep 2016 21:59:30 GMT
Thanks again for your work on honoring the facet.method.  I have an
observation that I would like to share and get your feedback on if possible.

I performance tested Solr 5.5.2 with various facet queries, and the only way
I get results comparable to Solr 4.8.1 is when I expungeDeletes.  Is it
possible that Solr 5 does not ignore deletes as efficiently as Solr 4?
Here are the details.

Scenario #1:  Using facet.method=uif with faceting on several multi-valued
fields.
4.8.1 (with deletes): 115 ms
5.5.2 (with deletes): 155 ms
5.5.2 (without deletes): 125 ms
5.5.2 (1 segment without deletes): 44 ms

Scenario #2:  Using facet.method=enum with faceting on several multi-valued
fields.  These fields differ from those in Scenario #1 and perform much
better with enum, hence that method is used instead.
4.8.1 (with deletes): 38 ms
5.5.2 (with deletes): 49 ms
5.5.2 (without deletes): 42 ms
5.5.2 (1 segment without deletes): 34 ms
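
To put the numbers above on a common scale, the relative difference of each
5.5.2 run against the 4.8.1 baseline can be computed with a short sketch like
the one below.  The class and method names are hypothetical, and the timings
are simply the ones listed in the two scenarios; nothing here calls Solr.

```java
// Hypothetical helper: compares the timings listed above against the 4.8.1 baseline.
class FacetTimings {

    // Percentage by which candidateMs exceeds baselineMs; a negative value means faster.
    static double percentSlower(double baselineMs, double candidateMs) {
        return (candidateMs - baselineMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        String[] labels = {
            "5.5.2 (with deletes)",
            "5.5.2 (without deletes)",
            "5.5.2 (1 segment without deletes)"
        };

        // Scenario #1 (facet.method=uif): baseline 115 ms on 4.8.1 with deletes.
        double[] uifMs = {155, 125, 44};
        // Scenario #2 (facet.method=enum): baseline 38 ms on 4.8.1 with deletes.
        double[] enumMs = {49, 42, 34};

        for (int i = 0; i < labels.length; i++) {
            System.out.printf("uif  %-35s %+.1f%%%n", labels[i], percentSlower(115, uifMs[i]));
        }
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("enum %-35s %+.1f%%%n", labels[i], percentSlower(38, enumMs[i]));
        }
    }
}
```

Read this way, the gap to 4.8.1 is largest in the runs that still contain
deletes and shrinks once the deletes are expunged, in both scenarios.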



On Tue, May 31, 2016 at 11:57 AM, Alessandro Benedetti <
abenedetti@apache.org> wrote:

> Interesting developments:
>
> https://issues.apache.org/jira/browse/SOLR-9176
>
> I think we found why term enum seems slower in recent Solr!
> In our case it is likely to be related to the commit I mention in the Jira.
> Joel, please take a look!
>
> On Wed, May 25, 2016 at 12:30 PM, Alessandro Benedetti <
> abenedetti@apache.org> wrote:
>
> > I am investigating this scenario right now.
> > I can confirm that the enum slowness is in Solr 6.0 as well.
> > And I agree with Joel, it seems to be unrelated to the famous faceting
> > regression :(
> >
> > Furthermore with the legacy facet approach, if you set docValues for the
> > field you are not going to be able to try the enum approach anymore.
> >
> > org/apache/solr/request/SimpleFacets.java:448
> >
> > if (method == FacetMethod.ENUM && sf.hasDocValues()) {
> >   // only fc can handle docvalues types
> >   method = FacetMethod.FC;
> > }
> >
> >
> > I got really horrible regressions simply using term enum in both Solr 4
> > and Solr 6.
> >
> > And even the most optimized fcs approach with docValues and
> > facet.threads=nCore does not perform as well as the simple enum in Solr 4.
> >
> > i.e.
> >
> > For some sample queries I have 40 ms vs 160 ms and similar...
> > I think we should open an issue if we can confirm it is not related to
> > the other.
> > A lot of people will continue using the legacy approach for a while...
> >
> > On Wed, May 18, 2016 at 10:42 PM, Joel Bernstein <joelsolr@gmail.com>
> > wrote:
> >
> > >> The enum slowness is interesting. It would appear on the surface to not be
> > >> related to the FieldCache issue. I don't think the main emphasis of the
> >> JSON facet API has been the enum approach. You may find using the JSON
> >> facet API and eliminating the use of enum meets your performance needs.
> >>
> >> With the CollapsingQParserPlugin top_fc is definitely faster during
> > >> queries. The tradeoff is slower warming times and increased memory usage if
> > >> the collapse fields are used in faceting, as faceting will load the field
> > >> into a different cache.
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >> On Wed, May 18, 2016 at 5:28 PM, Solr User <solrcal@gmail.com> wrote:
> >>
> >> > Joel,
> >> >
> > >> > Thank you for taking the time to respond to my question.  I tried the JSON
> > >> > Facet API for one query that uses facet.method=enum (since this one has a
> > >> > ton of unique values and performed better with enum) but this was way
> > >> > slower than even the slower Solr 5 times.  I did not try the new API with
> > >> > the non-enum queries though so I will give that a go.  It looks like Solr
> > >> > 5.5.1 also has a facet.method=uif which will be interesting to try.
> >> >
> >> > If these do not prove helpful, it looks like I will need to wait for
> >> > SOLR-8096 to be resolved before upgrading.
> >> >
> > >> > Thanks also for your comment on top_fc for the CollapsingQParser.  I use
> > >> > collapse/expand for some queries but traditional grouping for others due to
> > >> > performance.  It will be interesting to see if those grouping queries
> > >> > perform better now using CollapsingQParser with top_fc.
> >> >
> >> > On Wed, May 18, 2016 at 11:39 AM, Joel Bernstein <joelsolr@gmail.com>
> >> > wrote:
> >> >
> >> > > Yes, SOLR-8096 is the issue here.
> >> > >
> > >> > > I don't believe indexing with docValues is going to help too much with
> > >> > > this. The enum slowness may not be related, but I'm not positive about
> > >> > > that.
> >> > >
> > >> > > The major slowdowns are likely due to the removal of the top level
> > >> > > FieldCache from general use and the removal of the FieldValuesCache which
> > >> > > was used for multi-value field faceting.
> >> > >
> >> > > The JSON facet API covers all the functionality in the traditional
> >> > > faceting, and it has been developed to be very performant.
> >> > >
> > >> > > You may also want to see if Collapse/Expand can meet your application's
> > >> > > needs rather than Grouping. It allows you to specify using a top level
> > >> > > FieldCache if performance is a blocker without it.
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > Joel Bernstein
> >> > > http://joelsolr.blogspot.com/
> >> > >
> > >> > > On Wed, May 18, 2016 at 10:42 AM, Solr User <solrcal@gmail.com> wrote:
> >> > >
> >> > > > Does anyone know the answer to this?
> >> > > >
> > >> > > > On Wed, May 4, 2016 at 2:19 PM, Solr User <solrcal@gmail.com> wrote:
> >> > > >
> > >> > > > > I recently was attempting to upgrade from Solr 4.8.1 to Solr 5.4.1 but
> > >> > > > > had to abort due to average response times degraded from a baseline
> > >> > > > > volume performance test.  The affected queries involved faceting (both
> > >> > > > > enum method and default) and grouping.  There is a critical bug
> > >> > > > > https://issues.apache.org/jira/browse/SOLR-8096 currently open which I
> > >> > > > > gather is the cause of the slower response times.  One concern I have
> > >> > > > > is that discussions around the issue offer the suggestion of indexing
> > >> > > > > with docValues which alleviated the problem in at least that one
> > >> > > > > reported case.  However, indexing with docValues did not improve the
> > >> > > > > performance in my case.
> > >> > > > >
> > >> > > > > Can someone please confirm or correct my understanding that this issue
> > >> > > > > has no path forward at this time and specifically that it is already
> > >> > > > > known that docValues does not necessarily solve this?
> >> > > > >
> >> > > > > Thanks in advance!
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England
> >
>
