lucene-solr-user mailing list archives

From John Bickerstaff <j...@johnbickerstaff.com>
Subject Re: Result Grouping vs. Collapsing Query Parser -- Can one be deprecated?
Date Wed, 19 Oct 2016 23:11:11 GMT
Thank you for posting that.  I'll be saving it in my "important painful
lessons learned by others" mail folder.

On Oct 19, 2016 4:51 PM, "Mike Lissner" <mlissner@michaeljaylissner.com>
wrote:

> Hi all,
>
> I've had a rotten day today because of Solr. I want to share my experience
> and perhaps see if we can do something to fix this particular situation in
> the future.
>
> Solr currently has two ways to get grouped results (so far!). You can
> either use Result Grouping or you can use the Collapsing Query Parser.
> Result grouping seems like the obvious way to go. It's well documented, the
> parameters are clear, it doesn't use a bunch of weird syntax (i.e.,
> {!collapse blah=foo}), and it uses the feature name from SQL (so it comes
> up in Google).
>
> OTOH, if you use faceting with result grouping, which I imagine many people
> do, you get terrible performance. In our case it went from sub-second to
> 10-120 seconds for big queries. Insanely bad.
>
> Collapsing Query Parser looks like a good way forward for us, and we'll be
> investigating that, but it uses the Expand component that our library
> doesn't support, to say nothing of the truly bizarre syntax. So this will
> be a fair amount of effort to switch.
>
> I'm curious if there is anything we can do to clean up this situation. What
> I'd really like to do is:
>
> 1. Put a HUGE warning on the Result Grouping docs directing people away
> from the feature if they plan to use faceting (or perhaps directing them
> away no matter what?)
>
> 2. Work towards eliminating one or the other of these features. They're
> nearly equivalent in what they do, differing mainly in syntax and
> performance. The collapsing query parser apparently was only written
> because result grouping had such bad performance -- in other words, it
> doesn't exist to provide unique features, it exists to be faster than the
> old way. Maybe we can get rid of one or the other of these, taking the best
> parts from each (syntax from Result Grouping, and performance from the
> Collapsing Query Parser)?
>
> Thanks,
>
> Mike
>
> PS -- For some extra context, I want to share some other reasons this is
> frustrating:
>
> 1. I just spent a week upgrading a third-party library so it would support
> grouped results, and another week implementing the feature in our code with
> tests and everything. That was a waste.
> 2. It's hard to notice performance issues until after you deploy to a big
> data environment. This creates a bad situation for users until you detect
> it and revert the new features.
> 3. The documentation *could* say something about the fact that a new
> feature was developed to provide better performance for grouping. It could
> say that using facets with groups is an anti-feature. It says neither.
>
> I only mention these because, like others, I've had a really rough time with
> Solr (again), and these are the kinds of seemingly small things that could
> have made all the difference.
>
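
For anyone comparing the two approaches described in the quoted message, here
is a rough sketch of how the request parameters differ. The field names
(cluster_id, court) and the helper function names are made up for
illustration and do not come from the thread. The two requests are not exactly
equivalent: group.limit counts the documents returned per group, while
expand.rows counts the extra documents returned per collapsed group in the
separate "expanded" section of the response.

```python
from urllib.parse import urlencode


def grouping_params(query, group_field, facet_field, docs_per_group=1):
    """Result Grouping style: plain request parameters, no local-params syntax."""
    return [
        ("q", query),
        ("group", "true"),
        ("group.field", group_field),
        ("group.limit", str(docs_per_group)),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]


def collapse_params(query, collapse_field, facet_field, expand_rows=1):
    """Collapsing Query Parser style: a {!collapse} filter query plus the
    Expand component to fetch the other members of each collapsed group."""
    return [
        ("q", query),
        ("fq", "{!collapse field=" + collapse_field + "}"),
        ("expand", "true"),
        ("expand.rows", str(expand_rows)),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]


if __name__ == "__main__":
    # Hypothetical field names; substitute the ones from your own schema.
    print(urlencode(grouping_params("*:*", "cluster_id", "court")))
    print(urlencode(collapse_params("*:*", "cluster_id", "court")))
```

The query strings can be appended to a /select URL for whatever core or
collection you are testing against. Timing both on a production-sized index
is the only way to see the difference Mike describes.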
