lucene-solr-user mailing list archives

From Alessandro Benedetti <benedetti.ale...@gmail.com>
Subject Re: faceting is unusable slow since upgrade to 5.3.0
Date Thu, 24 Sep 2015 14:16:34 GMT
Yonik, I am really excited about the JSON faceting module.
I find it really interesting.
Are there any pros/cons in using it, or is it definitely the "approach of
the future"?
I saw your benchmarks and they seem impressive.

I have not read the whole thread in detail, just briefly, but does JSON
faceting use different faceting algorithms from the standard ones (enum
and fc)?
I cannot find the algorithm parameter to be passed in the JSON facets.
Do they use a completely different approach?
Is the algorithm used documented anywhere?
This could give very good insight into when to use them.

Cheers

2015-09-24 14:58 GMT+01:00 Yonik Seeley <yseeley@gmail.com>:

> On Mon, Sep 21, 2015 at 8:09 AM, Uwe Reh <reh@hebis.uni-frankfurt.de>
> wrote:
> > our bibliographic index (~20M entries) runs fine with Solr 4.10.3
> > With Solr 5.3 faceted searching is constantly incredibly slow (~20
> > seconds)
> [...]
> >
> > The 'fieldValueCache' seems to be unused (no inserts nor lookups) in Solr
> > 5.3. In Solr 4.10 the 'fieldValueCache' is in heavy use with a
> > cumulative_hitratio of 1.
>
>
> Indeed.  Use of the fieldValueCache (UnInvertedField) was secretly
> removed as part of LUCENE-5666, causing these performance regressions.
>
> This code had been evolved over years to be very fast for specific use
> cases.  No one facet algorithm is going to be optimal for everyone, so
> it's important we have multiple.  But use of the UnInvertedField was
> removed without any notification or discussion whatsoever (and
> obviously no benchmarking), and it was only discovered later by Solr devs
> in SOLR-7190 that it was essentially dead code.
>
>
> When I brought back my "JSON Facet API" work to Solr (which was based
> on 4.10.x) it came with a heavily modified version of UnInvertedField
> that is available via the JSON Facet API.  It might currently work
> better for your use case.
>
> On your normal (non-docValues) index, you can try something like the
> following to see what the performance would be:
>
> $ curl http://yxz/solr/hebis/query -d 'q=darwin&
> json.facet={
>   authors : { type:terms, field:author_facet, limit:30 },
>   material_access : { type:terms, field:material_access, limit:30 },
>   material_brief : { type:terms, field:material_brief, limit:30 },
>   rvk : { type:terms, field:rvk_facet, limit:30 },
>   lang : { type:terms, field:language, limit:30 },
>   dept : { type:terms, field:department_3, limit:30 }
> }'
>
> There were other changes in LUCENE-5666 that will probably slow down
> faceting on the single valued fields as well (so this may still be a
> little slower than 4.10.x), but hopefully it would be more
> competitive.
>
> -Yonik
>
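
For anyone who wants to repeat the comparison from a script, here is a
minimal Python sketch of the same request as the curl command quoted above.
The helper names build_facet_params and timed_query are hypothetical (not
part of Solr), the URL is the placeholder from the quoted command, and the
client-side wall-clock time is only a rough stand-in for the QTime that
Solr reports. It sends strict JSON instead of the relaxed syntax used in the
curl example.

```python
import json
import time
import urllib.parse
import urllib.request


def build_facet_params(query, facets, limit=30):
    """Return form parameters for a JSON Facet API terms-facet request.

    `facets` maps a facet label to the index field to facet on.
    """
    json_facet = {
        label: {"type": "terms", "field": field, "limit": limit}
        for label, field in facets.items()
    }
    return {"q": query, "wt": "json", "json.facet": json.dumps(json_facet)}


def timed_query(url, params):
    """POST the parameters; return (elapsed seconds, parsed JSON response)."""
    data = urllib.parse.urlencode(params).encode("utf-8")
    start = time.perf_counter()
    with urllib.request.urlopen(url, data=data) as resp:
        body = json.load(resp)
    return time.perf_counter() - start, body


# Labels and fields copied from the quoted curl example.
FACETS = {
    "authors": "author_facet",
    "material_access": "material_access",
    "material_brief": "material_brief",
    "rvk": "rvk_facet",
    "lang": "language",
    "dept": "department_3",
}

if __name__ == "__main__":
    # Placeholder host and collection, taken from the quoted command.
    url = "http://yxz/solr/hebis/query"
    elapsed, body = timed_query(url, build_facet_params("darwin", FACETS))
    qtime = body.get("responseHeader", {}).get("QTime")
    print(f"client time: {elapsed:.3f}s, QTime: {qtime} ms")
```

Running it a few times against both the 4.10 and 5.3 instances, after the
caches have warmed up, should give comparable numbers.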



-- 
--------------------------

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
