lucene-solr-user mailing list archives

From Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Subject Re: Difference between Legacy Facets and JSON Facets
Date Thu, 03 Sep 2015 03:04:59 GMT
> As far as I can see, JSON Facets does not have this delayed mapping
> mechanism: Every increment requires a call to the segment->global-ordinal
> map. With a large field this map cannot be in the fast caches. Combine this
> with a gazillion references and it makes sense that JSON Facets is slower
> in this scenario. A factor 20 sounds like way too much though. I would have
> expected maybe 2.


I'm not sure whether it is the very large content that causes this.
I have found some other fields where, if I index them as String and the
value is longer than 5 different words, the JSON Facet is slightly slower
than the Legacy Facet, but that is within your expected factor of 2 (Legacy
Facet QTime: 10, JSON Facet QTime: 25).

The content field is the only one with a factor of more than 20, as some of
the indexed documents are more than 200 pages long.

So is it fair to say that for faceting on a large content field, the Legacy
Facet is the better choice over the newer JSON Facet, while for other,
shorter fields the JSON Facet would be better?


Regards,
Edwin


On 3 September 2015 at 02:44, Toke Eskildsen <te@statsbiblioteket.dk> wrote:

> Yonik Seeley <yseeley@gmail.com> wrote:
> > Hmmm, well something is really wrong for this orders of magnitude
> > difference.  I've never seen anything like that and we should
> > definitely try to get to the bottom of it.
>
> This might be a wild goose chase, but...
>
> Zheng states it is a text field with the content of fairly large
> documents. This means a high amount of unique values and a gazillion
> references from documents to those values.
>
> When incrementing counters for String faceting, segment ordinal -> index
> ordinal mapping takes place. Legacy facets has a mechanism where temporary
> segment-specific counters are used. These are updated directly with the
> segment ordinals and the mapping to global ordinals is performed after the
> counting.
>
> As far as I can see, JSON Facets does not have this delayed mapping
> mechanism: Every increment requires a call to the segment->global-ordinal
> map. With a large field this map cannot be in the fast caches. Combine this
> with a gazillion references and it makes sense that JSON Facets is slower
> in this scenario. A factor 20 sounds like way too much though. I would have
> expected maybe 2.
>
> - Toke Eskildsen
>
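
To make the two counting strategies described above concrete, here is a
minimal sketch in Java. It is not Solr's actual code: the class name, the
method names, and the plain int arrays standing in for the per-segment
ordinals and the segment->global ordinal maps are all assumptions made for
illustration. Both methods produce the same counts; they differ only in how
often the segment->global map is consulted.

```java
import java.util.Arrays;

public class OrdinalCountingSketch {

    // Immediate mapping: every single increment goes through the
    // segment->global map (one lookup per document->value reference).
    static int[] countImmediate(int[][] segmentOrds, int[][] segToGlobal, int globalSize) {
        int[] global = new int[globalSize];
        for (int seg = 0; seg < segmentOrds.length; seg++) {
            for (int ord : segmentOrds[seg]) {
                global[segToGlobal[seg][ord]]++;
            }
        }
        return global;
    }

    // Delayed mapping: count into a temporary segment-specific array using
    // the segment ordinals directly, then map each distinct segment ordinal
    // to its global ordinal once, after counting.
    static int[] countDelayed(int[][] segmentOrds, int[][] segToGlobal, int globalSize) {
        int[] global = new int[globalSize];
        for (int seg = 0; seg < segmentOrds.length; seg++) {
            int[] local = new int[segToGlobal[seg].length];
            for (int ord : segmentOrds[seg]) {
                local[ord]++;
            }
            for (int ord = 0; ord < local.length; ord++) {
                if (local[ord] != 0) {
                    global[segToGlobal[seg][ord]] += local[ord];
                }
            }
        }
        return global;
    }

    public static void main(String[] args) {
        // Hypothetical toy index: two segments, four global terms.
        // segmentOrds[s] lists the segment ordinal of every reference in segment s.
        int[][] segmentOrds = { {0, 1, 1, 2}, {0, 0, 1} };
        // segToGlobal[s][ord] is the global ordinal for segment ordinal ord.
        int[][] segToGlobal = { {0, 2, 3}, {1, 2} };

        System.out.println(Arrays.toString(countImmediate(segmentOrds, segToGlobal, 4)));
        System.out.println(Arrays.toString(countDelayed(segmentOrds, segToGlobal, 4)));
    }
}
```

In the delayed variant the number of map lookups is bounded by the number of
unique values per segment rather than by the number of references, which is
why it would be expected to help most on a field with very many references.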
