lucene-solr-user mailing list archives

From Mohsin Beg Beg <mohsin....@oracle.com>
Subject Re: OutOfMemory on 28 docs with facet.method=fc/fcs
Date Tue, 18 Nov 2014 23:43:06 GMT


SolrCloud has 8 billion+ docs and is growing non-linearly each hour.
numFound=28 was for the faceting query only.

If the fieldCache (Lucene's caches) is the issue, would
q=time:[<begin time> TO <end time>] be better instead?
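
For concreteness, a sketch of the request I have in mind (the
collection name 'mycollection', the facet field 'host', and the
timestamps are placeholders, not our real setup):

  curl 'http://localhost:8983/solr/mycollection/select' \
    --data-urlencode 'q=time:[2014-11-18T00:00:00Z TO 2014-11-18T23:59:59Z]' \
    --data-urlencode 'facet=true' \
    --data-urlencode 'facet.field=host' \
    --data-urlencode 'facet.method=fc' \
    --data-urlencode 'rows=0'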

-Mohsin



----- Original Message -----
From: apache@elyograg.org
To: solr-user@lucene.apache.org
Sent: Tuesday, November 18, 2014 2:45:46 PM GMT -08:00 US/Canada Pacific
Subject: Re: OutOfMemory on 28 docs with facet.method=fc/fcs

On 11/18/2014 3:06 PM, Mohsin Beg Beg wrote:
> Looking at SimpleFacets.java, doesn't fc/fcs iterate only over the
> DocSet for the fields?  So assuming each field has a unique term
> across the 28 rows, a max of 28 * 15 unique small strings (<100
> bytes) should be on the order of 1MB.  For 100 collections, let's
> say a total of 1GB.  Now let's say I multiply it by 3, to 3GB.

Are there 28 documents in the entire index?  It's my understanding that
the fieldcache memory required does not depend on the number of
documents that match your query (numFound); it depends on the number of
documents in the entire index.
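
A rough back-of-the-envelope illustration (assuming the classic
FieldCache layout for a single-valued string field, where Lucene keeps
roughly one ordinal per document, up to about 4 bytes each, plus the
unique term bytes):

  8,000,000,000 docs * ~4 bytes/ord ~= ~32 GB per faceted field
  (spread across the shards, plus the term bytes themselves)

At that scale the per-document arrays dominate the memory cost, no
matter that numFound is only 28.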

If my understanding is correct, once that memory structure is built and
stored in the fieldcache, it's available to speed up future facets on
that field, even if the query and filters are different from what was
used the first time.  Storing a facet cache entry that depends on the
specific query wouldn't be as useful for typical use cases.
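
For example (field name hypothetical), both of these requests would
reuse the same fieldcache entry for the 'host' field once the first
one has built it:

  q=time:[2014-11-18T00:00:00Z TO 2014-11-18T23:59:59Z]&facet=true&facet.field=host&facet.method=fc
  q=*:*&facet=true&facet.field=host&facet.method=fc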

Thanks,
Shawn
