lucene-java-user mailing list archives

From "Krishnamurthy, Kannan" <Kannan.Krishnamur...@contractor.cengage.com>
Subject Huge FacetArrays while using SortedSetDocValuesAccumulator
Date Mon, 26 Aug 2013 20:45:36 GMT
Hello, 

We are working with a large Lucene 4.3.0 index, using SortedSetDocValuesFacetFields to create
facets and SortedSetDocValuesAccumulator for facet accumulation. We couldn't use a taxonomy-based
facet implementation: we search through a MultiReader and our index is composed of
multiple physical Lucene indices, so we cannot have a single taxonomy index. We have two
million categories and expect to have another two million in the near future. As the current
implementation of SortedSetDocValuesAccumulator does not support ReusingFacetArrays, we are
concerned about potential garbage-collection performance issues in our high-traffic
application. Will a future Lucene release support ReusingFacetArrays in SortedSetDocValuesAccumulator?
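
To put rough numbers on this concern, here is a minimal sketch in plain Java. It assumes each
facet request allocates one int counter per category, which is our reading of how the count
arrays are sized, not something confirmed against the Lucene source. CountArrayPool is a
hypothetical class written for illustration, not a Lucene API; it only shows the kind of reuse
we would like to get from ReusingFacetArrays.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

// Hypothetical illustration only: NOT part of Lucene.
final class CountArrayPool {
    private final int arrayLength;   // one int counter per category (assumption)
    private final int maxPooled;     // upper bound on idle arrays kept around
    private final ArrayDeque<int[]> free = new ArrayDeque<>();

    CountArrayPool(int arrayLength, int maxPooled) {
        this.arrayLength = arrayLength;
        this.maxPooled = maxPooled;
    }

    // Returns a zeroed array, reusing a pooled one when available.
    int[] acquire() {
        synchronized (free) {
            int[] pooled = free.poll();
            if (pooled != null) {
                return pooled;
            }
        }
        return new int[arrayLength];
    }

    // Clears the array and returns it to the pool, unless the pool is full.
    void release(int[] counts) {
        if (counts.length != arrayLength) {
            return; // not one of ours; let the GC reclaim it
        }
        Arrays.fill(counts, 0); // clear outside the lock to keep contention low
        synchronized (free) {
            if (free.size() < maxPooled) {
                free.push(counts);
            }
        }
    }

    // Bytes needed for one count array, ignoring the object header.
    static long bytesPerArray(int categoryCount) {
        return (long) categoryCount * Integer.BYTES;
    }

    public static void main(String[] args) {
        System.out.println("2M categories: " + bytesPerArray(2_000_000) + " bytes per request");
        System.out.println("4M categories: " + bytesPerArray(4_000_000) + " bytes per request");

        CountArrayPool pool = new CountArrayPool(2_000_000, 4);
        int[] counts = pool.acquire();
        counts[42]++;               // accumulate a count for some ordinal
        pool.release(counts);       // cleared and kept for the next request
        System.out.println("reused: " + (pool.acquire() == counts));
    }
}
```

Without some form of reuse, every search allocates and discards an array of this size, which
is what worries us under high traffic.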

As an alternative, we are considering subclassing FacetIndexingParams to provide dimension-specific
CategoryListParams at indexing time. This would reduce the size of the
FacetArrays per facet request. We realize this approach will not support multiple FacetRequests
in a single SortedSetDocValuesAccumulator, as the SortedSetDocValuesReaderState constructor hardcodes
the category to null when calling FacetIndexingParams.getCategoryListParams(null).
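
The sizing benefit of the per-dimension idea can be illustrated with simple arithmetic. The
dimension names and category counts below are hypothetical, chosen only so that they add up to
two million, and the sketch again assumes one int counter per category. It uses no Lucene
classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: dimension names and sizes are made up.
final class PerDimensionSizing {

    // Bytes for one count array with one int per category (assumption).
    static long bytes(long categoryCount) {
        return categoryCount * Integer.BYTES;
    }

    // Total number of categories across all dimensions.
    static long totalCategories(Map<String, Integer> categoriesPerDimension) {
        long total = 0;
        for (int count : categoriesPerDimension.values()) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> dims = new LinkedHashMap<>();
        dims.put("author", 1_200_000);
        dims.put("subject", 700_000);
        dims.put("publisher", 100_000);

        // A single shared category list: every request pays for all categories.
        System.out.println("shared array: " + bytes(totalCategories(dims)) + " bytes");

        // Dimension-specific category lists: a request pays only for its dimension.
        for (Map.Entry<String, Integer> e : dims.entrySet()) {
            System.out.println(e.getKey() + " only: " + bytes(e.getValue()) + " bytes");
        }
    }
}
```

The saving depends entirely on how evenly the categories spread across dimensions; a single
dominant dimension would still need an array nearly as large as the shared one.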


Are there better approaches to this problem?


Thanks in advance for any help. 

Kannan
Cengage Learning

