lucene-solr-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: How to configure Solr PostingsFormat block size
Date Mon, 12 Jan 2015 20:54:11 GMT
It looks like this is a good starting point:

http://wiki.apache.org/solr/SolrConfigXml#codecFactory

-Mike

On 01/12/2015 03:37 PM, Tom Burton-West wrote:
> Hello all,
>
> Our indexes have around 3 billion unique terms, so for Solr 3 we set
> TermIndexInterval to about 8 times the default.  The net effect is to
> reduce the size of the in-memory term index to about 1/8th of its
> default size.  (For background, see
> http://www.hathitrust.org/blogs/large-scale-search/too-many-words-again )
>
> We would like to do something similar for Solr 4.
>
> The Lucene 4.10.2 JavaDoc for setTermIndexInterval suggests how this can be
> done by setting the minimum and maximum size for a block in Lucene code (
> http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/index/IndexWriterConfig.html#setTermIndexInterval%28int%29
> )
> "For example, Lucene41PostingsFormat
> <http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/codecs/lucene41/Lucene41PostingsFormat.html>
> implements the term index instead based upon how terms share prefixes. To
> configure its parameters (the minimum and maximum size for a block), you
> would instead use Lucene41PostingsFormat.Lucene41PostingsFormat(int, int)
> <http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/codecs/lucene41/Lucene41PostingsFormat.html#Lucene41PostingsFormat%28int,%20int%29>.
> which can also be configured on a per-field basis"
>
> How can we configure Solr to use different (i.e. non-default) minimum and
> maximum block sizes?
>
> Tom
>
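
For readers who want to check the Solr 3 arithmetic in the quoted message, here is a minimal sketch. It models only the number of terms sampled into the in-memory term index, not bytes, and it assumes a default term index interval of 128 (my assumption for Lucene 3.x; the message does not state the value). The class and method names are hypothetical. It does not apply to the Solr 4 block-based term index that the question is about.

```java
// Back-of-the-envelope model of the Solr 3 / Lucene 3.x in-memory term index:
// one of every `termIndexInterval` terms is sampled into RAM.
final class TermIndexEstimate {

    // ASSUMPTION: 128 is taken as the Lucene 3.x default term index interval.
    static final int ASSUMED_DEFAULT_INTERVAL = 128;

    /** Number of terms held in memory when every interval-th term is sampled. */
    static long indexedTerms(long uniqueTerms, int termIndexInterval) {
        if (uniqueTerms < 0 || termIndexInterval < 1) {
            throw new IllegalArgumentException(
                "need uniqueTerms >= 0 and termIndexInterval >= 1");
        }
        // Ceiling division: a partial final interval still contributes one sampled term.
        return (uniqueTerms + termIndexInterval - 1) / termIndexInterval;
    }

    public static void main(String[] args) {
        long uniqueTerms = 3_000_000_000L;         // "around 3 billion" from the message
        int tuned = 8 * ASSUMED_DEFAULT_INTERVAL;  // "about 8 times the default"
        long before = indexedTerms(uniqueTerms, ASSUMED_DEFAULT_INTERVAL);
        long after = indexedTerms(uniqueTerms, tuned);
        System.out.printf("default interval %d -> %d sampled terms%n",
                ASSUMED_DEFAULT_INTERVAL, before);
        System.out.printf("tuned interval %d -> %d sampled terms%n", tuned, after);
        System.out.printf("ratio: %.3f%n", (double) after / before);
    }
}
```

Under these assumptions, raising the interval eightfold leaves roughly one eighth as many sampled terms in memory, which matches the effect described above.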

