lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Details on setting block parameters for Lucene41PostingsFormat
Date Tue, 13 Jan 2015 19:20:06 GMT
: 
: The first int to Lucene41PostingsFormat is the min block size (default
: 25) and the second is the max (default 48) for the block tree terms
: dict.

we were discussing over on the solr-user mailing list how Tom would/could 
go about configuring Solr to use a custom subclass of 
Lucene41PostingsFormat where he overrode those min/max constructor params, 
but i realized i have no idea how he's supposed to leverage the plumbing in 
PostingsFormat to override the "name" of the format so it's used properly 
in SPI.

Lucene41PostingsFormat's constructor options only allow overriding the 
block sizes, not the "name" that gets propagated up to the PostingsFormat() 
constructor ... so what is the expected way to write a subclass?
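
To make the problem concrete, here is a minimal sketch of the constructor plumbing described above. It does not use Lucene at all: NamedFormat, FixedNameFormat and TunedFormat are hypothetical stand-ins for PostingsFormat, Lucene41PostingsFormat and a custom subclass, and the block sizes 100/200 are arbitrary illustration values. It only assumes what is stated above, namely that the name is passed up by the parent constructor and that the parent exposes nothing but the two block sizes.

```java
public class NamePlumbingSketch {

    // Hypothetical stand-in for PostingsFormat: the name is fixed by
    // whatever string reaches this constructor.
    static abstract class NamedFormat {
        private final String name;

        protected NamedFormat(String name) {
            this.name = name;
        }

        public final String getName() {
            return name;
        }
    }

    // Hypothetical stand-in for Lucene41PostingsFormat: the constructors
    // accept block sizes only and always pass the same name upward.
    static class FixedNameFormat extends NamedFormat {
        final int minBlockSize;
        final int maxBlockSize;

        FixedNameFormat() {
            this(25, 48); // defaults quoted earlier in this thread
        }

        FixedNameFormat(int minBlockSize, int maxBlockSize) {
            super("Lucene41"); // the name is hard-coded here
            this.minBlockSize = minBlockSize;
            this.maxBlockSize = maxBlockSize;
        }
    }

    // Hypothetical custom subclass: it can change the block sizes, but it
    // has no constructor argument through which to supply its own name.
    static class TunedFormat extends FixedNameFormat {
        TunedFormat() {
            super(100, 200); // arbitrary illustration values
        }
    }

    public static void main(String[] args) {
        TunedFormat tuned = new TunedFormat();
        // Prints "Lucene41 100 200": new block sizes, but the parent's name.
        System.out.println(tuned.getName() + " "
                + tuned.minBlockSize + " " + tuned.maxBlockSize);
    }
}
```

In this model the subclass ends up registered under the parent's name, which is exactly the ambiguity the question is about: a name-based lookup has no way to tell the tuned variant from the default one.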


: On Fri, Jan 9, 2015 at 4:15 PM, Tom Burton-West <tburtonw@umich.edu> wrote:
: > Hello all,
: >
: > We have over 3 billion unique terms in our indexes and with Solr 3.x we set
: > the TermIndexInterval to about 8 times its default value in order to index
: > without OOMs.  (
: > http://www.hathitrust.org/blogs/large-scale-search/too-many-words-again)
: >
: > We are now working with Solr 4 and running into memory issues and are
: > wondering if we need to do something analogous for Solr 4.
: >
: > The javadoc for IndexWriterConfig (
: > http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/index/IndexWriterConfig.html#setTermIndexInterval%28int%29
: > )
: > indicates that the lucene 4.1 postings format has some parameters which may
: > be set:
: > "..To configure its parameters (the minimum and maximum size for a block),
: > you would instead use Lucene41PostingsFormat.Lucene41PostingsFormat(int,
: > int)
: > <https://lucene.apache.org/core/4_10_2/core/org/apache/lucene/codecs/lucene41/Lucene41PostingsFormat.html#Lucene41PostingsFormat%28int,%20int%29>
: > "
: >
: > Is there documentation or discussion somewhere about how to determine
: > appropriate parameters or some detail about what setting the maxBlockSize
: > and minBlockSize does?
: >
: > Tom Burton-West
: > http://www.hathitrust.org/blogs/large-scale-search
: 
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
: For additional commands, e-mail: java-user-help@lucene.apache.org
: 
: 

-Hoss
http://www.lucidworks.com/
