lucene-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Spliting index
Date Fri, 06 Jul 2007 22:17:02 GMT
You can implement a FilterIndexReader that returns only a subset of an 
index.  Then use IndexWriter#addIndexes() to add this to a new, empty 
index.  Do this for each range of terms.
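
The filter-then-copy loop described above can be sketched in plain Java. This is a toy model, not Lucene code: the "index" is assumed to be a sorted map from term text to document IDs, and the names SplitIndexSketch, filteredView, addIndexes and split are hypothetical stand-ins for a FilterIndexReader subclass and IndexWriter#addIndexes().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: an "index" is a sorted map from term text to the IDs of
// the documents containing that term. None of these types are Lucene classes.
class SplitIndexSketch {

    // Stands in for the FilterIndexReader subclass: a read-only view that
    // exposes only terms whose first character lies in [from, to].
    // Assumes to < Character.MAX_VALUE so that (to + 1) does not wrap around.
    static SortedMap<String, List<Integer>> filteredView(
            SortedMap<String, List<Integer>> index, char from, char to) {
        // subMap is half-open, so the upper bound is the character after 'to'.
        return index.subMap(String.valueOf(from), String.valueOf((char) (to + 1)));
    }

    // Stands in for IndexWriter#addIndexes(): copy everything the view
    // exposes into a new, initially empty index.
    static SortedMap<String, List<Integer>> addIndexes(
            SortedMap<String, List<Integer>> view) {
        SortedMap<String, List<Integer>> target = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : view.entrySet()) {
            target.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return target;
    }

    // Repeats filter-then-copy once per term range, producing one index per range.
    static List<SortedMap<String, List<Integer>>> split(
            SortedMap<String, List<Integer>> index, char[][] ranges) {
        List<SortedMap<String, List<Integer>>> parts = new ArrayList<>();
        for (char[] r : ranges) {
            parts.add(addIndexes(filteredView(index, r[0], r[1])));
        }
        return parts;
    }

    public static void main(String[] args) {
        SortedMap<String, List<Integer>> index = new TreeMap<>();
        index.put("apple", Arrays.asList(0, 2));
        index.put("avocado", Arrays.asList(1));
        index.put("banana", Arrays.asList(0));
        index.put("cherry", Arrays.asList(1, 2));

        // One range per output index: terms in 'a', then terms in 'b'..'z'.
        char[][] ranges = { {'a', 'a'}, {'b', 'z'} };
        List<SortedMap<String, List<Integer>>> parts = split(index, ranges);
        System.out.println(parts.get(0).keySet());
        System.out.println(parts.get(1).keySet());
    }
}
```

The model leaves out everything a real index holds besides terms and postings (stored fields and norms, for example). The point is only the shape of the loop: the view decides what is visible, and the copy step writes a fresh index, so no files or pointers are edited by hand.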

This is somewhat similar to Nutch's IndexSorter:

http://svn.apache.org/viewvc/lucene/nutch/trunk/src/java/org/apache/nutch/indexer/IndexSorter.java?view=markup

Note that IndexWriter#addIndexes() doesn't require that all IndexReader 
methods be implemented.

Doug

Daniel Creão wrote:
> I'd like to split my Lucene index into smaller segments, each one holding all
> terms starting with the same character.
> 
> I started writing the Terms and TermInfos, but I'm worried about the other
> files and especially the pointers.
> 
> What should I watch out for while splitting the index?
> 
> - Daniel
> 
