lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doug Cutting <cutt...@apache.org>
Subject Re: Split an existing index into smaller segments without a re-index?
Date Thu, 05 Aug 2004 02:20:48 GMT
Kevin A. Burton wrote:
> Is it possible to take an existing index (say 1G) and break it up into a 
> number of smaller indexes (say 10 100M indexes)...
> 
> I don't think theres currently an API for this but its certainly 
> possible (I think).

Yes, it is theoretically possible but not yet implemented.

An easy way to implement it would be to subclass FilterIndexReader to 
return a subset of documents, then use IndexWriter.addIndexes() to write 
out each subset as a new index.  Subsets could be ranges of document 
numbers, and one could use TermPositions.skipTo() to accelerate the 
TermPositions subset implementation, but this still wouldn't be quite as 
fast as an index splitter that only reads each TermPositions once.  If 
we added a lower-level index writing API then one could use that to 
implement this...

Doug



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message