lucene-java-user mailing list archives

From "Garrett Heaver" <garrett.hea...@researchandmarkets.com>
Subject addIndexes() Size
Date Mon, 06 Dec 2004 10:03:24 GMT
Hi.

It's probably really simple to explain this, but since I'm not up to speed on
the way Lucene stores its data, I'm a little confused.

I'm building an index that resides on Server A, with the Lucene service
running on Server B. Without boring you with the details: because of the
network transfer rate, I keep the actual index at \\ServerA\idx and build a
temporary index at \\ServerB\idx\temp (the local file system is much faster
for the service). I then call addIndexes to import the temporary index into
the ServerA index, destroy the ServerB index, wait for a bit, and check for
new documents.

All works grand, BUT the resulting index on ServerA is HUGE compared with
one I'd build from start to finish (i.e. a simple addDocument index): 38 GB
for 220,000 Unstored items cannot be right. To give you an idea of how mad
this seems, the backed-up version of the database from which the data is
pulled is only 2 GB.
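
To put those figures side by side, here is a minimal back-of-the-envelope sketch. It assumes "gig" means GiB (1024^3 bytes), and the class and method names are made up for illustration. Under that assumption the index works out to roughly 180 KB per item, against under 10 KB per item for the database backup.

```java
// Rough per-document size check for the figures quoted above.
// Assumption: "gig" is read as GiB (1024^3 bytes).
// IndexSizeEstimate and bytesPerDoc are hypothetical names, not Lucene API.
class IndexSizeEstimate {
    static final long GIB = 1024L * 1024L * 1024L;

    /** Average bytes per document (integer division, rounds down). */
    static long bytesPerDoc(long totalBytes, long docCount) {
        if (docCount <= 0) {
            throw new IllegalArgumentException("docCount must be positive");
        }
        return totalBytes / docCount;
    }

    public static void main(String[] args) {
        long docs = 220_000L;            // items in the index
        long indexBytes = 38L * GIB;     // size of the ServerA index
        long backupBytes = 2L * GIB;     // size of the database backup

        System.out.println("index bytes/doc:    " + bytesPerDoc(indexBytes, docs));
        System.out.println("backup bytes/doc:   " + bytesPerDoc(backupBytes, docs));
        System.out.println("index/backup ratio: " + (double) indexBytes / backupBytes);
    }
}
```

The comparison is only a sanity check: a database backup and an index hold different things, so the ratio shows the scale of the discrepancy rather than its cause.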

I've considered that the cause might be the number of items integrated each
time addIndexes is called. Right now I'm adding around 10,000 at a time (I
had tried 1,000 at a time, but that looked like it was going to end up even
larger still).
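
One way to test whether the growth really tracks the batch size would be to record the on-disk size of the index directory after each addIndexes call. The sketch below is only an illustration of that idea: it uses plain java.nio rather than any Lucene API, and the class name and the command-line argument are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper (not part of Lucene): sums the sizes of all regular
// files under a directory, so the index size can be logged after each batch.
class IndexDirSize {
    static long totalBytes(Path dir) {
        try (Stream<Path> entries = Files.walk(dir)) {
            return entries.filter(Files::isRegularFile)
                          .mapToLong(p -> p.toFile().length())
                          .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass the index directory as the first argument; defaults to the
        // current directory if none is given.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(indexDir + ": " + totalBytes(indexDir) + " bytes");
    }
}
```

Logging this value after every batch, for both batch sizes, would show whether the size climbs steadily or jumps at particular calls.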

I'm holding off on twiddling minMergeDocs and mergeFactor until I have a
better understanding of what's going on here.

Many thanks for any replies.

Garrett