lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Java Programmer <>
Subject Speedup indexing process
Date Fri, 17 Feb 2006 10:17:51 GMT
Maybe this question is trivial but I need to ask it. I've some problem with
indexing large number of documents, and I seek for better solution.
Task is to index about 33GB text data CSV (each record about 30kB), it
possible of course to index these data but I'm not very happy with timings
(about 26 hours), so I want to know how can i speed up this process. First I
think about splitting CVS file into smaller ones, eg 5GB and index them on 6
indexing computers, but now is my question - can I join such parts into one
index after indexing jobs on each computer is finished? I saw example wit
RAMDirectory which could be merged with
FSDirectory, but this example was about same IndexWriter, in my case I need
some separate IndexWriters on few computers. So does it possible with

Thx in advance for hints,

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message