lucene-java-user mailing list archives

From Igor Shalyminov <ishalymi...@yandex-team.ru>
Subject Lucene multithreaded indexing problems
Date Thu, 21 Nov 2013 15:45:05 GMT
Hello!

I tried to index with multiple threads, using a FixedThreadPool of Callable workers.
The main operation, parsing a single document and calling addDocument() on the index, is done by
a single worker.
Parsing a document produces a lot (really a lot) of Strings, and at the end of the worker's
call() all of them go to the indexWriter.
I use no merging; the resources are flushed to disk when the segment size limit is reached.
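
A minimal sketch of the kind of setup described above, for illustration only. It assumes one Callable per document and a writer shared by all workers. The names IndexingSketch, DocumentSink, parse() and indexAll() are hypothetical and are not part of Lucene; DocumentSink merely stands in for the shared indexWriter so the example runs without Lucene on the classpath, and parse() is a trivial whitespace splitter rather than the real parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class IndexingSketch {

    /** Hypothetical stand-in for the shared writer (not a Lucene type). */
    interface DocumentSink {
        void addDocument(List<String> fields) throws Exception;
    }

    /** Hypothetical parser: turns one raw document into many Strings. */
    static List<String> parse(String raw) {
        List<String> tokens = new ArrayList<>();
        for (String t : raw.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /**
     * Submits one Callable per document to a fixed-size pool. Each worker
     * parses its document and hands the result to the shared sink at the
     * end of call(). Returns the total number of Strings produced.
     */
    static int indexAll(List<String> rawDocs, DocumentSink sink, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String raw : rawDocs) {
                tasks.add(() -> {
                    List<String> fields = parse(raw);
                    sink.addDocument(fields); // the sink must be safe to share across threads
                    return fields.size();
                });
            }
            int total = 0;
            // invokeAll blocks until every task has finished
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                total += f.get(); // rethrows any worker failure
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger added = new AtomicInteger();
        List<String> docs = Arrays.asList("a b c", "d e", "f");
        int strings = indexAll(docs, fields -> added.incrementAndGet(), 2);
        System.out.println(added.get() + " documents, " + strings + " strings");
    }
}
```

Note that invokeAll() takes the whole task list up front, so in this sketch every raw document is held in memory until its task completes.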

The problem is that after a little while (when most of the heap memory is in use) the indexer makes
no progress, and CPU load stays at a constant 100% (whether there are 2 threads or 32). So
I think at some point garbage collection brings the whole indexing process to a halt.

Could you please give some advice on proper concurrent indexing with Lucene?
Can there be "memory leaks" somewhere in the indexWriter? Maybe I must perform some operations
on the writer from time to time to release unused resources?


-- 
Best Regards,
Igor
