lucene-solr-user mailing list archives

From Raymond Xie <>
Subject How to do multi-threading indexing on huge volume of JSON files?
Date Wed, 09 May 2018 03:25:36 GMT
I have a huge number of JSON files to index in Solr. It takes 22 minutes
to index 300,000 JSON files, all generated from a single bz2 file. That is
only 0.25% of the total data for one business flow, and there are 100+
business flows to be indexed.
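
To put those numbers in perspective, here is a rough back-of-the-envelope estimate. It assumes throughput stays constant as the data grows, which may not hold in practice; the variable names are only for illustration.

```python
# Figures taken from the message above.
files_in_sample = 300_000
minutes_for_sample = 22
sample_percent = 0.25  # the sample is 0.25% of one business flow

# Current single-threaded throughput.
files_per_second = files_in_sample / (minutes_for_sample * 60)

# Extrapolation to one full business flow (assumes linear scaling).
total_files = files_in_sample * 100 / sample_percent
total_days = minutes_for_sample * 100 / sample_percent / 60 / 24

print(f"{files_per_second:.0f} files/s, "
      f"{total_files:,.0f} files per flow, "
      f"{total_days:.1f} days per flow")
```

That works out to roughly 227 files per second, about 120 million files per flow, and about 6 days of indexing per flow at the current rate.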

I really need a good solution for this. At the moment I use post.jar to
index a folder, and I run post.jar in a single thread.

What is the best practice for multi-threaded indexing? Can anyone provide
a detailed example?
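
For illustration, the kind of thing I have in mind is sketched below: a thread pool where each worker sends a batch of files to Solr's JSON update handler as one request. This is only a sketch under assumptions, not a tested setup. The URL, collection name, batch size, and worker count are made up, and it assumes each file holds a single JSON document that matches the schema.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from pathlib import Path
from urllib.request import Request, urlopen

# Assumed values, not from my actual setup: adjust as needed.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"
BATCH_SIZE = 1000
WORKERS = 8


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def post_batch(paths, url=SOLR_UPDATE_URL):
    """Load one batch of JSON files and send them as a single JSON array."""
    docs = [json.loads(Path(p).read_text(encoding="utf-8")) for p in paths]
    body = json.dumps(docs).encode("utf-8")
    req = Request(url, data=body, headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        resp.read()  # drain the response; HTTP errors raise an exception
    return len(docs)


def index_folder(folder, send=post_batch, workers=WORKERS, batch_size=BATCH_SIZE):
    """Index every *.json file in `folder` using a pool of worker threads.

    Returns the number of documents sent. `send` is pluggable so the
    batching logic can be exercised without a running Solr instance.
    """
    files = sorted(Path(folder).glob("*.json"))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send, batches(files, batch_size)))
```

Calling `index_folder("/path/to/json")` would send the files in batches of 1000 across 8 threads. The sketch does not issue a commit, so it relies on autoCommit being configured or on a final update request with `commit=true`. Batching many documents per request, rather than one request per file, is likely to matter at least as much as the thread count.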

*Sincerely yours,*

