lucene-solr-user mailing list archives

From Shawn Heisey
Subject Re: How fast indexing?
Date Tue, 22 Mar 2016 02:14:15 GMT
On 3/21/2016 7:48 PM, Amit Jha wrote:
> When I run the same sql on DB it takes only 1 sec. And 6-7 documents
> are getting indexed per second.

That's really slow.  It seems likely that you are having extreme
performance issues due to garbage collection problems, possibly from a
heap that needs to be larger.  I will need a lot more information about
your hardware/Solr setup to figure anything out.  Some info that might
be useful:

* Solr version.
* RAM installed in each machine.
* The max heap size on each machine.
* The amount of index data contained on each machine.
* How many Solr documents live on each machine.
* Anything else you can think of that might be helpful.

> As I've 4 node solrCloud setup, can I run 4 import handler to index
> the same data? Will it not over write?

DIH is generally not the best way to index to SolrCloud.  The DIH
feature was created *long* before SolrCloud ever existed -- it was
designed for single-core indexes.  The best option for indexing to
SolrCloud is a SolrJ program using CloudSolrClient, or another program
that can create indexing requests you can send to the /update handler,
ideally having multiple requests in parallel.
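
A minimal sketch of the second option (a program that sends requests to
the /update handler in parallel), using only the Python standard
library.  The URL, the collection name "mycollection", the batch size
and the worker count are all assumptions for illustration, not values
from this thread.  It posts each batch as a JSON array of documents and
does not issue a commit, so it relies on autoCommit or a separate
explicit commit.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed endpoint: host, port and collection name are placeholders.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"


def batches(docs, size):
    """Yield successive lists of at most `size` documents."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_request(batch, url=SOLR_UPDATE_URL):
    """Build a POST request carrying one batch as a JSON array of documents."""
    return urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_batch(batch, url=SOLR_UPDATE_URL):
    """Send one batch to Solr and return the HTTP status code."""
    with urllib.request.urlopen(build_request(batch, url), timeout=60) as resp:
        return resp.status


def index_parallel(docs, batch_size=500, workers=4, send=send_batch):
    """Send batches concurrently; returns one result per batch, in order.

    Executor.map submits every batch up front, so very large inputs
    should be fed in chunks rather than all at once.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, batches(docs, batch_size)))
```

Calling index_parallel(rows) with rows fetched from the database (each
row a dict whose keys match the Solr schema fields) keeps several
update requests in flight at once.  Larger batches and more workers
usually help up to the point where Solr or the database becomes the
bottleneck.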

> 10-20k is very high in numbers, where can I get the actual size of document.

You'd need to check your database and add up the sizes of all the
columns that Solr is indexing for a typical document.
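
As a rough illustration of that arithmetic, the sketch below sums the
UTF-8 byte length of every column value in one row.  The column names
and values in sample_row are made up, and the result approximates the
size of the source text only, not the database's on-disk storage or the
resulting Solr index size.

```python
def estimate_doc_bytes(row):
    """Sum the UTF-8 byte length of every non-null column value in one row."""
    total = 0
    for value in row.values():
        if value is None:
            continue  # NULL columns contribute nothing
        if isinstance(value, bytes):
            total += len(value)  # binary columns: count raw bytes
        else:
            total += len(str(value).encode("utf-8"))
    return total


# Hypothetical row; replace with a typical row from your own query.
sample_row = {
    "id": 42,
    "title": "How fast indexing?",
    "body": "x" * 1000,
    "notes": None,
}
print(estimate_doc_bytes(sample_row))
```

Running this over a sample of rows and averaging gives a more
representative figure than a single document.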

