lucene-solr-user mailing list archives

From "Salikeen, Obaid" <Obaid.Salik...@iacpublishinglabs.com>
Subject How to retrieve 200K documents from Solr 4.10.2
Date Wed, 12 Oct 2016 21:46:11 GMT
Hi,

I am using Solr 4.10.2. I have 200K documents sitting on a Solr cluster (it has 3 nodes), and
let me first state that I am new to Solr. I want to retrieve all documents from Solr (essentially
just one field from each document).

What is the best way of fetching this much data without overloading the Solr cluster?


Approach I tried:
I tried calling the following API once a minute, fetching a batch of 1000 documents on each
call. On each run, I set start to the new offset, i.e., the previous value plus 1000.
http://SOLR_HOST/solr/abc/select?q=*:*&fq=&start=1&rows=1000&fl=url&wt=csv&csv.header=false&hl=false
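
The batching described above can be sketched roughly as follows. This is only an illustration of the start/rows loop, not a recommended solution: the function name batch_urls is hypothetical, the total of 200000 is taken from the document count mentioned earlier, and the empty fq= parameter is left out. Solr's start parameter is a zero-based offset, so the sketch begins at 0 rather than 1.

```python
from urllib.parse import urlencode


def batch_urls(host, core, total_docs, batch_size=1000):
    """Yield one select URL per batch, advancing start by batch_size.

    batch_urls is a hypothetical helper; it only builds URLs and does
    not contact Solr.
    """
    for start in range(0, total_docs, batch_size):
        params = {
            "q": "*:*",
            "start": start,          # zero-based offset of the first row
            "rows": batch_size,      # number of documents per batch
            "fl": "url",             # return only the url field
            "wt": "csv",
            "csv.header": "false",
            "hl": "false",
        }
        # safe="*:" keeps the match-all query readable as q=*:*
        query = urlencode(params, safe="*:")
        yield "http://%s/solr/%s/select?%s" % (host, core, query)


# SOLR_HOST and abc are the placeholders used in the URL above.
urls = list(batch_urls("SOLR_HOST", "abc", 200000))
```

With 200K documents and batches of 1000, this produces 200 requests, the last one starting at offset 199000.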

However, with the above approach, I have two issues:

1. The Solr cluster gets overloaded, i.e., it slows down.

2. I am not sure whether start=X&rows=1000 gives me the correct results (changing to
rows=2 or rows=4 gives me totally different results, which is why I am not confident that I
will get the correct results).


Thanks
Obaid

