lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Scaling Issues
Date Tue, 29 Jul 2014 20:10:26 GMT
Make sure it isn't doing a Solr commit on each document.
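As a rough illustration of the difference, here is a minimal sketch of a client that sends documents in batches and commits once at the end instead of after every document. It is not what ManifoldCF does internally; the update URL, the batch size of 500, and the helper names (batches, post_json, index_all) are all assumptions made for the example.

```python
import json
import urllib.request


def batches(docs, size):
    """Yield successive lists of at most `size` documents."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_json(url, payload):
    """POST a JSON payload to a Solr update handler and return the HTTP status."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_all(docs, update_url, batch_size=500, send=post_json):
    """Send docs in batches with no commit, then issue one commit at the end."""
    sent = 0
    for batch in batches(docs, batch_size):
        send(update_url, batch)  # documents only, no commit requested
        sent += len(batch)
    send(update_url, {"commit": {}})  # single explicit commit
    return sent
```

A call might look like index_all(my_docs, "http://localhost:8983/solr/collection1/update"), where the host, port, and core name are placeholders for whatever your installation uses.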

Is it slow immediately, like on the first 100 documents, or only after 
a while?

When you do see it indexing very slowly, check the size of the Solr index - 
you should make sure that you have enough system memory available for file 
caching to hold the entire Solr index.
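A small sketch of that check, assuming you know where the index directory lives on disk: it sums the file sizes under the directory and compares the total with a free-memory figure that you supply yourself (for example from your OS's memory monitor). The function names and the example path are hypothetical.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked files are not counted twice.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def fits_in_memory(index_bytes, free_bytes):
    """True if the whole index could be held in the memory left for file caching."""
    return index_bytes <= free_bytes


if __name__ == "__main__":
    # Hypothetical location; adjust to your own Solr data directory.
    index_dir = "/path/to/solr/collection1/data/index"
    size = dir_size_bytes(index_dir)
    print("index size: %.2f GB" % (size / 1024 ** 3))
```

Keep in mind that memory given to the Solr JVM heap is not available for file caching, so the figure to compare against is what is left over after the heap.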

Do you have Solr auto-commit enabled?

-- Jack Krupansky

-----Original Message----- 
From: Ameya Aware
Sent: Tuesday, July 29, 2014 3:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Scaling Issues

I am using the Apache ManifoldCF framework, which connects to my local system
and passes all the documents on the C drive to Solr.

I am not doing any searches while indexing.

There is a total of 362GB of data that needs to be indexed. I am not performing
any complex analysis.

Thanks,
Ameya




On Tue, Jul 29, 2014 at 2:49 PM, Toke Eskildsen <te@statsbiblioteket.dk>
wrote:

> Ameya Aware [ameya.aware@gmail.com] wrote:
>
> [Solr -Xmx5120m]
>
> > I need to index around 300000 documents, but with the above parameters
> > performance is very poor, around 15000-20000 documents per hour.
>
> 4-5 documents/second is a lot less than the numbers people normally cite,
> but we need to know more about what you are doing in order to help.
>
> One common reason for unexpectedly slow indexing is slow data extraction.
> Where does your data come from and is it possible to perform a run where
> you do not index but just extract, and measure how long that takes?
>
> Is your index being used for searches while indexing? If so, how many
> searches/second?
>
> How large are the documents you index? How large is your total index? Do
> you perform any complex analysis as part of the indexing?
>
> - Toke Eskildsen
> 
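One way to approximate the extract-only run suggested above is to time how long it takes just to read every file under the source directory, without sending anything to Solr. This is only a sketch: it measures raw disk reads, not whatever parsing ManifoldCF or Solr performs on the content, and the function name and chunk size are assumptions.

```python
import os
import time


def time_raw_reads(root, chunk_size=1 << 20):
    """Read every file under `root` and return (files, bytes_read, seconds)."""
    start = time.perf_counter()
    files = 0
    bytes_read = 0
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
            except OSError:
                # Unreadable files (locked, permission denied) are skipped.
                continue
            files += 1
    return files, bytes_read, time.perf_counter() - start


def per_hour(count, seconds):
    """Convert a count over `seconds` into a per-hour rate."""
    return count * 3600.0 / seconds
```

If the files-per-hour figure from a run like this is close to the indexing rate you are seeing, the bottleneck is probably reading the data rather than Solr itself.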

