lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr server requirements for 100+ million documents
Date Fri, 24 Jan 2014 20:58:42 GMT
Can't be done with the information you provided, and can only
be guessed at even with more comprehensive information.

Here's why:

http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Also, at a guess, your indexing is slow because of data acquisition;
I rather doubt you're being limited by raw Solr indexing. If you're
using SolrJ, try commenting out the server.add() call and running
again. My guess is that your indexing time will be almost unchanged,
in which case the data acquisition process is where you should
concentrate your efforts. As a comparison, I can index 11M Wikipedia
docs on my laptop in 45 minutes without any attempts at
parallelization.
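
A minimal sketch of that experiment, assuming nothing about the real loader: it uses no SolrJ classes at all, and the names IndexTimingSketch, timeRun, docsPerSecond and slowSource are hypothetical. The sink stands in for the server.add() call and the supplier stands in for the database read, so running the same loop with a no-op sink is the equivalent of commenting out server.add().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class IndexTimingSketch {

    /** Pulls count records from source, hands each to sink, and returns elapsed nanoseconds. */
    public static <T> long timeRun(Supplier<T> source, Consumer<T> sink, int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            T doc = source.get();  // data acquisition (database read, joins, ...)
            sink.accept(doc);      // indexing step; stands in for server.add(doc)
        }
        return System.nanoTime() - start;
    }

    /** Simple throughput helper: documents divided by elapsed seconds. */
    public static double docsPerSecond(long docs, double seconds) {
        return docs / seconds;
    }

    public static void main(String[] args) {
        int count = 200; // small hypothetical batch, just for the demonstration

        // Hypothetical slow source: pretend each record costs about 1 ms to fetch.
        Supplier<String> slowSource = () -> {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "doc";
        };

        // Run 1: acquisition plus a cheap in-memory "index" (placeholder for server.add()).
        List<String> indexed = new ArrayList<>();
        long withIndexing = timeRun(slowSource, indexed::add, count);

        // Run 2: same acquisition, indexing step disabled (the "commented out" case).
        long withoutIndexing = timeRun(slowSource, doc -> { }, count);

        System.out.printf("with indexing:    %d ms%n", withIndexing / 1_000_000);
        System.out.printf("without indexing: %d ms%n", withoutIndexing / 1_000_000);

        // Throughput implied by the figures quoted in this thread.
        System.out.printf("11M docs in 45 min: %.0f docs/s%n",
                docsPerSecond(11_000_000L, 45 * 60));
        System.out.printf("10M docs in 12 h:   %.0f docs/s%n",
                docsPerSecond(10_000_000L, 12 * 3600));
    }
}
```

If the two timings come out nearly equal, the time is going into fetching the data rather than into indexing. For scale, the figures in this thread work out to roughly 4,074 docs/s for the laptop run and at most about 231 docs/s for the 12+ hour run.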


Best,
Erick

On Fri, Jan 24, 2014 at 12:10 PM, Susheel Kumar
<susheel.kumar@thedigitalgroup.net> wrote:
> Hi,
>
> Currently we are indexing 10 million documents from a database (10 db
> data entities) & the index size is around 8 GB on a Windows virtual box.
> Indexing in one shot takes 12+ hours, while indexing in parallel in
> separate cores & merging them together takes 4+ hours.
>
> We are looking to scale to 100+ million documents and are looking for
> recommendations on server requirements for the parameters below for a
> production environment. There can be 200+ users performing searches at
> the same time.
>
> No of physical servers (considering solr cloud)
> Memory requirement
> Processor requirement (# cores)
> Linux as OS as opposed to Windows
>
> Thanks in advance.
> Susheel
>
