lucene-java-user mailing list archives

From "Sudarsan, Sithu D." <Sithu.Sudar...@fda.hhs.gov>
Subject RE: metrics for index ~100M docs
Date Thu, 24 Sep 2009 17:11:21 GMT
 
Hi Joel,

With a doc size of approx. 100K, on a dual quad-core machine (3.0 GHz)
running Windows, we see an average of 1000 docs/sec. This includes text
extraction from PDF docs.
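
As a rough back-of-envelope sketch only: if a rate like 1000 docs/sec held
steadily on a single machine, the build time for ~100M docs can be estimated
as below. The function name is made up for illustration, and the linear
scaling is an assumption; the rate above was measured with much larger docs
plus PDF extraction, so the real figure for ~5k docs may differ considerably.

```python
def estimated_build_hours(num_docs: int, docs_per_sec: float) -> float:
    """Hours needed to index num_docs at a sustained docs_per_sec rate.

    Assumes a constant rate on one machine (no merges, GC pauses or I/O
    stalls slowing things down), which is optimistic.
    """
    if docs_per_sec <= 0:
        raise ValueError("docs_per_sec must be positive")
    seconds = num_docs / docs_per_sec
    return seconds / 3600.0


if __name__ == "__main__":
    # 100M docs at the 1000 docs/sec average quoted above
    hours = estimated_build_hours(100_000_000, 1000)
    print(f"{hours:.1f} hours")  # prints "27.8 hours"
```

So at that rate a full build would take a bit over a day on one box;
treat it as an order-of-magnitude guide rather than a prediction.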

Hope this helps.

Sincerely,
Sithu D Sudarsan


-----Original Message-----
From: Joel Halbert [mailto:joel@su3analytics.com] 
Sent: Thursday, September 24, 2009 11:17 AM
To: Lucene Users
Subject: metrics for index ~100M docs

Hi,

Does anyone know of any recent metrics & stats on building out an index
of ~100M documents (each doc approx. 5k)? I'm looking for approx. stats
on time to build, time to query and infrastructure requirements (number
of machines & spec) to reasonably support an index of such a size.

Thanks, 
Joel


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



