lucene-general mailing list archives

From Ted Dunning <>
Subject Re: Lucene Sizing Metrics
Date Fri, 29 Mar 2013 09:12:00 GMT
It all depends on your data and your policies.

That much data is not a good fit for a single machine, but is quite
plausible for SolrCloud.

I would recommend that you run some experiments with different trade-offs.
It is common for a Lucene index to be a fraction of the size of the
original text, which would mean that your final index would be several
terabytes. That might require dozens to hundreds of instances to search
and/or maintain effectively. The error bars on such an estimate, however,
are huge, and you should test it for yourself.
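
To make the arithmetic behind that estimate concrete, here is a rough
back-of-the-envelope sketch. Everything in it is an assumption for
illustration: the function name, the 300 TB of raw text, the index-to-text
ratios, the 100 GB of index per instance, and the replica count are all
placeholders, not measured Lucene or SolrCloud figures. Substitute the
numbers you get from your own experiments.

```python
import math


def estimate_cluster(raw_tb, index_ratio, per_instance_tb, replicas=1):
    """Very rough sizing estimate; all inputs are assumptions to be measured.

    raw_tb          -- size of the original text, in terabytes
    index_ratio     -- index size as a fraction of the raw text size
    per_instance_tb -- index size one instance can comfortably serve
    replicas        -- number of copies of each shard

    Returns (index_tb, shards, instances).
    """
    index_tb = raw_tb * index_ratio
    # Each shard is assumed to hold at most per_instance_tb of index.
    shards = math.ceil(index_tb / per_instance_tb)
    instances = shards * replicas
    return index_tb, shards, instances


if __name__ == "__main__":
    # Sweep the index ratio to see how wide the error bars are.
    for ratio in (0.01, 0.05, 0.25):
        index_tb, shards, instances = estimate_cluster(
            raw_tb=300, index_ratio=ratio, per_instance_tb=0.1, replicas=2
        )
        print(f"ratio={ratio}: index ~{index_tb:.1f} TB, "
              f"{shards} shards, {instances} instances")
```

Sweeping the ratio like this mostly shows how sensitive the instance count
is to a number you can only get by indexing a representative sample of
your own documents.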

On Thu, Mar 28, 2013 at 7:09 PM, hurtlingturtle <> wrote:

> Hi,
> Are there any sizing metrics available for Lucene indexes?  I am unclear on
> how this would scale up.  I am considering what indexing technology to use
> to index many hundreds of terabytes of documents and email content to
> enable searching of that content for keywords and phrases and also ensuring
> that the results are security trimmed.
> My concerns are around the following sizing details...
> Cores
> Servers
> Disks
> Shards
> and anything else you think would be relevant :)
> thanks
> hurtlingturtle
