lucene-java-user mailing list archives

From <>
Subject RE: Lucene 4.0 scalability and performance.
Date Mon, 24 Dec 2012 09:21:04 GMT
Thank you

-----Original Message-----
From: Steve Rowe [] 
Sent: Sunday, December 23, 2012 8:20 PM
Subject: Re: Lucene 4.0 scalability and performance.

Hi Vitaly,

Anything by Tom Burton-West should interest you - he works on the HathiTrust digital library
project <>, which currently indexes 7TB of full-length books,

"Practical Relevance Ranking for 10 Million Books" (paper) INEX 2012, September 2012, Rome,
Italy <>

"HathiTrust Large Scale Search: Scalability meets Usability" (slides) Code4Lib 2012, February
2012, Seattle, Washington <>

"Large-scale Search" (blog)


On Dec 23, 2012, at 6:11 AM, wrote:

> Hi all,
> We are starting to evaluate Lucene 4.0 for use in a production environment.
> This means that we need to index millions of documents with terabytes of content and search in them.
> For now we want to define only one indexed field, containing the content of the documents, with the ability to search for terms and retrieve the term offsets.
> Has anybody already tested Lucene with terabytes of data?
> Does Lucene have any known limitations related to the number of indexed documents or to the size of the indexed documents?
> What about search performance on a huge data set?
> Thanks in advance, Vitaly
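
For readers unfamiliar with the requirement in the quoted question (a single content field that supports term search and returns term offsets), here is a minimal conceptual sketch. It is a toy in-memory inverted index written against the Java standard library only; it is not Lucene's API and says nothing about how Lucene stores postings. The class name OffsetIndex, the Hit type, and the letters-or-digits tokenizer are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy in-memory inverted index; illustrative only, not Lucene's API. */
class OffsetIndex {

    /** One occurrence of a term: document id plus [start, end) character offsets. */
    static final class Hit {
        final int docId;
        final int start;
        final int end;

        Hit(int docId, int start, int end) {
            this.docId = docId;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "doc" + docId + "[" + start + "," + end + ")";
        }
    }

    // Hypothetical tokenizer: each run of letters or digits counts as one term.
    private static final Pattern TOKEN = Pattern.compile("[\\p{L}\\p{N}]+");

    // term -> every occurrence of that term, in the order documents were added
    private final Map<String, List<Hit>> postings = new HashMap<>();

    /** Tokenizes the text and records each term occurrence with its offsets. */
    void add(int docId, String text) {
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            String term = m.group().toLowerCase(Locale.ROOT);
            postings.computeIfAbsent(term, k -> new ArrayList<>())
                    .add(new Hit(docId, m.start(), m.end()));
        }
    }

    /** Returns all occurrences of the term, or an empty list if it is absent. */
    List<Hit> lookup(String term) {
        List<Hit> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.<Hit>emptyList() : hits;
    }

    public static void main(String[] args) {
        OffsetIndex index = new OffsetIndex();
        index.add(1, "Lucene indexes books");
        index.add(2, "Books about Lucene");
        System.out.println(index.lookup("lucene")); // [doc1[0,6), doc2[12,18)]
    }
}
```

Keeping offsets next to each posting is what lets a lookup return match locations without re-reading the document text. The cost is a larger index, which matters at the terabyte scale discussed in this thread.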

To unsubscribe, e-mail:
For additional commands, e-mail:
