lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Peter W." <>
Subject Re: lucene scalability questions
Date Thu, 04 Jan 2007 22:02:00 GMT

My understanding of Lucene is limited, but the issues
seem similar to web server farms in that it comes down to
linear scalability by adding more boxes.

This means separate machines with their own indexes.

Shared filesystems such as NFS work well in smaller environments
but experience problems with heavy load (lost mounts req. reboots).

There's no mysql-like 'replication' with masters using
binary files to update slaves. However, since the index is
file based, you can close Indexwriters and make hot copies or
perform backups for redundancy.

If you know XML, use Solr to post and retrieve documents to and from
your various Lucene indexes. It hides the complexity of remote
object brokering such as RMI.

Solr also allows you to get result sets using JSON so you could
provide distributed Lucene results to browsers as a .js widget.

While not reflecting the latest 2.0 version release the Lucene in Action
book provides good background on combining separate indexes.


Peter W.

On Jan 4, 2007, at 7:51 AM, Mark Mei wrote:

> So this question has two parts:
> 1. How does Lucene scale, exactly? Do we distribute the index to  
> multiple
> servers somehow? Or is it one index, sitting on some sort of a shared
> filesystem, shared by all Lucene servers? If it's the latter, the  
> bottleneck
> will be I/O ... anyway, elaborate on scalability please, and how  
> you set it
> up
> 2. High availability. How would one go about making Lucene redundant?

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message