lucene-java-user mailing list archives

From "Daniel Pfeifer"
Subject A couple of questions regarding load balancing and failover
Date Wed, 30 Nov 2005 15:22:27 GMT

I work for a major Application Service Provider in Europe, and we have
been using Lucene 1.4 very successfully for a couple of months now. We
are very pleased with it overall, but as the load on the application
that uses Lucene increased, we were forced to invest in better hardware
and in redundancy.

Since I am not 100% sure that everything is implemented as it should be,
I would like to ask you all a couple of questions. First, however, I
want to explain what our architecture currently looks like:

We run a very service-oriented architecture, so many of our
applications use JINI and RMI services. We currently have four main
applications that use Lucene, and they connect to our Lucene index via
RMI. We have two Lucene servers, and both access the same index files,
which are placed on a shared drive. These two servers simply expose the
indexes through a RemoteSearchable, and all applications that use
Lucene connect to these RemoteSearchables via RMI.
Also, we have another server which does nothing but update the

Now my questions:

1.) Does Lucene's MultiSearcher implement any kind of automatic failover
and/or load balancing if the two Searchables I pass to the
MultiSearcher constructor point to two different servers but to the very
same index files? That is, if server 1 crashes there is still server 2,
so at least one server should be able to complete the request. Will a
MultiSearcher that uses RemoteSearchables from two servers
automatically detect that Searchable 1 (server 1) is not responding and
then try Searchable 2? If not, what is the recommended way of doing
this? The second part of the question: if both Searchables are
available and working, will the MultiSearcher automatically distribute
requests across them, or is there a risk that we get duplicate hits
because both Searchables expose the same indexes? If it does not
distribute requests, what is the recommended way of implementing load
distribution over several servers?
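
To make the behaviour asked about in question 1 concrete, here is a
minimal sketch of client-side failover combined with round-robin load
distribution. It deliberately uses only the Java standard library:
SearchNode and FailoverSearcher are hypothetical names invented for this
illustration and are not Lucene classes, and the sketch makes no claim
about what MultiSearcher itself does. Each request goes to exactly one
node, so two servers exposing the same index cannot produce duplicate
hits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for one remote search endpoint (not a Lucene type).
interface SearchNode {
    List<String> search(String query) throws Exception;
}

// Sends each request to a single node. The starting node rotates per
// request (load distribution); if a node fails, the next one is tried
// (failover).
final class FailoverSearcher {
    private final List<SearchNode> nodes;
    private final AtomicInteger next = new AtomicInteger();

    FailoverSearcher(List<SearchNode> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one node");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    List<String> search(String query) throws Exception {
        // floorMod keeps the index non-negative even if the counter overflows.
        int start = Math.floorMod(next.getAndIncrement(), nodes.size());
        Exception last = null;
        for (int i = 0; i < nodes.size(); i++) {
            SearchNode node = nodes.get((start + i) % nodes.size());
            try {
                return node.search(query);
            } catch (Exception e) {
                last = e; // remember the failure and try the next node
            }
        }
        throw new Exception("all search nodes failed", last);
    }
}
```

In a real deployment each SearchNode would presumably wrap a call to one
of the RemoteSearchables, and a failed node would probably be skipped
for some time instead of being retried on every request.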

2.) Our index servers, which expose the underlying index as a
RemoteSearchable, each have four dual-core processors. Since this gives
us good multithreading capabilities, I use the ParallelMultiSearcher
instead of the MultiSearcher there. On the client side (the application
that connects to the index RMI server), should I therefore also use a
ParallelMultiSearcher, or is it OK to use the standard MultiSearcher?
And why?

3.) Currently, to increase speed, we load the entire index into memory
(using RAMDirectory rather than FSDirectory). We found that a
RAMDirectory does not update itself when the files in the directory it
was loaded from are updated. I therefore coded a thread that, every 10
minutes, instantiates new RAMDirectories, unbinds the current
RemoteSearchable and then rebinds to the RMI registry with a new
Searchable that uses the new RAMDirectories. This certainly doesn't feel
like a good solution: even though the time during which the RMI service
cannot answer is minimal, there is still a small chance that a client
application tries to search the index at that very moment. Is there a
way to refresh the RAMDirectory without having to create new instances
of all the classes and bind them to the RMI registry again? If so, how?
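
One possible shape for what question 3 is looking for, sketched with the
Java standard library only: a single long-lived object stays bound, and
only the in-memory snapshot behind it is swapped atomically once the
replacement has been fully built. IndexSnapshot and SwappableSearcher
are hypothetical names for this illustration, not Lucene or RMI classes,
and the sketch assumes the old snapshot can simply be dropped (closing
it after in-flight requests finish is left out).

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical stand-in for a searcher over one in-memory copy of the index.
interface IndexSnapshot {
    String search(String query);
}

// Stable front object: clients always talk to this instance, so nothing
// has to be unbound or rebound when the index is reloaded.
final class SwappableSearcher implements IndexSnapshot {
    private final AtomicReference<IndexSnapshot> current;

    SwappableSearcher(IndexSnapshot initial) {
        this.current = new AtomicReference<>(Objects.requireNonNull(initial));
    }

    @Override
    public String search(String query) {
        // Each request uses whichever snapshot is current when it starts.
        return current.get().search(query);
    }

    // Builds the new snapshot completely, then publishes it in one step.
    // Requests that arrive while loading is in progress still see the old one.
    void reload(Supplier<IndexSnapshot> loader) {
        IndexSnapshot fresh = Objects.requireNonNull(loader.get());
        current.set(fresh);
    }
}
```

The 10-minute refresh thread would then only call reload(...) with a
loader that builds the new in-memory index, so there is no window in
which the registry has no Searchable bound.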

I would be enormously thankful if you guys could answer my questions as
our load is increasing daily and we would like to have our Lucene-index
working as smoothly as possible!

Daniel Pfeifer
