lucene-java-user mailing list archives

From Otis Gospodnetic <>
Subject Re: Indexing from multiple applications to a central index.
Date Thu, 09 Jun 2005 20:27:53 GMT
I think your setup is a good fit for a centralized IndexQueueManager
that subscribes to JMS topics, with your distributed servers publishing
the data to be indexed to those topics.  That way you get an easy way to
add more machines to the cluster, you get persistence of not-yet-indexed
data, and you get a queuing mechanism that takes care of the locking
issues, because only one process ever writes to the index.
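
A minimal sketch of the single-writer idea, assuming an in-process
BlockingQueue as a stand-in for the JMS topic and a plain String as a
stand-in for a document.  The class shape, submit() and the indexer
callback are hypothetical names for illustration; in a real setup the
callback would hand each document to the one IndexWriter that owns the
central index.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Many producers submit documents; exactly one thread applies them. */
final class IndexQueueManager implements AutoCloseable {
    // Unique sentinel object; compared by identity, never by content.
    private static final String STOP = new String("__stop__");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread writer;

    /** @param indexer hypothetical callback that writes one document to the index */
    IndexQueueManager(Consumer<String> indexer) {
        writer = new Thread(() -> {
            try {
                while (true) {
                    String doc = queue.take();   // blocks until work arrives
                    if (doc == STOP) {
                        return;                  // everything queued earlier is done
                    }
                    indexer.accept(doc);         // only this thread touches the index
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "index-writer");
        writer.start();
    }

    /** Safe to call from any number of threads; never touches the index. */
    void submit(String doc) {
        queue.add(doc);
    }

    /** Drains what was queued before this call, then stops the writer thread. */
    @Override
    public void close() {
        queue.add(STOP);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because producers only ever enqueue, none of them needs to check or
wait on the index lock.  The sketch keeps the queue in memory, so it
does not give you the persistence that a JMS broker would.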


--- Doug Hughes <> wrote:

> Hello,
> I have a situation where I need to have multiple applications,
> potentially located on different servers, and which have no knowledge
> of each other, indexing into and searching from the same Lucene index.
> I anticipate problems with locks.
> Let's say I have two applications and, at any time, either of them may
> try to index upwards of 1000 documents (or more!).  If, by luck, these
> applications do not attempt to write to the index at the same time then
> things are fine.  However, if both of them try to write to the index at
> the same time, one of them will fail due to the index being locked.
> My first solution to this problem was to have both applications check
> whether the index is locked and sleep until it is unlocked.  The
> problem with this is that if an application is shut down or killed
> while indexing, the index may not be unlocked.  This will block other
> applications from indexing and may cause them to hang.
> Clearly I have a threading problem.  I think I may know a solution to
> this problem and I would appreciate verification of the solution or
> suggestions on approaches.
> I am thinking that I can make all of the applications index into their
> own index, not the central shared index.  Their own index might be an
> FSDirectory or a RAMDirectory.  When done indexing, the applications'
> indexes would be merged with the central index for consumption by all
> applications sharing the index.
> From what I understand, the process of merging indexes takes a lot less
> time than the process of inserting into or deleting from an index.  This
> seems to mean that I'm less "likely" to run into locking issues.  I can
> more safely have each process sleep until the index is unlocked and it
> can gain access to merge its index with the central index.  If these
> applications use their own FSDirectory, I should be able to continue
> working with that FSDirectory in the case of an unclean shutdown and
> should still be able to merge it with the central index.
> Does anyone have any advice to offer on this? 
> Thank you,
> Doug Hughes
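
On the stale-lock worry in the quoted message (a killed application
leaving the index locked): a rough sketch of "sleep until unlocked"
with a timeout, using an operating-system file lock from java.nio.
This is a swapped-in technique for illustration, not Lucene's own
locking.  The lock file path, runWithLock() and the timeout are
hypothetical.  The point is that the OS drops such a lock when the
holding process exits or is killed, so waiters cannot hang forever on
a dead writer.  Behavior on network file systems varies, so treat it
as an assumption to verify there.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class IndexLockSketch {
    /**
     * Runs work while holding an exclusive lock on lockFile.
     * Returns false if the lock could not be obtained before the timeout.
     */
    static boolean runWithLock(Path lockFile, long timeoutMillis, Runnable work) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            while (true) {
                FileLock lock = null;
                try {
                    lock = channel.tryLock();    // null if another process holds it
                } catch (OverlappingFileLockException e) {
                    // Held elsewhere in this same JVM; treat it as busy.
                }
                if (lock != null) {
                    try {
                        work.run();              // e.g. index or merge here
                        return true;
                    } finally {
                        lock.release();
                    }
                }
                if (System.nanoTime() - deadline >= 0) {
                    return false;                // give up instead of hanging
                }
                Thread.sleep(100);               // back off before retrying
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller that gets false back can log the failure and retry later
rather than blocking indefinitely.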
