lucene-java-user mailing list archives

From "Kevin A. Burton" <>
Subject Re: Real time indexing and distribution to lucene on separate boxes (long)
Date Thu, 11 Mar 2004 21:42:12 GMT
Matalon wrote:

>To clarify how option 3 works:
>You have dira where the search is done and dirb where the indexing is
>done. dirb grows when you add new items to it, and at some point you
>swap and dirb becomes dira, but what do you do then?
The Searcher reloads and points to dira...

>Also, how do you write from the indexer to the directory on the search box?
We rsync the content over...
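
A minimal sketch of the swap described above, assuming rsync has already filled dirb on the search box. The class name IndexDirSwap, the ".old" suffix and the reloadSearcher callback are made up for illustration; they are not Lucene API and not necessarily how our setup does it. Only standard java.nio.file calls are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper: promotes the staging index (dirb) to the live one (dira).
final class IndexDirSwap {

    // live = dira, staging = dirb. reloadSearcher is whatever makes the
    // searcher reopen the index at the live path (an assumption here).
    static void promote(Path live, Path staging, Runnable reloadSearcher) {
        Path retired = live.resolveSibling(live.getFileName() + ".old");
        try {
            // Drop the copy retired by the previous swap, if any.
            deleteRecursively(retired);
            // Move the current index aside. Between the two moves the live
            // path briefly does not exist, so nothing should open it then.
            if (Files.exists(live)) {
                Files.move(live, retired, StandardCopyOption.ATOMIC_MOVE);
            }
            // dirb becomes dira. Both paths must be on the same filesystem.
            Files.move(staging, live, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The searcher reloads and points to dira.
        reloadSearcher.run();
        // The retired copy stays on disk until the next swap, so a searcher
        // that is still reading it is not disturbed.
    }

    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order deletes children before their parent directory.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

After promote() returns, dirb no longer exists, so the next rsync run has to recreate it.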

>2. The index is NFS mounted. The indexer keeps writing to the index, and
>at defined times, creates an NFS snapshot of the index. It then creates
>an entry in a db to let the searcher know that a new snapshot has been
>created. The searcher checks the db once a minute to see if there's a new
>snapshot. If there is one, it opens the index in the new snapshot and
>swaps it for the old one. The code to do this is synchronized.
>The nice thing about this solution is that you don't have just one copy
>of the index and don't do any copying. But you need to use NFS and
Well... right now I'm thinking that if I can do a merge on the box with
< 200M per commit, it won't be too much of a burden on the
searchers as long as it happens at regular intervals.
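
For reference, the snapshot polling quoted above (option 2) could look roughly like the sketch below. SnapshotWatcher, latestSnapshotId and openSearcher are hypothetical names: the first callback stands in for the db lookup and the second for opening a Lucene searcher on the snapshot path. Neither is real Lucene API.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical holder for the current searcher. S is whatever searcher type
// the application uses.
final class SnapshotWatcher<S> {
    private final Supplier<String> latestSnapshotId;  // assumed: reads the db entry
    private final Function<String, S> openSearcher;   // assumed: opens the snapshot
    private String currentId;
    private S current;

    SnapshotWatcher(Supplier<String> latestSnapshotId,
                    Function<String, S> openSearcher) {
        this.latestSnapshotId = latestSnapshotId;
        this.openSearcher = openSearcher;
    }

    // Meant to be called on a timer (once a minute in the quoted setup).
    // Returns true if a new snapshot was opened and swapped in.
    synchronized boolean poll() {
        String latest = latestSnapshotId.get();
        if (latest == null || latest.equals(currentId)) {
            return false;  // nothing new since the last check
        }
        // Open the new snapshot first, then replace the old searcher.
        // Callers of searcher() wait while this runs because both methods
        // are synchronized. Closing the old searcher is left out here.
        S fresh = openSearcher.apply(latest);
        current = fresh;
        currentId = latest;
        return true;
    }

    // Returns the current searcher, or null before the first snapshot.
    synchronized S searcher() {
        return current;
    }
}
```

The once-a-minute check could be driven by a ScheduledExecutorService, for example scheduleWithFixedDelay(watcher::poll, 0, 1, TimeUnit.MINUTES).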

I still have to test this, though, to make sure I can keep running
queries and an index merge on the same box, with the merge
happening in a separate process.

Going to send off an email about this in a minute :)



