lucene-java-user mailing list archives

From "Michael Stoppelman" <>
Subject indexing issue
Date Sat, 29 Nov 2008 17:45:38 GMT
Hi all,

I've got an indexing issue I think other folks might be interested in
hearing about and I wanted to get feedback before I went ahead and
implemented a new method.

Currently, we update indices by sending individual delete/add document
requests to each of our search boxes. Each box is doing about 20-30 qps
while this is happening. The problem I'm seeing is that when segments of
the index are merged (our merge factor is set to 5; honestly, I don't know
that much about segment merging), an old, heavily used segment of the
index is lost from the disk cache. Most of the search requests to that box
then get prohibitively slow (10-80+ secs), and I see the pg/in and pg/out
stats spike in sar. I'm planning on implementing a method similar to the
Solr model, using the rsync method that Doug Cutting outlined a long time
ago on this list and forcing the new files into the disk cache using
fadvise.
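
To make the warming step concrete, here is a rough sketch of what I have in
mind. Note that it is not fadvise itself: as far as I know the standard Java
library has no call for posix_fadvise, so this sketch swaps in a plain
sequential read of every file in the new index directory, which should pull
the pages into the OS disk cache as a side effect. The class name
SegmentWarmer and the method warm are hypothetical names I made up, and the
1 MB buffer size is an arbitrary assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene or Solr.
final class SegmentWarmer {

    // Reads every regular file directly under dir once, start to finish,
    // and returns the total number of bytes read. The data itself is
    // discarded; the reads are only there to touch the pages.
    static long warm(Path dir) throws IOException {
        byte[] buf = new byte[1 << 20]; // 1 MB read buffer (assumed size)
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                if (!Files.isRegularFile(f)) {
                    continue; // skip subdirectories and special files
                }
                try (InputStream in = Files.newInputStream(f)) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        total += n;
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SegmentWarmer /path/to/new/index
        System.out.println("bytes read: " + warm(Paths.get(args[0])));
    }
}
```

The idea would be to run this against the freshly rsynced index directory
before the searcher is switched over to it. Whether the pages actually stay
cached depends on how much free memory the box has, so I'd treat it as a
best-effort hint rather than a guarantee.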

Is there another strategy here? Could I create a merge policy that forces
new segments into the disk cache before Lucene nukes the old ones?

