lucene-java-user mailing list archives

From "Kevin A. Burton" <>
Subject Re: Lucene Search has poor cpu utilization on a 4-CPU machine
Date Tue, 13 Jul 2004 01:25:39 GMT
Doug Cutting wrote:

>> I noticed that the class org.apache.lucene.index.FieldInfos uses private
>> class members Vector byNumber and Hashtable byName, both of which are
>> synchronized objects. By changing the Vector byNumber to ArrayList byNumber
>> I was able to get 110% improvement in performance (number of searches per
>> second).
> That's impressive! Good job finding a bottleneck!

Wow... that's awesome.

All of our machines are dual Xeons with Hyper-Threading on kernel 2.6, so I
imagine we'd see an improvement in this situation too.

I wonder if we could break this out into a patch for legacy Lucene 
users. I'd like to see the stack trace too.

We're using a lot of synchronized code (Hashtable, Vector, etc.), so I'm
willing to bet this is happening in other places.
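
One way to look for this kind of contention elsewhere is to list the threads
that are blocked waiting for a monitor, along with their stack traces. The
following is only a rough sketch, assuming Java 5 or later
(Thread.getAllStackTraces and Thread.State); the class name
BlockedThreadReport and the thread name "waiter-sketch" are made up for
illustration and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports threads that are BLOCKED on a monitor. */
class BlockedThreadReport {

    /** Returns one entry per blocked thread: its name, then its stack frames. */
    static List<String> blockedThreads() {
        List<String> report = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            // BLOCKED means the thread is waiting to enter a synchronized region.
            if (t.getState() != Thread.State.BLOCKED) {
                continue;
            }
            StringBuilder sb = new StringBuilder(t.getName()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            report.add(sb.toString());
        }
        return report;
    }

    public static void main(String[] args) {
        final Object lock = new Object();
        // This thread blocks because main holds the lock when it starts.
        Thread waiter = new Thread(() -> {
            synchronized (lock) { /* nothing to do once the lock is acquired */ }
        }, "waiter-sketch");
        waiter.setDaemon(true);
        synchronized (lock) {
            waiter.start();
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.yield();
            }
            for (String entry : blockedThreads()) {
                System.out.print(entry);
            }
        }
    }
}
```

If many search threads repeatedly show up blocked inside Vector or Hashtable
methods, that would be a hint that the same kind of bottleneck exists there.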

>> My question is: do the fields byNumber and byName have to be synchronized
>> and what can happen if I'll change them to be ArrayList and HashMap which
>> are not synchronized ? Can this corrupt the index or the integrity of the
>> results?
> I think that is a safe change. FieldInfos is only modified by 
> DocumentWriter and SegmentMerger, and there is no possibility of other 
> threads accessing those instances. Please submit a patch to the 
> developer mailing list.
That would be great!
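
For anyone following along, the change under discussion amounts to something
like the sketch below. It is not the actual Lucene source: FieldTableSketch
and its methods are hypothetical stand-ins that only mirror the byNumber and
byName members mentioned above. It also relies on the assumption stated in the
quoted reply, namely that an instance is never modified while another thread
can reach it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical stand-in for a FieldInfos-like table; not the Lucene class.
 * Uses unsynchronized collections, so it is only safe if no thread modifies
 * an instance while other threads can access it.
 */
class FieldTableSketch {
    // Was a Vector in the original description: every get() took a lock.
    private final List<String> byNumber = new ArrayList<>();
    // Was a Hashtable in the original description: same locking cost.
    private final Map<String, Integer> byName = new HashMap<>();

    /** Adds a field if absent and returns its number. */
    int add(String name) {
        Integer existing = byName.get(name);
        if (existing != null) {
            return existing;
        }
        int number = byNumber.size();
        byNumber.add(name);
        byName.put(name, number);
        return number;
    }

    /** Returns the field number, or -1 if the field is unknown. */
    int fieldNumber(String name) {
        Integer number = byName.get(name);
        return number == null ? -1 : number;
    }

    /** Returns the field name for a number previously returned by add(). */
    String fieldName(int number) {
        return byNumber.get(number);
    }

    public static void main(String[] args) {
        FieldTableSketch fields = new FieldTableSketch();
        fields.add("title");
        fields.add("body");
        System.out.println(fields.fieldNumber("body"));
        System.out.println(fields.fieldName(0));
    }
}
```

Lookups no longer acquire a monitor, which is where the reported gain on
multi-CPU machines would come from. If the single-writer assumption does not
hold in some code path, the unsynchronized version would not be safe there.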



