lucene-java-user mailing list archives

From Oren Shir
Subject Re: Throughput doesn't increase when using more concurrent threads
Date Tue, 22 Nov 2005 14:33:27 GMT

There are two synchronization points: on the stream and on the reader. Using
different FSDirectory and IndexReader instances should solve this. I'll let
you know once I code it. Right now I'm checking whether making my Documents
store less data will move the bottleneck somewhere else.

Thanks again,
Oren Shir

On 11/21/05, Doug Cutting wrote:
> Jay Booth wrote:
> > I had a similar problem with threading, the problem turned out to be that
> > in the back end of the FSDirectory class I believe it was, there was a
> > synchronized block on the actual RandomAccessFile resource when reading a
> > block of data from it... high-concurrency situations caused threads to
> > stack up in front of this synchronized block and our CPU time wound up
> > being spent thrashing between blocked threads instead of doing anything
> > useful.
>
> This is correct. In Lucene, multiple streams per file are created by
> cloning, and all clones of an FSDirectory input stream share a
> RandomAccessFile and must synchronize input from it. MMapDirectory does
> not have this limitation. If your indexes are less than a few GB or you
> are using 64-bit hardware, then MMapDirectory should work well for you.
> Otherwise it would be simple to write an nio-based Directory that does
> not use mmap that is also unsynchronized. Such a contribution would be
> welcome.
>
> > Making multiple IndexSearchers and FSDirectories didn't help because in
> > the back end, lucene consults a singleton HashMap of some kind (don't
> > remember implementation) that maintained a single FSDirectory for any
> > given index being accessed from the JVM... multiple calls to
> > FSDirectory.getDirectory actually return the same FSDirectory object with
> > synchronization at the same point.
>
> This does not make sense to me. FSDirectory does keep a cache of
> FSDirectory instances, but i/o should not be synchronized on these. One
> should be able to open multiple input streams on the same file from an
> FSDirectory. But this would not be a great solution, since file handle
> limits would soon become a problem.
> Doug
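
To illustrate the unsynchronized nio-based read that Doug describes, here is a
minimal sketch using plain java.nio, not Lucene code. It assumes a positional
FileChannel.read(ByteBuffer, long), which takes the file offset as an argument
and does not move a shared file pointer, so callers need no seek-then-read
lock around one file handle. The class name PositionalReadSketch and the
helper readAt are hypothetical names chosen for this example. Whether
positional reads actually run in parallel depends on the platform and the JDK.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical illustration, not part of Lucene.
final class PositionalReadSketch {

    // Reads exactly `length` bytes starting at `offset`.
    // The positional read leaves the channel's own position untouched,
    // so several threads can call this on one channel without a lock.
    static byte[] readAt(FileChannel channel, long offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        long pos = offset;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) {
                throw new EOFException("read past end of file at position " + pos);
            }
            pos += n;
        }
        return buf.array();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("positional-read", ".bin");
        try {
            byte[] data = new byte[4096];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) (i % 251);
            }
            Files.write(file, data);

            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                Thread[] workers = new Thread[4];
                boolean[] ok = new boolean[workers.length];
                for (int t = 0; t < workers.length; t++) {
                    final int id = t;
                    workers[t] = new Thread(() -> {
                        try {
                            // Each thread reads its own 1 KB slice of the shared channel.
                            byte[] chunk = readAt(channel, id * 1024L, 1024);
                            byte[] expected = Arrays.copyOfRange(data, id * 1024, (id + 1) * 1024);
                            ok[id] = Arrays.equals(chunk, expected);
                        } catch (IOException e) {
                            ok[id] = false;
                        }
                    });
                    workers[t].start();
                }
                for (Thread w : workers) {
                    w.join();
                }
                for (boolean b : ok) {
                    if (!b) {
                        throw new AssertionError("a worker read unexpected bytes");
                    }
                }
                System.out.println("all threads read their slices correctly");
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A Directory built this way would still share one file handle per file across
clones, which avoids the file handle limits mentioned above, while dropping
the lock that a shared RandomAccessFile needs around seek and read.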