lucene-dev mailing list archives

From Doron Cohen <DOR...@il.ibm.com>
Subject Re: potential indexing performance improvement for compound index - cut IO - have more files though
Date Sun, 17 Dec 2006 07:20:33 GMT
Doug Cutting wrote:
> Doug Cutting wrote:
> > Yes.  On 32-bit systems with indexes larger than 1GB or so, memory
> > mapping is impractical, so synchronization is required around shared
> > file handles (using Java's classic i/o APIs, w/o pread).  The
> > non-compound format, with more files, has fewer synchronization
> > bottlenecks.  One could of course achieve the same improvements in other
> > ways, e.g., by pooling multiple IndexReaders per index, but in straight
> > A-to-B comparisons, folks see better throughput with non-compound
> > indexes for multi-threaded applications.
>
> On second thought, a good fix for this might be to simply convert
> FSDirectory to use nio's pread support, eliminating file handle
> synchronization even when mmap isn't used.

I compared the two formats on a small index (100,000 docs of the Reuters
collection, index size 170MB) and saw no evident search performance
advantage for non-compound. For 300 parallel searches that also traversed
the matching docs, compound was faster. But this is a small index, not in
the 1GB range, and search was fast in both cases.

I think it would make sense to first verify the advantage of nio over io
for this kind of multi-threaded reading with a synthetic test.
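
A minimal sketch of what such a synthetic test could look like, written
against a current JDK. Everything in it is an assumption made for
illustration: the class name SharedReadBench, the 64MB file size, 16
threads, and 1KB reads are arbitrary, and none of this is Lucene code. One
reader locks a shared RandomAccessFile around seek+read; the other uses
FileChannel's positional read, which does not move the channel position and
so is called here without a lock.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

class SharedReadBench {

    /** Fills dst with the bytes starting at pos; must be safe to share across threads. */
    interface Reader {
        void read(long pos, byte[] dst) throws IOException;
    }

    /** Classic java.io: seek and read are two calls, so the shared handle is locked. */
    static Reader classic(RandomAccessFile raf) {
        return (pos, dst) -> {
            synchronized (raf) {
                raf.seek(pos);
                raf.readFully(dst);
            }
        };
    }

    /** java.nio positional read: the channel position is untouched, so no lock is taken. */
    static Reader positional(FileChannel ch) {
        return (pos, dst) -> {
            ByteBuffer buf = ByteBuffer.wrap(dst);
            long p = pos;
            while (buf.hasRemaining()) {
                int n = ch.read(buf, p);
                if (n < 0) throw new EOFException();
                p += n;
            }
        };
    }

    /** Starts `threads` threads, each doing `reads` random block reads; returns elapsed nanos. */
    static long run(Reader r, long fileSize, int threads, int reads, int block) throws Exception {
        Thread[] ts = new Thread[threads];
        Throwable[] failure = new Throwable[1];
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            final int seed = t;
            ts[t] = new Thread(() -> {
                Random rnd = new Random(seed);
                byte[] dst = new byte[block];
                try {
                    for (int i = 0; i < reads; i++) {
                        long pos = (long) (rnd.nextDouble() * (fileSize - block));
                        r.read(pos, dst);
                    }
                } catch (Throwable e) {
                    failure[0] = e;
                }
            });
            ts[t].start();
        }
        for (Thread t : ts) t.join();
        if (failure[0] != null) throw new RuntimeException(failure[0]);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        long size = 64L * 1024 * 1024; // arbitrary; small enough to sit in the OS cache
        Path p = Files.createTempFile("shared-read", ".bin");
        try {
            byte[] chunk = new byte[1 << 20];
            new Random(42).nextBytes(chunk);
            try (RandomAccessFile out = new RandomAccessFile(p.toFile(), "rw")) {
                for (long w = 0; w < size; w += chunk.length) out.write(chunk);
            }
            try (RandomAccessFile raf = new RandomAccessFile(p.toFile(), "r")) {
                long io = run(classic(raf), size, 16, 20_000, 1024);
                long nio = run(positional(raf.getChannel()), size, 16, 20_000, 1024);
                System.out.printf("io (synchronized): %d ms%n", io / 1_000_000);
                System.out.printf("nio (positional):  %d ms%n", nio / 1_000_000);
            }
        } finally {
            Files.deleteIfExists(p);
        }
    }
}
```

With a file this small the reads are served from the OS cache, so the
numbers mostly reflect lock contention rather than disk behavior. To
approximate the 1GB case, the file would need to be larger than available
cache, and the two runs should be repeated in both orders.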

Also, if nio proves to be faster in this scenario, it might make sense to
keep the current FSDirectory and just add an FSDirectoryNio implementation.

