lucene-dev mailing list archives

From DM Smith <>
Subject Re: NioFile cache performance
Date Fri, 09 Dec 2005 17:21:12 GMT
Robert Engels wrote:

>As stated in a previous email - good idea.
>All of the code and testcases were attached to the original email.
>The testcases were the answer to a request for such (at least a month ago if
>not longer).
I am sorry if I gave you the wrong impression.

I was merely suggesting a formalization of the process and that there be 
documentation on the Lucene website that outlines how performance tests 
and data should be provided, and how people can participate and provide 
their results.

I have seen this issue come up several times (perhaps the following is 
an oversimplification):
Someone will suggest a performance enhancement and perhaps supply the 
code. Then there will be a general discussion about the merits of the 
change and the validity of the results, with questions about the factors 
involved and statements regarding how architectures widely differ and 
the outcomes can be significantly different. If enough "voters" like the 
change, then it is committed.

Should there be a representative set of architectures at which 
performance tests should be targeted? (For example, I have written an 
application that uses Lucene to index and search Bibles, and the minimum 
hardware requirement is a Win98 laptop, which many of our users have.)
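
One way to make such reports comparable would be to attach basic
environment details to every result. The following is only a minimal
sketch: the class name EnvReport and the output format are hypothetical
and not part of Lucene, and it uses only standard JDK system properties.
Details such as disk type, controller, and I/O scheduler are not
available this way and would have to be added by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: gathers environment details
// that could accompany a posted performance result.
public class EnvReport {

    // Collects OS, JVM, and basic hardware facts in a stable order.
    static Map<String, String> collect() {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("os.name", System.getProperty("os.name"));
        // On Linux this typically reports the kernel release.
        env.put("os.version", System.getProperty("os.version"));
        env.put("os.arch", System.getProperty("os.arch"));
        env.put("java.version", System.getProperty("java.version"));
        env.put("java.vm.name", System.getProperty("java.vm.name"));

        Runtime rt = Runtime.getRuntime();
        env.put("processors", Integer.toString(rt.availableProcessors()));
        env.put("max.heap.mb", Long.toString(rt.maxMemory() / (1024 * 1024)));
        return env;
    }

    public static void main(String[] args) {
        // Print one "key = value" line per entry, ready to paste into a report.
        for (Map.Entry<String, String> e : collect().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Running the class prints one line per property, which could be pasted
at the top of a posted result alongside the test code and data.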

>-----Original Message-----
>From: DM Smith []
>Sent: Friday, December 09, 2005 7:07 AM
>Subject: Re: NioFile cache performance
>John Haxby wrote:
>>Robert Engels wrote:
>>>Using a 4 MB file (so I could "guarantee" the disk data would be in
>>>the OS cache as well), the test shows the following results.
>>Which OS?   If it's Linux, what kernel version and distro?   What
>>hardware (disk type, controller etc).
>>It's important to know: I/O (and caching) is very different between
>>Linux 2.4 and 2.6.   The choice of I/O scheduler can also make a
>>significant difference on 2.6, depending on the workload.   The type
>>of disk and its controller is also important -- and when you get
>>really picky, the mobo model number.
>>I don't dispute your finding for a second, but it would be good to run
>>the same test on other platforms to get comparative data: not least
>>because you can get the kind of I/O time improvement you're seeing on
>>some workloads on different versions of the Linux kernel.
>I think that the results were informative from a comparative basis on a
>single machine. It compared different techniques and showed their
>relative performance on that machine.
>I also agree that the architecture of the machine can play an important
>part in how code performs. I wrote a piece of software that ran well on
>a 4-way, massive RAID configuration with gobs of RAM, only to have it
>re-targeted to a 1-way, small-RAM box, where it had to be rewritten to
>run at all.
>Perhaps, it would be good to establish guidelines for reporting
>performance, including the posting of test data and test code.
>This may encourage others to download the data and code, perform the
>test and report the results.
