lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Spencer <David.Spen...@micromuse.com>
Subject Re: lucene performance question
Date Mon, 03 Mar 2003 18:26:56 GMT
Is it possible that there's some combo of:
- the index of your data set being small relative to the Solaris disk 
cache/RAM
- stringA being rare

such that it would explain some of your results?


Harry Foxwell wrote:

> I have a project for which I want to characterize Lucene query 
> performance
> on different size archives of my XML files.  I have created archives
> and indices of 1000, 2000, 4000, 8000, and 16000 XML files (average
> file size about 10K) generated from
> my DTD and containing mostly random string content in the simple
> elements.  I run multiple tests with different random content in
> each in the archive, timing each of three diffenent queries:
>
>   query 1: Field1:stringA
>   query 2: Field1:stringA Field2:stringB
>   query 3: Field1:stringA AND Field2:stringB
>
> the time to complete query 1 increases with archive size, but the
> subsequent query 2 and query 3 times are ALL about the same
> (generally less than 1 sec, on a Sun Ultra 60 with 2 450 MHz
> processors & 512 MB memory, running Solaris 9, Java 1.4,
> Lucene 1.2) regardless of archive size.
>
> I expected the time to complete query 2 and 3 to also increase
> with archive size, but as I said it remained constant.  What
> is Lucene doing (caching?) to make this happen?
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message