lucene-java-user mailing list archives

From "Dave Kor" <dave....@nexusedge.com>
Subject RE: lucene performance question
Date Mon, 03 Mar 2003 07:47:23 GMT
The first query is always slow because its time includes loading the
index. Index loading time is a function of archive size: the larger the
archive, the longer the load. Search time, however, depends more on the
number of unique terms than on the number of files. If your archive
contains only 100 unique terms, search time won't vary much as the
archive grows from 1000 to 16000 files.
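
To separate the one-time load cost from the actual search cost in your
measurements, you could time the first run on its own and average the
later runs. Below is a minimal sketch of that idea using only the Java
standard library. ColdWarmTimer, Timing, and measure are made-up names for
illustration, not part of Lucene, and the query in main is a stand-in
that simulates a one-time load delay; you would replace it with your
real search call.

```java
public class ColdWarmTimer {

    /** Holds the first (cold) run time and the mean of the later (warm) runs. */
    public static final class Timing {
        public final long coldNanos;
        public final double warmMeanNanos;

        public Timing(long coldNanos, double warmMeanNanos) {
            this.coldNanos = coldNanos;
            this.warmMeanNanos = warmMeanNanos;
        }
    }

    /**
     * Runs the query once and records that time as the cold run, then runs
     * it warmRuns more times and averages those timings.
     */
    public static Timing measure(Runnable query, int warmRuns) {
        if (warmRuns < 1) {
            throw new IllegalArgumentException("warmRuns must be at least 1");
        }

        // Cold run: includes any one-time setup the query triggers.
        long start = System.nanoTime();
        query.run();
        long cold = System.nanoTime() - start;

        // Warm runs: setup has already happened, so only the query is timed.
        long total = 0;
        for (int i = 0; i < warmRuns; i++) {
            start = System.nanoTime();
            query.run();
            total += System.nanoTime() - start;
        }
        return new Timing(cold, (double) total / warmRuns);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a search: the first call pays a simulated
        // 50 ms "index load"; later calls skip it.
        final boolean[] loaded = {false};
        Runnable fakeQuery = () -> {
            if (!loaded[0]) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                loaded[0] = true;
            }
        };

        Timing t = measure(fakeQuery, 10);
        System.out.println("cold run (ms):  " + t.coldNanos / 1_000_000.0);
        System.out.println("warm mean (ms): " + t.warmMeanNanos / 1_000_000.0);
    }
}
```

Reporting the cold and warm numbers separately for each archive size
should show whether the growth you see is in the load or in the search.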


Dave Kor Kian Wei
Consultant
Product Engineering
NexusEdge Technologies Pte. Ltd.
6 Aljunied Ave 3, #01-02 (Level 4)
Singapore 389932
Tel : (+65)848-2552
Fax : (+65)747-4536
Web : www.nexusedge.com

> -----Original Message-----
> From: Harry Foxwell [mailto:hfoxwell@cox.net]
> Sent: Sunday, March 02, 2003 10:49 AM
> To: Lucene Users List
> Subject: lucene performance question
>
>
> I have a project for which I want to characterize Lucene query performance
> on different size archives of my XML files.  I have created archives
> and indices of 1000, 2000, 4000, 8000, and 16000 XML files (average
> file size about 10K) generated from
> my DTD and containing mostly random string content in the simple
> elements.  I run multiple tests with different random content in
> each archive, timing each of three different queries:
>
>    query 1: Field1:stringA
>    query 2: Field1:stringA Field2:stringB
>    query 3: Field1:stringA AND Field2:stringB
>
> The time to complete query 1 increases with archive size, but the
> subsequent query 2 and query 3 times are ALL about the same
> (generally less than 1 sec, on a Sun Ultra 60 with two 450 MHz
> processors & 512 MB memory, running Solaris 9, Java 1.4,
> Lucene 1.2) regardless of archive size.
>
> I expected the time to complete query 2 and 3 to also increase
> with archive size, but as I said it remained constant.  What
> is Lucene doing (caching?) to make this happen?
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org