Subject: Speeding up looping over Hits
Date: Thu, 22 Mar 2007 06:58:54 -0700
From: "Andreas Guther"
Reply-To: java-user@lucene.apache.org
Message-ID: <3FB08D6A21B3EC4D8749EF6D9E626278010466E3@WDCCPMAIL01.markettools.com>

Hi,

While looking into performance improvements for our search feature, I noticed a significant difference in Document access time while looping over Hits. I wrote a test application that searches for a list of search terms and then, for each returned Hits object, loops twice over every single hits.doc(i):

    for (int i = 0; i < numberOfDocs; i++) { doc = hits.doc(i); }

I am seeing differences like the following:

    Found 16,215 hits for 'Water or Wine' in 219 ms
    Processed 16,215 docs in 53,141 ms; per single doc 3.2773 ms
    Processed 16,215 docs in 2,032 ms; per single doc 0.1253 ms

Interestingly, if I run the same test application a second time in my IDE, the difference between the first and the second loop is very small.

I have no explanation for this difference, but it is a huge problem for us: I need to extract a small set of pieces of information from each document, and the first pass over the Hits simply takes too much time. I could not find any indication of external caching of Hits. I am running my tests within Eclipse with the memory settings -Xms766M -Xmx1024M.

What explains the difference in access speed for the same search results? Is there a way to speed up looping over the Hits data structure?

Andreas
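
For reference, the two-pass measurement described in the message could be sketched roughly as follows. This is only an illustration under stated assumptions, not the original test application: the names TwoPassTiming, DocFetcher, timePass and perDocMillis are hypothetical, and the fetcher here returns a dummy string so that the sketch runs without a Lucene index. In the real test the fetcher would presumably wrap hits.doc(i).

```java
import java.io.IOException;

class TwoPassTiming {

    /** Hypothetical stand-in for hits.doc(i); this interface is not part of Lucene. */
    interface DocFetcher<T> {
        T fetch(int index) throws IOException;
    }

    /** Fetches every document once and returns the elapsed wall-clock time in nanoseconds. */
    static <T> long timePass(DocFetcher<T> fetcher, int numberOfDocs) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < numberOfDocs; i++) {
            T doc = fetcher.fetch(i);
            if (doc == null) {
                throw new IllegalStateException("no document at index " + i);
            }
        }
        return System.nanoTime() - start;
    }

    /** Mean time per document in milliseconds for one pass. */
    static double perDocMillis(long elapsedNanos, int numberOfDocs) {
        return elapsedNanos / 1_000_000.0 / numberOfDocs;
    }

    public static void main(String[] args) throws IOException {
        int numberOfDocs = 16_215;

        // Assumption: a dummy fetcher; replace with i -> hits.doc(i) against a real index.
        DocFetcher<String> fetcher = i -> "doc-" + i;

        // Same loop twice over the same results, timed separately.
        long firstPass = timePass(fetcher, numberOfDocs);
        long secondPass = timePass(fetcher, numberOfDocs);

        System.out.println("First pass:  " + perDocMillis(firstPass, numberOfDocs) + " ms per doc");
        System.out.println("Second pass: " + perDocMillis(secondPass, numberOfDocs) + " ms per doc");
    }
}
```

Timing each pass separately with System.nanoTime() keeps the two numbers comparable. Applied to the figures quoted in the message, perDocMillis gives about 3.2773 ms for the 53,141 ms pass and about 0.1253 ms for the 2,032 ms pass.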