Subject: SV: Lucene hits.length()
Date: Wed, 9 Aug 2006 15:43:28 +0200
From: "Marcus Falck"
Reply-To: java-user@lucene.apache.org

Still worried =)

You see, hits.length() isn't updated correctly when I create a new searcher.
The correct update only occurs at the merges. =/

-----Original message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: 9 August 2006 15:34
To: java-user@lucene.apache.org
Subject: Re: Lucene hits.length()

Then you won't see anything added to your index between times. Does this
identify your problem or are you still worried?

Erick

On 8/9/06, Marcus Falck wrote:
>
> I'm opening a new searcher every third minute.
>
> -----Original message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: 8 August 2006 18:58
> To: java-user@lucene.apache.org
> Subject: Re: Lucene hits.length()
>
> I'll take a stab at it.... When are you opening/closing your searcher? When
> you open a searcher, you get a snapshot of the index at that instant, and
> subsequent modifications aren't visible until you open a new searcher (at
> least I think I've got this right).
>
> And I'm sure this also interacts with the writer merge settings
> "interestingly".
>
> Personally, I'd worry about this a lot more if it happened after I'd closed
> my writer and opened a new reader ...
> Of course, my app has an index that is updated rarely (every two weeks), so
> I haven't dug into too many details in this area...
>
> Best
> Erick
>
> On 8/8/06, Marcus Falck wrote:
> >
> > I have noticed some strange behavior when searching my Lucene index.
> >
> > I'm adding 500,000 docs to an index.
> >
> > MergeFactor = 10
> > MinMerge = 5000
> >
> > When 49999 docs have been added (just before the first 10 * 5000 merge),
> > hits.length() reports around 1000 hits for a keyword (which, by the way,
> > is around the same count as with 5000 docs added). After the 10 * 5000
> > merge, hits.length() returns around 8000 hits, which seems a lot more
> > reasonable. Since I'm adding content in date order (oldest first), I have
> > also tried sorting the hits (newest date first) and displaying the top
> > 10 hits.
> >
> > According to that output, it seems that the documents are added
> > correctly.
> >
> > I'm using a MultiSearcher on top of a RAMDir and an FSDir. Using
> > Lucene 1.4.3.
> >
> > Does anybody have any idea why the hit count is so misleading?
> >
> > Regards
> > Marcus
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
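
The snapshot behavior Erick describes (a searcher sees the index as it was when the searcher was opened) can be illustrated with a toy model in plain Java. This is only a sketch and not Lucene code: ToyIndex, ToySearcher, and flushThreshold are hypothetical names invented for the illustration. The buffering step is an assumption, loosely modeled on the MinMerge setting mentioned in the thread; the thread does not establish that this is what happens in Marcus's setup.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of snapshot-style searching. Hypothetical classes, NOT Lucene's API. */
public class SnapshotDemo {

    /** Stands in for an index whose documents become searchable only once flushed. */
    public static class ToyIndex {
        private final List<String> flushed = new ArrayList<>();
        private final List<String> buffered = new ArrayList<>();
        // Assumption for illustration: loosely analogous to the thread's MinMerge setting.
        private final int flushThreshold;

        public ToyIndex(int flushThreshold) {
            this.flushThreshold = flushThreshold;
        }

        /** Buffers a document and flushes once the buffer reaches the threshold. */
        public void add(String doc) {
            buffered.add(doc);
            if (buffered.size() >= flushThreshold) {
                flush();
            }
        }

        /** Moves all buffered documents into the searchable set. */
        public void flush() {
            flushed.addAll(buffered);
            buffered.clear();
        }

        /** Opens a searcher over a copy of the currently flushed documents. */
        public ToySearcher openSearcher() {
            return new ToySearcher(new ArrayList<>(flushed));
        }
    }

    /** Searches a fixed snapshot; later changes to the index are never visible here. */
    public static class ToySearcher {
        private final List<String> snapshot;

        ToySearcher(List<String> snapshot) {
            this.snapshot = snapshot;
        }

        /** Counts the documents in the snapshot that contain the keyword. */
        public int hitCount(String keyword) {
            int count = 0;
            for (String doc : snapshot) {
                if (doc.contains(keyword)) {
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex(5);
        for (int i = 0; i < 4; i++) {
            index.add("lucene doc " + i);   // four documents, still buffered
        }
        ToySearcher early = index.openSearcher();
        index.add("lucene doc 4");          // fifth add reaches the threshold and flushes
        ToySearcher late = index.openSearcher();

        System.out.println(early.hitCount("lucene")); // prints 0
        System.out.println(late.hitCount("lucene"));  // prints 5
    }
}
```

In this model the early searcher keeps reporting 0 hits even after the flush, because it holds a copy taken at open time, while a searcher opened after the flush reports all 5. If something similar applies to the real index, a freshly opened searcher would only see documents that had already been written out when it was opened, which would match hit counts that jump at merge time.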