From: Ahmet Arslan
To: "java-user@lucene.apache.org"
Date: Sat, 30 May 2015 14:06:47 +0000 (UTC)
Message-ID: <1622199027.1689311.1432994807197.JavaMail.yahoo@mail.yahoo.com>
Subject: Re: IllegalArgumentException: docID must be >= 0 and < maxDoc=48736112 (got docID=2147483647)

Hi Robert,

Great info. I prevented corner cases in similarities that were producing NaN or Negative Infinity scores. All is well with -ea now.

Thanks,
Ahmet

On Friday, May 29, 2015 3:32 PM, Robert Muir wrote:

Hi Ahmet,

It's due to the use of sentinel values by your collector in its priority queue by default. TopScoreDocCollector warns about this, and if you turn on assertions (-ea) you will hit them in your tests:

 * NOTE: The values {@link Float#NaN} and
 * {@link Float#NEGATIVE_INFINITY} are not valid scores. This
 * collector will not properly collect hits with such
 * scores.
 */
public abstract class TopScoreDocCollector extends TopDocsCollector {

I don't think a fix is simple. I only know of the following ideas:

* somehow sneaky use of NaN as sentinels instead of -Inf, to allow
  -Inf to be used. It seems a bit scary!
* remove the sentinels optimization. I am not sure if collectors
  could easily have the same performance without them.

To me, such scores seem always undesirable and only bugs, and the current assertions are a good tradeoff.

On Fri, May 29, 2015 at 8:18 AM, Ahmet Arslan wrote:
> Hello List,
>
> When a similarity returns NEGATIVE_INFINITY, hits[i].doc becomes 2147483647.
> Thus, an exception is thrown in the following code:
>
> for (int i = 0; i < hits.length; i++) {
>     int docId = hits[i].doc;
>     Document doc = searcher.doc(docId);
> }
>
> I know it is awkward to return infinity (it comes from log(0)), but the
> exception looks equally awkward and uninformative.
>
> Do you think this is something improvable? Can we do better handling here?
>
> Thanks,
> Ahmet
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
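
For readers of the archive, the sentinel effect described in this thread can be reproduced without Lucene. What follows is only a simplified sketch, not Lucene's actual collector code. The names SentinelDemo, Hit, topDocs, safeLog and MIN_PROB are hypothetical, and the 1e-10 floor is an arbitrary assumption. It assumes a queue pre-filled with entries of doc = Integer.MAX_VALUE and score = -Infinity, and a "not competitive" test of score <= weakest score, which is the behaviour the thread attributes to the collector.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class SentinelDemo {

    /** Minimal stand-in for a scored hit (hypothetical, not Lucene's ScoreDoc). */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Hypothetical floor for the log argument; the value is an assumption. */
    static final double MIN_PROB = 1e-10;

    /**
     * Keeps the best numHits scores in a min-heap that starts out full of
     * sentinel entries, then reports min(totalHits, numHits) doc IDs, best first.
     * Doc IDs are simply the indexes into the scores array.
     */
    static int[] topDocs(float[] scores, int numHits) {
        if (numHits <= 0) return new int[0];
        // Weakest hit at the head: lowest score first, larger doc ID on ties.
        PriorityQueue<Hit> pq = new PriorityQueue<>((a, b) ->
            a.score != b.score ? Float.compare(a.score, b.score)
                               : Integer.compare(b.doc, a.doc));
        for (int i = 0; i < numHits; i++) {
            pq.add(new Hit(Integer.MAX_VALUE, Float.NEGATIVE_INFINITY)); // sentinel
        }
        int totalHits = 0;
        for (int doc = 0; doc < scores.length; doc++) {
            totalHits++; // counted even if the hit is never stored
            // A -Infinity score ties with the sentinel and is rejected here.
            if (scores[doc] <= pq.peek().score) continue;
            pq.poll();
            pq.add(new Hit(doc, scores[doc]));
        }
        int size = Math.min(totalHits, numHits);
        while (pq.size() > size) pq.poll();
        int[] docs = new int[size];
        for (int i = size - 1; i >= 0; i--) docs[i] = pq.poll().doc;
        return docs;
    }

    /** Hypothetical guard: never lets log(0) reach the collector. */
    static float safeLog(double p) {
        // !(p > MIN_PROB) is also true for NaN, zero, and negative inputs.
        double clamped = !(p > MIN_PROB) ? MIN_PROB : p;
        return (float) Math.log(clamped);
    }

    public static void main(String[] args) {
        float bad = (float) Math.log(0.0); // -Infinity
        System.out.println(Arrays.toString(topDocs(new float[] {1.5f, bad}, 2)));
        System.out.println(Arrays.toString(topDocs(new float[] {1.5f, safeLog(0.0)}, 2)));
    }
}
```

In the first call the hit with score -Infinity is counted but never replaces a sentinel, so the sentinel's doc ID 2147483647 is reported in its place. In the second call the clamped score is finite and both real doc IDs come back. Whether clamping is the right fix depends on the similarity; the thread only says that the corner cases were prevented, not how.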