From: Morus Walter
To: "Lucene Users List"
Subject: Re: WildcardTermEnum skipping terms containing numbers?!
Date: Fri, 19 Nov 2004 14:28:56 +0100

Sanyi writes:
> > If there's a bug, it should be tracked down, not worked around...
>
> Sure, but I'm working with 20 million records and it takes about 25 hours to re-index, so I'm
> looking for ways that don't require reindexing.
>
Why reindex?

> My code was:
>
> WildcardTermEnum wcenum = new WildcardTermEnum(reader, term);
>
> while (wcenum.next()) {
>     terms.add(new WeightedTerm(termgroup, wcenum.term().text()));
>     //System.out.println(wcenum.term().text());
> }
>
> And it skipped lots of things it shouldn't have skipped.

As stated at the end of my mail, I'd expect that loop to skip the first term in the enum.
Is that what you are missing, or do you lose more than one term?

Morus
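
To illustrate the point about the skipped first term, here is a minimal, self-contained sketch. It does not use Lucene at all: `PrePositionedEnum` is a hypothetical stand-in that only mimics the assumed behaviour, namely that the enum is already positioned on its first term after construction. Under that assumption, calling `next()` before reading `term()` drops the first term, while reading first and advancing afterwards keeps it. Whether the real WildcardTermEnum returns null from `term()` when nothing matches is likewise an assumption here, so the sketch checks for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TermEnumDemo {

    /** Hypothetical stand-in (NOT a Lucene class): an enumeration that is
     *  already positioned on its first element right after construction. */
    static final class PrePositionedEnum {
        private final List<String> terms;
        private int pos = 0;

        PrePositionedEnum(List<String> terms) { this.terms = terms; }

        /** Current term, or null when there is none. */
        String term() { return pos < terms.size() ? terms.get(pos) : null; }

        /** Advances by one; returns true if a current term exists afterwards. */
        boolean next() { pos++; return pos < terms.size(); }
    }

    /** The pattern from the quoted code: advance first, then read. */
    static List<String> collectWithWhile(List<String> terms) {
        PrePositionedEnum e = new PrePositionedEnum(terms);
        List<String> out = new ArrayList<>();
        while (e.next()) {
            out.add(e.term());
        }
        return out;
    }

    /** Read the current term first, then advance. */
    static List<String> collectWithDoWhile(List<String> terms) {
        PrePositionedEnum e = new PrePositionedEnum(terms);
        List<String> out = new ArrayList<>();
        if (e.term() == null) {
            return out; // nothing matched at all
        }
        do {
            out.add(e.term());
        } while (e.next());
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("a1", "a2", "a3");
        System.out.println(collectWithWhile(terms));   // prints [a2, a3]
        System.out.println(collectWithDoWhile(terms)); // prints [a1, a2, a3]
    }
}
```

If exactly one term per wildcard expansion is missing, the loop order is the likely cause and no reindexing is needed. If more terms are missing, something else is going on.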