From: "Uwe Schindler"
Subject: RE: Redundant fields Token class?
Date: Sat, 14 Nov 2009 00:01:11 +0100

The two are not coupled: termLength() is the number of chars in the term
buffer, whereas the offsets are positions in the original char stream. If
you use a CharFilter to, e.g., remove chars, the termLength gets shorter,
but the offsets are still the original ones. The two are also indexed in
different ways. termLength and the offsets have no relation to each other
and (as said before) need not even follow a contract like
end - start == length.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Babak Farhang [mailto:farhang@gmail.com]
> Sent: Friday, November 13, 2009 11:50 PM
> To: java-user@lucene.apache.org
> Subject: Redundant fields Token class?
>
> I'm writing a TokenFilter and am confused about why class Token has
> both an *endOffset* and a *termLength* field. It would appear that
> the following invariant should always hold for a Token instance:
>
> termLength() == endOffset() - startOffset()
>
> If so, then
>
> 1) Why 2 fields, instead of 1?
> 2) Why isn't the invariant enforced in the class?
>
> -Babak

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
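
A rough sketch of why the proposed invariant need not hold. It does not use Lucene at all: SimpleToken and tokenize() are hypothetical stand-ins invented for this illustration, and the hyphen stripping only imitates what a char-removing filter might do. The term text is built from the filtered chars, while the offsets keep pointing into the original input.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetDemo {

    /** Hypothetical stand-in for a token; this is NOT Lucene's Token class. */
    static final class SimpleToken {
        final String term;      // filtered term text (what would be indexed)
        final int startOffset;  // start position in the ORIGINAL input
        final int endOffset;    // end position (exclusive) in the ORIGINAL input

        SimpleToken(String term, int startOffset, int endOffset) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        int termLength() {
            return term.length();
        }
    }

    /**
     * Splits on whitespace and drops hyphens from the term text,
     * but records offsets against the unmodified input string.
     */
    static List<SimpleToken> tokenize(String text) {
        List<SimpleToken> tokens = new ArrayList<>();
        int i = 0;
        int n = text.length();
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            int start = i;
            StringBuilder term = new StringBuilder();
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                char c = text.charAt(i);
                if (c != '-') {
                    term.append(c); // removed chars shorten the term only
                }
                i++;
            }
            if (term.length() > 0) {
                tokens.add(new SimpleToken(term.toString(), start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (SimpleToken t : tokenize("co-operate now")) {
            int span = t.endOffset - t.startOffset;
            System.out.println(t.term + ": termLength=" + t.termLength()
                    + ", offsets=" + t.startOffset + "-" + t.endOffset
                    + ", span=" + span);
        }
    }
}
```

For the input "co-operate now", the first token has a term of 9 chars but spans 10 chars of the original text, so termLength() and endOffset - startOffset differ. The second token happens to satisfy the equality, which is why the invariant can look plausible on simple input.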