Return-Path:
Delivered-To: apmail-lucene-commits-archive@www.apache.org
Received: (qmail 61342 invoked from network); 26 Jul 2010 19:32:27 -0000
Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 26 Jul 2010 19:32:27 -0000
Received: (qmail 37063 invoked by uid 500); 26 Jul 2010 19:32:27 -0000
Mailing-List: contact commits-help@lucene.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@lucene.apache.org
Delivered-To: mailing list commits@lucene.apache.org
Received: (qmail 37056 invoked by uid 99); 26 Jul 2010 19:32:27 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 26 Jul 2010 19:32:27 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED
X-Spam-Check-By: apache.org
Received: from [140.211.11.4] (HELO eris.apache.org) (140.211.11.4) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 26 Jul 2010 19:32:26 +0000
Received: by eris.apache.org (Postfix, from userid 65534) id A00E523889E0; Mon, 26 Jul 2010 19:31:34 +0000 (UTC)
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: svn commit: r979415 - /lucene/dev/branches/preflexfixes/lucene/src/java/org/apache/lucene/index/codecs/preflex/PreFlexFields.java
Date: Mon, 26 Jul 2010 19:31:34 -0000
To: commits@lucene.apache.org
From: mikemccand@apache.org
X-Mailer: svnmailer-1.0.8
Message-Id: <20100726193134.A00E523889E0@eris.apache.org>

Author: mikemccand
Date: Mon Jul 26 19:31:34 2010
New Revision: 979415

URL: http://svn.apache.org/viewvc?rev=979415&view=rev
Log:
LUCENE-2554: add comment explaining why we can't assert valid UTF8 when dancing

Modified:
    lucene/dev/branches/preflexfixes/lucene/src/java/org/apache/lucene/index/codecs/preflex/PreFlexFields.java

Modified: lucene/dev/branches/preflexfixes/lucene/src/java/org/apache/lucene/index/codecs/preflex/PreFlexFields.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/preflexfixes/lucene/src/java/org/apache/lucene/index/codecs/preflex/PreFlexFields.java?rev=979415&r1=979414&r2=979415&view=diff
==============================================================================
--- lucene/dev/branches/preflexfixes/lucene/src/java/org/apache/lucene/index/codecs/preflex/PreFlexFields.java (original)
+++ lucene/dev/branches/preflexfixes/lucene/src/java/org/apache/lucene/index/codecs/preflex/PreFlexFields.java Mon Jul 26 19:31:34 2010
@@ -290,9 +290,10 @@ public class PreFlexFields extends Field
             // unicode character:
             assert isHighBMPChar(term.bytes, pos);
 
-            // TODO: understand why this assert sometimes (rarely)
-            // trips!
-            // assert term.length >= pos + 3: "term.length=" + term.length + " pos+3=" + (pos+3);
+            // NOTE: we cannot make this assert, because
+            // AutomatonQuery legitimately sends us malformed UTF8
+            // (eg the UTF8 bytes with just 0xee)
+            // assert term.length >= pos + 3: "term.length=" + term.length + " pos+3=" + (pos+3) + " byte=" + Integer.toHexString(term.bytes[pos]) + " term=" + term.toString();
 
             // Save the bytes && length, since we need to
             // restore this if seek "back" finds no matching
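
For illustration only, here is a minimal standalone sketch of why the commented-out assert can trip. In UTF-8, 0xee is the lead byte of a 3-byte sequence, so a term consisting of just that byte is shorter than pos + 3. The class and method names below (Utf8LeadByteDemo, expectedLength, hasFullSequence) are hypothetical and are not part of PreFlexFields; the check simply mirrors the shape of the disabled assert under that assumption.

```java
class Utf8LeadByteDemo {
    // Hypothetical helper: number of bytes a well-formed UTF-8 sequence
    // needs, judged from its lead byte alone (-1 if not a valid lead byte).
    static int expectedLength(byte lead) {
        int b = lead & 0xFF;
        if (b < 0x80) return 1;                 // ASCII
        if (b >= 0xC2 && b <= 0xDF) return 2;   // 2-byte sequence
        if (b >= 0xE0 && b <= 0xEF) return 3;   // 3-byte sequence (includes 0xee)
        if (b >= 0xF0 && b <= 0xF4) return 4;   // 4-byte sequence
        return -1;                              // continuation byte or invalid lead
    }

    // Hypothetical helper with the same shape as the disabled assert:
    // is the term long enough to hold the full sequence starting at pos?
    static boolean hasFullSequence(byte[] bytes, int length, int pos) {
        int need = expectedLength(bytes[pos]);
        return need > 0 && length >= pos + need;
    }

    public static void main(String[] args) {
        // Malformed term: just the lead byte 0xee, as in the commit comment.
        byte[] truncated = { (byte) 0xEE };
        // Well-formed term: U+E000 encoded as EE 80 80.
        byte[] complete = { (byte) 0xEE, (byte) 0x80, (byte) 0x80 };

        System.out.println(hasFullSequence(truncated, truncated.length, 0)); // prints false
        System.out.println(hasFullSequence(complete, complete.length, 0));   // prints true
    }
}
```

With the single-byte term, length is 1 and pos + 3 is 3, so a check of the form term.length >= pos + 3 fails even though the caller sent that term on purpose.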