lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: TestIndexInput test failures on jdk 1.6/linux after r641303
Date Mon, 05 Jan 2009 21:10:33 GMT

In fact I think the 2 test cases of the "modified UTF8 null bytes" are
just bogus, because they are using Java's UTF8 charset decoder to
construct a String when (as Ken points out) the byte sequence 0xC0
0x80 is illegal UTF8.
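
For anyone who wants to see this outside the Lucene test, here is a minimal standalone sketch (not the TestIndexInput code). The class and method names are made up for illustration, and it assumes a Java 8 or later runtime for StandardCharsets and UncheckedIOException, which post-date this thread. It feeds 0xC0 0x80 to the standard UTF-8 decoder, once leniently and once strictly, and then round-trips U+0000 through DataOutputStream.writeUTF/DataInputStream.readUTF, which use the JDK's own modified UTF-8. How many replacement characters the lenient decode yields may vary by JDK version, so the sketch does not rely on that.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from Lucene.
final class ModifiedUtf8NullDemo {

    // Modified UTF-8 encodes U+0000 as two bytes instead of a single 0x00.
    static final byte[] MODIFIED_NULL = {(byte) 0xC0, (byte) 0x80};

    // Lenient decode: malformed input is replaced instead of reported.
    static String lenientDecode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict decode: returns false when the decoder reports malformed input.
    static boolean isWellFormedUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Serialize with writeUTF: a 2-byte length prefix, then modified UTF-8.
    static byte[] writeUtf(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Read back a string written by writeUtf.
    static String readUtf(byte[] bytes) {
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String lenient = lenientDecode(MODIFIED_NULL);
        System.out.println("decodes to U+0000: " + lenient.equals("\u0000"));
        System.out.println("well-formed UTF-8: " + isWellFormedUtf8(MODIFIED_NULL));

        byte[] written = writeUtf("\u0000");
        StringBuilder hex = new StringBuilder();
        for (byte b : written) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println("writeUTF bytes:    " + hex.toString().trim());
        System.out.println("readUTF round trip: " + readUtf(written).equals("\u0000"));
    }
}
```

The point is only that the plain UTF-8 decoder and a modified-UTF-8 reader disagree about these two bytes, so a test that builds its expected String via the UTF-8 charset decoder cannot expect a U+0000 back.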

I'll remove those 2 test cases.

Mike

Mark Miller wrote:

> I think you're on the right tack, Ken. I don't know enough about Unicode
> myself, but I was looking at this this morning, and what you say
> somewhat jibes with what I saw.
>
> I don't think you can just flip that switch though - the index
> format will not match what it's trying to read (having been written
> in the new format). Which is why that couldn't have been intended to
> test reading the old format, unless it was a mistake (it was also
> added when the format changed, so it's not like it was left around).
>
> Possibly a mistake that works when your Unicode support is older?
> (I'm on Ubuntu 8.10 - not sure what that means for my Unicode level.)
> That Unicode version comment looks very interesting - no one in
> America (that's mentioned it) is really noticing this problem, and I
> think Sami is in Europe.
>
> McCandless knows what's wrong, I'm sure (he did that patch), but he's
> either busy fixing it, or ordering another margarita in Tijuana.
>
> In either case, I'm sure the issue will be resolved soon.
>
> - Mark
>
> Ken Krugler wrote:
>>> Ok, it's not a java 1.6 thing it's something else. I also found a  
>>> box that runs that test ok.
>>
>> From what I can tell, this is the test that's failing:
>>
>> http://www.krugle.org/kse/entfiles/lucene/apache.org/java/trunk/src/test/org/apache/lucene/index/TestIndexInput.java#89
>>
>> This is verifying that the "Modified UTF-8 null bytes" sequence is  
>> handled properly, from line 63 in the same file.
>>
>> I think this is the old, deprecated format for pre-2.4 indexes.
>>
>> So shouldn't there be a call to setModifiedUTF8StringsMode()? And
>> since this is a one-way setting of the preUTF8Strings flag, it
>> feels like this should be in a separate test.
>>
>> Without this call, you'll get the result of calling the String  
>> class's default constructor with an ill-formed UTF-8 sequence (for  
>> Unicode 3.1 or later), since 0xC0 0x80 isn't the shortest form for  
>> the U+0000 code point.
>>
>> -- Ken
>>
>>
>>> Mark Miller wrote:
>>>> Hey Sami, I've been running tests quite a bit recently with  
>>>> Ubuntu 8.10  and OpenJDK 6 on a 64-bit machine, and I have not  
>>>> seen it once.
>>>> Just tried again with Sun JDK 6 and 5 32-bit as well, and I am  
>>>> still not seeing it.
>>>>
>>>> Odd.
>>>>
>>>> - Mark
>>>>
>>>> Sami Siren wrote:
>>>>> I am constantly seeing the following error when running "ant test":
>>>>>
>>>>>   [junit] Testcase: testRead(org.apache.lucene.index.TestIndexInput):    FAILED
>>>>>   [junit] expected:<[]> but was:<[??]>
>>>>>   [junit] junit.framework.ComparisonFailure: expected:<[]> but was:<[??]>
>>>>>   [junit]     at org.apache.lucene.index.TestIndexInput.testRead(TestIndexInput.java:89)
>>>>>
>>>>> on both intel and amd architectures running linux.
>>>>>
>>>>> java on AMD:
>>>>> java version "1.6.0_11"
>>>>> Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
>>>>> Java HotSpot(TM) 64-Bit Server VM (build 11.0-b16, mixed mode)
>>>>>
>>>>> java on Intel:
>>>>> java version "1.6.0_0"
>>>>> IcedTea6 1.4 (fedora-7.b12.fc10-x86_64) Runtime Environment (build 1.6.0_0-b12)
>>>>> OpenJDK 64-Bit Server VM (build 10.0-b19, mixed mode)
>>>>>
>>>>> java version "1.6.0_11"
>>>>> Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
>>>>> Java HotSpot(TM) 64-Bit Server VM (build 11.0-b16, mixed mode)
>>>>>
>>>>> java version "1.6.0_11"
>>>>> Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
>>>>> Java HotSpot(TM) Server VM (build 11.0-b16, mixed mode)
>>>>>
>>>>> Anyone else seeing this?
>>>>>
>>>>> -- 
>>>>> Sami Siren
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>>

