lucene-dev mailing list archives

From Steven A Rowe
Subject RE: [ANNOUNCE] Lucene Java 2.4.1 released
Date Mon, 09 Mar 2009 19:49:38 GMT
Hi Mike,

On 3/9/2009 at 2:34 PM, Michael McCandless wrote:
> See changes at

Minor nit: Christian Kohlschütter's name in the 2.4.1 section of CHANGES.txt appears to be
encoded as Latin-1, but the conversion to Changes.html assumes that CHANGES.txt is encoded
as UTF-8, so the resulting Changes.html has an improperly encoded "ü" (lowercase "u" with an umlaut):

    14. LUCENE-1186: Add Analyzer.close() to free internal ThreadLocal
    (Christian Kohlsch�tter via Mike McCandless)

For me, both in the web browser and in the excerpt from it that I've pasted above, the
lowercase "u" with an umlaut shows up as a small white question mark on a black diamond
background, indicating an invalid UTF-8 byte sequence: byte 0xFC looks like the beginning of a
multi-byte sequence, but it is not followed by any trailing bytes with the high bit set.
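
A minimal Python sketch of what is probably happening, assuming the name was saved as Latin-1 bytes and later read as UTF-8. The variable names are illustrative only and are not part of the Lucene build:

```python
# Assumption: the name was written into CHANGES.txt as Latin-1 bytes.
name = "Kohlsch\u00fctter"  # \u00fc is lowercase "u" with an umlaut

latin1_bytes = name.encode("latin-1")  # the umlaut becomes the single byte 0xFC
utf8_bytes = name.encode("utf-8")      # the umlaut becomes the two bytes 0xC3 0xBC

# Reading the Latin-1 bytes as UTF-8 hits the stray 0xFC. With
# errors="replace" the decoder substitutes U+FFFD, the replacement
# character that browsers typically draw as a question mark in a diamond.
misread = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes)
print(utf8_bytes)
print(misread)
```

The byte after 0xFC is the plain ASCII "t" (0x74), which does not have its high bit set, so the sequence cannot be valid UTF-8.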

Anyway, I think the fix is simple: edit CHANGES.txt so that "Kohlschütter" is properly encoded
as UTF-8, as the remainder of the file is, then regenerate Changes.html.
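
For a single name, editing the file by hand is likely the simplest route. If more stray Latin-1 lines turned up, a rough sketch of an automated repair might look like the following. The function name `reencode_as_utf8` is hypothetical, and the sketch assumes that every line is either entirely valid UTF-8 or entirely Latin-1:

```python
def reencode_as_utf8(raw: bytes) -> bytes:
    """Return raw with any line that is not valid UTF-8 reinterpreted as Latin-1.

    Hypothetical helper: assumes no single line mixes the two encodings.
    """
    fixed_lines = []
    for line in raw.splitlines(keepends=True):
        try:
            text = line.decode("utf-8")    # line is already valid UTF-8
        except UnicodeDecodeError:
            text = line.decode("latin-1")  # fall back for the stray lines
        fixed_lines.append(text.encode("utf-8"))
    return b"".join(fixed_lines)


# Example: one line that is already UTF-8, one that carries a Latin-1 byte.
sample = "caf\u00e9\n".encode("utf-8") + b"(Christian Kohlsch\xfctter via Mike McCandless)\n"
repaired = reencode_as_utf8(sample)
print(repaired.decode("utf-8"))
```

After a repair like this, Changes.html would still need to be regenerated from the corrected CHANGES.txt.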

