lucene-dev mailing list archives

From Steven A Rowe <sar...@syr.edu>
Subject RE: [ANNOUNCE] Lucene Java 2.4.1 released
Date Mon, 09 Mar 2009 21:17:47 GMT
On 3/9/2009 at 5:10 PM, Michael McCandless wrote:
> OK this is now fixed.  Thanks Steve!

You've proven wrong the assertion that getting encoding right is a thankless task :).

Steve

> Steven A Rowe wrote:
> 
> > Hi Mike,
> >
> > On 3/9/2009 at 2:34 PM, Michael McCandless wrote:
> >> See changes at
> >> http://lucene.apache.org/java/2_4_1/changes/Changes.html
> >
> > Minor nit: the encoding of Christian Kohlschütter's name in the
> > 2.4.1 section of CHANGES.txt appears to be Latin-1, but
> > changes2html.pl assumes that CHANGES.txt is encoded as UTF-8, so the
> > resulting Changes.html has an improperly encoded "ü" (lowercase "u"
> > with an umlaut):
> >
> >    14. LUCENE-1186: Add Analyzer.close() to free internal ThreadLocal
> >    resources.
> >    (Christian Kohlsch�tter via Mike McCandless)
> >
> > For me, both in the web browser and in the excerpt from it that I've
> > pasted above, instead of a lowercase "u" with an umlaut, I see a
> > small white question mark on a black diamond background, indicating
> > an invalid UTF-8 byte sequence: byte 0xFC, marking the beginning of
> > a multi-byte sequence, but then no trailing bytes with the high bit
> > set.
> >
> > Anyway, I think the fix is simple: edit CHANGES.txt so that
> > "Kohlschütter" is properly encoded as UTF-8, as the remainder of the
> > file is, then regenerate Changes.html.
> >
> > Steve
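
The mismatch described in the quoted message can be reproduced in a few lines. This is a minimal sketch in Python, not the code in changes2html.pl; the helper name latin1_to_utf8 is hypothetical and only illustrates the re-encoding step suggested as the fix.

```python
# Latin-1 encodes "ü" (U+00FC) as the single byte 0xFC.
name = "Kohlsch\u00fctter"
latin1_bytes = name.encode("latin-1")

# Reading those bytes as UTF-8 fails at 0xFC. With errors="replace" the
# decoder substitutes U+FFFD, the replacement character that browsers
# draw as a question mark on a black diamond.
garbled = latin1_bytes.decode("utf-8", errors="replace")


def latin1_to_utf8(data: bytes) -> bytes:
    """Hypothetical helper: re-encode Latin-1 bytes as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


# In UTF-8, "ü" becomes the two-byte sequence 0xC3 0xBC.
fixed_bytes = latin1_to_utf8(latin1_bytes)

print(repr(garbled))
print(fixed_bytes)
```

Because the rest of CHANGES.txt is reported to be UTF-8 already, a conversion like this would apply only to the affected name; running it over text that is already UTF-8 would double-encode the non-ASCII characters.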