> > Hmm. It still happens that different JREs (?) produce different iso-2022-jp
> > output (i.e. any time someone builds all and diffs, he gets .ja.jis diffs).
>
> Well, at least mine removes bogus escape sequences and
> produces more desirable output, but yeah, it still happens.
Over the last few months, I've run into some bugs in the Sun JRE's
implementation of the iso-2022-jp charset converter, but I think the
converter will be stable soon.
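
To illustrate the kind of difference in question, here is a minimal
sketch. The class name and both byte arrays are hypothetical, built by
hand rather than taken from any real JRE's output. One stream contains a
redundant escape sequence, so the bytes differ while the decoded text
should be the same.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class Iso2022JpDiff {
    static final Charset JIS = Charset.forName("ISO-2022-JP");

    // Hand-built bytes for the two hiragana U+3042 U+3044.
    // ESC $ B (1B 24 42) switches to JIS X 0208,
    // ESC ( B (1B 28 42) switches back to ASCII.
    static final byte[] MINIMAL = {
        0x1B, 0x24, 0x42, 0x24, 0x22, 0x24, 0x24, 0x1B, 0x28, 0x42
    };

    // Same text, with a redundant switch to ASCII and back
    // between the two characters (an illustrative "bogus" escape).
    static final byte[] REDUNDANT = {
        0x1B, 0x24, 0x42, 0x24, 0x22, 0x1B, 0x28, 0x42,
        0x1B, 0x24, 0x42, 0x24, 0x24, 0x1B, 0x28, 0x42
    };

    // True when both byte streams decode to the same text.
    static boolean sameText(byte[] a, byte[] b) {
        return new String(a, JIS).equals(new String(b, JIS));
    }

    public static void main(String[] args) {
        System.out.println("same bytes: " + Arrays.equals(MINIMAL, REDUNDANT));
        System.out.println("same text:  " + sameText(MINIMAL, REDUNDANT));
    }
}
```

A byte-level diff flags the two streams as different, while a comparison
of the decoded text should not.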
I'm working on the input XML files, and I'm not watching the generated
html files. I feel the diffs in the html files are less important than
those in the xml files.
I can simply ignore the diffs in the generated html files for now.
I don't understand what harm the diffs do to us, so may I ask for the
reasons?
> > I'd suggest finally switching the transformation to shift_jis, which is more
> > stable (because it has none of these problematic escape sequences).
>
> I'd rather use euc-jp than shift_jis. For one thing,
> shift_jis is a nightmare for auto detection since almost any
> byte sequence can represent a valid character. If I choose
> among the three major character encoding schemes in Japan, I
> always choose euc-jp. It doesn't have the quirks sjis has. The
> fact that the current one uses iso-2022-jp is just for legacy
> reasons.
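
The auto-detection point above can be sketched as follows. This is only
an illustration with a hypothetical class and helper name: bytes that
were written as euc-jp also pass a strict shift_jis decode, just as
different characters, so a detector cannot reject shift_jis on validity
alone.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

class SjisAmbiguity {
    // Strict decode: returns null on malformed or unmappable input
    // instead of silently substituting a replacement character.
    static String strictDecode(byte[] bytes, String charsetName) {
        try {
            return Charset.forName(charsetName).newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Hiragana A (U+3042) encoded as EUC-JP.
        byte[] eucBytes = "\u3042".getBytes(Charset.forName("EUC-JP"));
        // Decoded with the right charset: the original character.
        System.out.println(strictDecode(eucBytes, "EUC-JP"));
        // Decoded as Shift_JIS: no error, each byte is read as a
        // single-byte half-width katakana-range character instead.
        System.out.println(strictDecode(eucBytes, "Shift_JIS"));
    }
}
```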
IMHO, whichever charset we choose, we will face this kind of problem to
some degree.
# I myself prefer UTF-8. :-)
## Because it supports a wide range of characters.
But shift_jis is actually the worse choice, because there are well-known
issues around the Shift_JIS and CP932 charsets.
The alias definitions have changed repeatedly between Java releases.
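
One of those well-known issues is the "wave dash" mapping. The sketch
below (hypothetical class and method names) decodes the same two bytes
with both charsets and prints the aliases the running JRE reports.
I would expect U+301C for Shift_JIS and U+FF5E for windows-31j on recent
JREs; a release that aliases Shift_JIS to the CP932 table would print
something else, which is exactly the problem.

```java
import java.nio.charset.Charset;

class SjisVsCp932 {
    // Decode the bytes 0x81 0x60 (the "wave dash" position)
    // with the named charset.
    static String decodeWaveDash(String charsetName) {
        byte[] bytes = {(byte) 0x81, 0x60};
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Shift_JIS", "windows-31j"}) {
            Charset cs = Charset.forName(name);
            String decoded = decodeWaveDash(name);
            // The canonical name and aliases depend on the running JRE.
            System.out.println(name + " -> " + cs.name() + " " + cs.aliases());
            System.out.println("  0x81 0x60 decodes to "
                    + String.format("U+%04X", decoded.codePointAt(0)));
        }
    }
}
```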
I have no strong preference for which charset to use (as long as it is
not shift_jis).
---Hiroaki Kawai