> Problem: HTML formatter writes numbered entity references
> instead of characters in output encoding (specified in
> "xsl:output" tag), despite the fact that output encoding
> supports these characters.
It shouldn't affect how the page is displayed, because any web browser
should render a numeric character reference the same way as the
character itself.
> Formatter writes numbered entity reference for the
> character if the character is greater than the maximum
> character for the encoding (m_maxCharacter), which is
> always 0x7F for non-standard encodings
> (XalanTranscodingServices::getMaximumCharacterValue).
> This makes HTML documents incredibly large when custom
> encoding, added with XMLTransService::addEncoding() is
> used. Produced documents contain numbered entity reference
> for every locale-specific character, because they all have
> codes >0x007Fu.
Yes, this is a known problem with the design of the serializers. I
started working on this about a year ago, but it has not been a high
priority, because very few people have complained about it. If the size
of the generated files is a problem, you might want to choose UTF-8 as
the output encoding. Technically, XSLT processors are only required to
support UTF-8 and UTF-16, and fixing this has a potentially significant
performance impact on serialization, because it requires us to look up
every character to determine whether the target encoding can represent it.
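
To make the behaviour concrete, here is a rough sketch of the decision
described above. It is not the actual Xalan serializer code:
escapeAboveMax is a hypothetical helper, and it works on code points
rather than transcoded bytes. Anything above the maximum character value
is written as a numeric character reference.

```cpp
#include <string>

// Hypothetical helper for illustration only; not part of Xalan.
// Copies characters up to maxCharacter unchanged and replaces every
// other character with a decimal numeric character reference.
std::u32string escapeAboveMax(const std::u32string& text, char32_t maxCharacter)
{
    std::u32string out;
    for (char32_t ch : text) {
        if (ch <= maxCharacter) {
            // Assumed representable in the target encoding.
            out.push_back(ch);
        } else {
            // Emit "&#NNNN;" using the decimal code point value.
            out += U"&#";
            for (char d : std::to_string(static_cast<unsigned long>(ch))) {
                out.push_back(static_cast<char32_t>(d));
            }
            out.push_back(U';');
        }
    }
    return out;
}
```

With a maximum of 0x7F, a single Cyrillic letter such as U+0416 becomes
the seven characters "&#1046;", which is why the output grows so much
for locale-specific text. A real fix would have to replace the simple
comparison with a per-character query against the target encoding, and
that query is where the performance cost comes from.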
> If this is a bug, can somebody register it? I couldn't do
> this through JIRA web interface.
As long as you're registered, you should be able to create a bug report.
Dave