Subject: svn commit: r841657 - in /websites/staging/xmlgraphics/trunk/content: ./ fop/fo.html
Date: Tue, 11 Dec 2012 09:41:21 -0000
To: commits@xmlgraphics.apache.org
From: buildbot@apache.org

Author: buildbot
Date: Tue Dec 11 09:41:20 2012
New Revision: 841657

Log:
Staging update by buildbot for xmlgraphics

Modified:
    websites/staging/xmlgraphics/trunk/content/ (props changed)
    websites/staging/xmlgraphics/trunk/content/fop/fo.html

Propchange: websites/staging/xmlgraphics/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Tue Dec 11 09:41:20 2012
@@ -1 +1 @@
-1420048
+1420053

Modified: websites/staging/xmlgraphics/trunk/content/fop/fo.html
==============================================================================
--- websites/staging/xmlgraphics/trunk/content/fop/fo.html (original)
+++ websites/staging/xmlgraphics/trunk/content/fop/fo.html Tue Dec 11 09:41:20 2012
@@ -349,14 +349,14 @@ $(document).ready(function () {

Special Characters

When entering special (non-ASCII) characters in XML, the general rule is to use the applicable Unicode character instead of trying to use a character entity as you would with HTML. Remember that HTML is an SGML document type. SGML has a limited character set, which requires it to use character entities to represent special characters. One of the improvements of XML over SGML (and thus HTML) is native support for Unicode. Basic XML has only a handful of character entities, primarily because it doesn't really need more.

Entities such as &uuml; (u with an umlaut), which work in HTML, will be flagged as undefined entities unless you define them yourself in your DTD. Use the corresponding Unicode character instead. A list of predefined HTML entities and their Unicode codepoints can be found at Character entity references in HTML 4.
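As a rough illustration of the two options described above, the following Python sketch parses the same word three ways: with an entity declared in an internal DTD subset, with the literal Unicode character, and with a numeric character reference. The sample word and the use of Python's standard-library parser are assumptions made for the example; FOP itself is not involved.

```python
import xml.etree.ElementTree as ET

# Option 1: declare the HTML-style entity yourself in an internal DTD subset.
with_dtd = '<!DOCTYPE p [<!ENTITY uuml "&#252;">]><p>M&uuml;nchen</p>'

# Option 2 (usually simpler): write the Unicode character directly,
# or use its numeric character reference.
literal = "<p>München</p>"
numeric = "<p>M&#252;nchen</p>"

# All three documents should yield the same text content.
texts = [ET.fromstring(doc).text for doc in (with_dtd, literal, numeric)]
print(texts)
```

Writing the character directly keeps the source readable, provided the file is saved in the encoding named in its XML declaration.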

-

One common example is &nbsp;, used to obtain a non-breaking space in HTML. In XML, use   instead.

+

One common example is &nbsp;, used to obtain a non-breaking space in HTML. In XML, use &#160; or &#xa0; instead.
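A small Python sketch of this behavior, assuming the standard-library ElementTree parser as a stand-in for whatever XML parser your toolchain uses: both numeric forms resolve to U+00A0, while the HTML entity name is rejected.

```python
import xml.etree.ElementTree as ET

# Both numeric forms refer to U+00A0 NO-BREAK SPACE.
decimal = ET.fromstring("<p>10&#160;km</p>").text
hexadecimal = ET.fromstring("<p>10&#xa0;km</p>").text
print(decimal == hexadecimal == "10\u00a0km")

# The HTML entity name is not predefined in XML, so the parser rejects it.
try:
    ET.fromstring("<p>10&nbsp;km</p>")
    nbsp_accepted = True
except ET.ParseError as err:
    nbsp_accepted = False
    print("rejected:", err)
```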

For other non-ASCII characters, such as the Euro symbol, checkbox, etc., see the Unicode Reference By Name document that is found at the Unicode Consortium site.

-

After finding the correct Unicode codepoint to represent the character, use XML Character References to put the character into your source XML, XSLT or FO. See the non-breaking-space comments above for an example of the syntax using decimal notation. The following hexadecimal example will result in a Euro sign:
-€
-Getting your XML correctly encoded is only part of the job. If you want the character to display or print correctly (and you probably do), then the selected font must contain the necessary glyph. Because of differences between font encoding methods, and limitations in some font technologies, this can be a troublesome issue, especially for symbol characters. The FOP example file Base-14 Font Character Mapping is a very useful resource in sorting these issues out for the Base-14 fonts. For other fonts, use font editing software or operating system utilities (such as the Character Map in most Windows platforms) to determine what characters the font supports.

+

After finding the correct Unicode codepoint to represent the character, use XML Character References to put the character into your source XML, XSLT or FO. See the non-breaking-space comments above for an example of the syntax using decimal notation. The following hexadecimal example will result in a Euro sign:

+

&#x20AC;
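To show how a codepoint maps to the reference syntax, here is a minimal Python sketch. The helper name `char_ref` and the sample `fo:block` content are hypothetical, introduced only for illustration; the parser is Python's standard library, not FOP.

```python
import xml.etree.ElementTree as ET

def char_ref(ch: str) -> str:
    """Return the hexadecimal XML character reference for one character.

    Hypothetical helper, not part of FOP or any XML library.
    """
    return f"&#x{ord(ch):X};"

euro_ref = char_ref("\u20ac")  # U+20AC EURO SIGN
print(euro_ref)

# Embed the reference in a small FO fragment and parse it back.
block = (
    "<fo:block xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
    f"Price: 10 {euro_ref}</fo:block>"
)
parsed = ET.fromstring(block).text
print(parsed)
```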

+

Getting your XML correctly encoded is only part of the job. If you want the character to display or print correctly (and you probably do), then the selected font must contain the necessary glyph. Because of differences between font encoding methods, and limitations in some font technologies, this can be a troublesome issue, especially for symbol characters. The FOP example file Base-14 Font Character Mapping is a very useful resource in sorting these issues out for the Base-14 fonts. For other fonts, use font editing software or operating system utilities (such as the Character Map in most Windows platforms) to determine what characters the font supports.

An alternative to encoding the character and making it available through a font is to use an embedded graphic to represent the character: GIF, PNG, SVG, etc.

Entity Characters

-

The handful of basic XML character entities that do exist are the ampersand, less-than, greater-than, apostrophe (single-quote), and double-quote characters. These are needed to distinguish markup tags from content, and to distinguish character entities from content. To avoid parser complaints about illegal characters and entities in your input, ensure that ampersands in text and attributes are written as &, "<" is written as <, and ">" as >. It is not necessary everywhere, but it is wise to do so anyway, just to be sure.

+

The handful of basic XML character entities that do exist are the ampersand, less-than, greater-than, apostrophe (single-quote), and double-quote characters. These are needed to distinguish markup tags from content, and to distinguish character entities from content. To avoid parser complaints about illegal characters and entities in your input, ensure that ampersands in text and attributes are written as &amp;, "<" is written as &lt;, and ">" as &gt;. It is not necessary everywhere, but it is wise to do so anyway, just to be sure.
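If the XML is generated by a program, the escaping can be delegated to a library rather than done by hand. The sketch below uses Python's `xml.sax.saxutils.escape` as one possible approach; the sample string and the `note` attribute name are made up for the example.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = 'if a < b && b > c then say "AT&T"'

# escape() rewrites &, < and > for use in element text.
text_safe = escape(raw)
print(text_safe)

# For attribute values, also escape the quote characters.
attr_safe = escape(raw, {'"': "&quot;", "'": "&apos;"})

# Parsing the escaped forms gives back the original string.
round_trip = ET.fromstring(f"<p>{text_safe}</p>").text
elem = ET.fromstring(f'<p note="{attr_safe}"/>')
print(round_trip == raw, elem.get("note") == raw)
```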

Most XML parsers will provide a line number and sometimes a column number for offending characters.
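As one example of such reporting, Python's standard-library parser exposes the location on the exception it raises. The input below is a deliberately broken, made-up fragment with an unescaped ampersand; other parsers report positions in their own ways.

```python
import xml.etree.ElementTree as ET

# An unescaped ampersand on the second line makes this input ill-formed.
bad_fo = """<root>
  <block>Fish & chips</block>
</root>"""

try:
    ET.fromstring(bad_fo)
    position = None
except ET.ParseError as err:
    # position is a (line, column) tuple pointing at the offending input.
    position = err.position
    print(f"line {position[0]}, column {position[1]}: {err}")
```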

Review the XML Specification or a good tutorial for details of the XML file format.

Encoding Issues
