commons-issues mailing list archives

From "Nguyen Thanh Son Daniel (JIRA)" <j...@apache.org>
Subject [jira] Created: (DIGESTER-120) digesting xml content with NodeCreateRule swallows spaces.
Date Wed, 12 Mar 2008 12:10:46 GMT
digesting xml content with NodeCreateRule swallows spaces.
----------------------------------------------------------

                 Key: DIGESTER-120
                 URL: https://issues.apache.org/jira/browse/DIGESTER-120
             Project: Commons Digester
          Issue Type: Bug
    Affects Versions: 1.8
         Environment: jdk 1.4.2_08, digester 1.8
            Reporter: Nguyen Thanh Son Daniel


I need to process an XML file that contains entities (numeric character references), e.g.:

<?xml version="1.0" encoding="UTF-8"?>
<top>
<body>&#65; &#65;</body>
</top>

I'm using Digester as follows:

Digester digester = new Digester ();
digester.addRule ("top", new ObjectCreateRule (MyContent.class));
digester.addRule ("top/body", new NodeCreateRule ());
digester.addSetNext ("top/body", "setBody");

Then:
...
digester.parse (file);

The MyContent class serializes the node to text as follows:

public class MyContent
{
 public void setBody (Element node)
 {
  String content = serializeNode (node);
  System.out.println (content);
 }
 ...
}

In this case the content displayed is: <body>AA</body>
The space between the two characters is lost.

If the body were written in the XML file as <top><body>A A</body></top>,
the content would be displayed correctly as:
<body>A A</body>
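
To compare how a SAX parser delivers the text of the two inputs above, a small probe along the following lines may help. It is only a sketch: the class name CharactersProbe and the method chunks are made up for illustration and are not part of Digester. It records every chunk passed to characters() by the JDK's default SAX parser. How the text is split is up to the parser, so the chunk lists may differ between parsers; only the concatenated text is guaranteed to be the same.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Digester: logs the chunks a SAX parser
// hands to characters() for a given XML string.
class CharactersProbe {

    // Parses the XML and returns each chunk passed to characters(), in order.
    static List<String> chunks(String xml) {
        final List<String> seen = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                seen.add(new String(ch, start, length));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new RuntimeException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // The split into chunks is parser-dependent, so no output is promised here.
        System.out.println(chunks("<top><body>&#65; &#65;</body></top>"));
        System.out.println(chunks("<top><body>A A</body></top>"));
    }
}
```

If the first list shows the space as a chunk of its own, any per-chunk whitespace test will see it in isolation.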

Looking at the NodeCreateRule.NodeBuilder.characters() implementation, the following code
causes the problem:
String str = new String(ch, start, length);
if (str.trim().length() > 0) {
 top.appendChild(doc.createTextNode(str));
}

When entities are used, the characters() method is called three times, for 'A', ' ' and 'A',
and the whitespace-only ' ' chunk is dropped by the trim() check. In the second case, it is
called once with 'A A', so the space survives.
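
The effect can be reproduced without a parser by feeding the same chunks through the quoted check. The sketch below is illustrative only: the class TextChunkDemo and both method names are made up, and the buffered variant is just one possible approach (collect the whole text run, then test it once), not necessarily the fix the project should adopt.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of Digester.
class TextChunkDemo {

    // Mimics the quoted check: each chunk is tested on its own, so a
    // whitespace-only chunk is dropped even when it sits between text.
    static String perChunkCheck(List<String> chunks) {
        StringBuilder text = new StringBuilder();
        for (String str : chunks) {
            if (str.trim().length() > 0) {
                text.append(str);
            }
        }
        return text.toString();
    }

    // Assumed alternative: buffer all chunks of one text run and apply the
    // whitespace test once to the whole run.
    static String bufferedCheck(List<String> chunks) {
        StringBuilder run = new StringBuilder();
        for (String str : chunks) {
            run.append(str);
        }
        String whole = run.toString();
        return whole.trim().length() > 0 ? whole : "";
    }

    public static void main(String[] args) {
        List<String> split = Arrays.asList("A", " ", "A");
        System.out.println(perChunkCheck(split));                 // prints AA
        System.out.println(perChunkCheck(Arrays.asList("A A")));  // prints A A
        System.out.println(bufferedCheck(split));                 // prints A A
    }
}
```

In a real handler the buffered run would have to be flushed whenever an element starts or ends, so that ignorable whitespace between elements is still discarded.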



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

