commons-issues mailing list archives

From "Nguyen Thanh Son Daniel (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DIGESTER-120) digesting xml content with NodeCreateRule swallows spaces.
Date Sat, 15 Mar 2008 14:08:25 GMT

    [ https://issues.apache.org/jira/browse/DIGESTER-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579044#action_12579044 ]

Nguyen Thanh Son Daniel commented on DIGESTER-120:
--------------------------------------------------

Oops.

Simon,

I forgot to mention something, and I was not very clear about it: to reproduce the problem, you must use entities in your XML.
The following file should help reproduce the problem:

<?xml version="1.0" encoding="UTF-8"?>
<top>
<body>&#65; &#65;</body>
</top>


> digesting xml content with NodeCreateRule swallows spaces.
> ----------------------------------------------------------
>
>                 Key: DIGESTER-120
>                 URL: https://issues.apache.org/jira/browse/DIGESTER-120
>             Project: Commons Digester
>          Issue Type: Bug
>    Affects Versions: 1.8
>         Environment: jdk 1.4.2_08, digester 1.8
>            Reporter: Nguyen Thanh Son Daniel
>         Attachments: digester-patch.txt
>
>
> I need to process an XML file that contains entities, e.g.:
> <?xml version="1.0" encoding="UTF-8"?>
> <top>
> <body>&#65; &#65;</body>
> </top>
> I'm using Digester as follows:
> Digester digester = new Digester ();
> digester.addRule ("top", new ObjectCreateRule (MyContent.class));
> digester.addRule ("top/body", new NodeCreateRule ());
> digester.addSetNext ("top/body", "setBody");
> then
> ...
> digester.parse (file);
> The MyContent class transforms the node into text as follows:
> public class MyContent
> {
>  public void setBody (Element node)
>  {
>   String content = serializeNode (node);
>   System.out.println (content);
>  }
>  ...
> }
> The content displayed in this case is: <body>AA</body>
> If the body were encoded in the XML file as <top><body>A A</body></top>, the content would be displayed correctly as:
> <body>A A</body>
> Looking at the NodeCreateRule.NodeBuilder.characters() implementation, the following code causes the problem:
> String str = new String(ch, start, length);
> if (str.trim().length() > 0) {
>  top.appendChild(doc.createTextNode(str));
> }
> When entities are used, the characters() method is called three times in the first case, for 'A', ' ' and 'A', and the whitespace-only call is discarded by the trim() check. In the second case, it is called once with 'A A'.
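
The effect described above can be reproduced without a parser. The following is only a minimal sketch, not the Digester code: ChunkedCharactersDemo, TrimmingHandler and feed() are hypothetical names, the handler collects surviving chunks in a list instead of appending DOM text nodes, and the chunking ('A', ' ', 'A') is fed in by hand to mimic what the report says the parser delivers. It uses current Java syntax rather than JDK 1.4.

```java
import org.xml.sax.helpers.DefaultHandler;

import java.util.ArrayList;
import java.util.List;

public class ChunkedCharactersDemo {

    /**
     * Hypothetical stand-in for the check quoted in the report:
     * a chunk is kept only if it is not entirely whitespace.
     */
    static class TrimmingHandler extends DefaultHandler {
        final List<String> textNodes = new ArrayList<>();

        @Override
        public void characters(char[] ch, int start, int length) {
            String str = new String(ch, start, length);
            if (str.trim().length() > 0) {
                // The real code appends a DOM text node here; a list is enough for the demo.
                textNodes.add(str);
            }
        }
    }

    /** Delivers each chunk as a separate characters() call and joins whatever survives. */
    static String feed(String... chunks) {
        TrimmingHandler handler = new TrimmingHandler();
        for (String chunk : chunks) {
            char[] ch = chunk.toCharArray();
            handler.characters(ch, 0, ch.length);
        }
        return String.join("", handler.textNodes);
    }

    public static void main(String[] args) {
        // Three calls, as reported for <body>&#65; &#65;</body>: the lone space is dropped.
        System.out.println(feed("A", " ", "A")); // prints AA
        // One call, as reported for <body>A A</body>: the space is inside a non-blank chunk.
        System.out.println(feed("A A"));         // prints A A
    }
}
```

The sketch suggests that the space is lost only because it arrives as a chunk of its own; the same character survives when it is delivered together with its neighbours.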

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

