cocoon-users mailing list archives

From Karl Øie <k...@gan.no>
Subject RE: urgent encoding problem...
Date Wed, 12 Dec 2001 12:14:02 GMT
It's not that people don't bother to answer you, but a lot of people here don't have any experience
with Shift_JIS encoding. As a Norwegian I have the same problem: non-Scandinavians can hardly
reproduce problems involving Scandinavian characters.

When it comes to your string problem, there can be several sources. First of all, you can test
the DOM by feeding it a string that has been created with a declared encoding, like:

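new String( bytes ); decodes with the platform default encoding, will not work on all jdks/platforms
new String( bytes, "UTF-16" ); decodes with the declared encoding, will work on most sane jdks/platforms
(where bytes holds the encoded text, for example "æ e trønder æ å")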
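
A minimal sketch of creating strings with a declared encoding, assuming a current JDK where
Shift_JIS is available as a charset. The class and method names are only illustrative. Note that
the String constructor that takes an encoding expects a byte array, so the sketch encodes and
decodes through bytes:

```java
import java.nio.charset.Charset;

public class DeclaredEncodingDemo {
    /** Encodes text to bytes using an explicitly named charset. */
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    /** Decodes bytes to a String using an explicitly named charset. */
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file pure ASCII, so the literals
        // do not depend on the encoding the compiler assumes for the file.
        String norwegian = "\u00e6 e tr\u00f8nder \u00e6 \u00e5"; // the Norwegian sample phrase
        String japanese = "\u3042\u306a\u305f";                   // three hiragana characters

        // Round trips with a declared encoding on both sides.
        System.out.println(norwegian.equals(decode(encode(norwegian, "UTF-16"), "UTF-16")));
        System.out.println(japanese.equals(decode(encode(japanese, "Shift_JIS"), "Shift_JIS")));

        // Each of these hiragana characters takes two bytes in Shift_JIS.
        System.out.println(encode(japanese, "Shift_JIS").length);
    }
}
```

Both round trips should print true, and the last line should print 6. Calling getBytes() or
new String(bytes) without a charset falls back to the platform default, which is where the
platform-dependent behaviour comes from.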

Try to create all your strings with Shift_JIS forced, just in case. Second, find out whether
StringWriter supports Shift_JIS. As far as I know, StringWriter works on chars and strings and
should support Shift_JIS if all strings fed to it were created as Shift_JIS. Lastly, there are
some problems regarding the PrintWriter that the servlet API uses to return serialized content
to the browser. Try to serialize to a file instead of to the browser; if the file accepts
Shift_JIS, then you should look up fixes/gotchas regarding Shift_JIS and JSP, as Cocoon uses
the JSP mechanism to send the response back to the user.
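
The file test could look roughly like this. It is only a sketch, assuming a current JDK
(java.nio.file); the class name, method name and temp-file prefix are made up for illustration:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class SerializeToFileCheck {
    static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    /** Writes the text to a temp file as Shift_JIS bytes and reads it back. */
    static String roundTripThroughFile(String text) {
        try {
            Path file = Files.createTempFile("sjis-check", ".xml");
            try {
                // Encode explicitly instead of relying on the platform default.
                Files.write(file, text.getBytes(SHIFT_JIS));
                return new String(Files.readAllBytes(file), SHIFT_JIS);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<label>\u3042\u306a\u305f\u306ePC</label>";
        // true means the characters survived the trip through the file.
        System.out.println(xml.equals(roundTripThroughFile(xml)));
    }
}
```

If the text survives the file but not the browser, the strings themselves are fine and the
problem sits in the response path.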

The best place to start looking is the Xalan FAQs and docs, because if you use the XML or HTML
serializer, it's using the Xalan implementations.
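
To see how the serializer's output encoding affects the result, here is a sketch that uses the
JAXP Transformer bundled with current JDKs (which is derived from Xalan) rather than Cocoon's own
serializer setup, so treat it as an approximation. The class and method names are made up. It
serializes the same text once as US-ASCII and once as Shift_JIS and prints the byte counts:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public class SerializerEncodingSketch {
    /** Builds a one-element document and serializes it to bytes in the given encoding. */
    static byte[] serialize(String text, String encoding) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("label")).setTextContent(text);
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses serialized bytes again; the parser takes the encoding from the XML declaration. */
    static String parseBack(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u3042\u306a\u305f\u306ePC";
        byte[] ascii = serialize(text, "US-ASCII");
        byte[] sjis = serialize(text, "Shift_JIS");
        // US-ASCII cannot represent the characters, so the serializer has to escape them.
        System.out.println(new String(ascii, StandardCharsets.US_ASCII));
        System.out.println("US-ASCII bytes:  " + ascii.length);
        System.out.println("Shift_JIS bytes: " + sjis.length);
    }
}
```

When the output encoding cannot represent a character, the serializer has to fall back to
numeric character references, which would explain both the numbers in the page source and the
extra bytes. Either form parses back to the same text.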

Regards, Karl Øie



  -----Original Message-----
  From: Arun.N [mailto:arun.n@eximsoft.com]
  Sent: 12 December 2001 12:48
  To: cocoon-users@xml.apache.org
  Subject: Re: urgent encoding problem...


  Hi all,
  First of all I thank everybody for not bothering to reply. I corrected the second and the
  third problem. If the list is still alive and anyone cares to give me a solution for the
  first problem, please do reply.
  Thanks,
  Arun.N
   
    ----- Original Message ----- 
    From: Arun.N 
    To: cocoon-users@xml.apache.org 
    Sent: Tuesday, December 11, 2001 1:31 PM
    Subject: urgent encoding problem...


    Hi all,
    I have some problems with XSP pages and encoding. When I try to display Shift_JIS encoded
    characters, they are not displayed properly. When I hard-code the Japanese characters it
    works properly, for example in this XSP page:
     
    <?xml version="1.0" encoding="Shift_JIS"?>
    <?cocoon-process type="xsp"?>
    <?cocoon-process type="xslt"?>
    <?xml-stylesheet href="xsl/viewMail-to-html.xsl" type="text/xsl" ?>
    <xsp:page
      language="java"
      encoding="Shift_JIS"
      xmlns:xsp="http://www.apache.org/1999/XSP/Core"
      xmlns:request="http://www.apache.org/1999/XSP/Request"
      xmlns:util="http://www.apache.org/1999/XSP/Util" 
     >
    <page>
       <title>melpo View Mail</title>
      <body>
            <label>‚ ‚È‚½‚ÌPC‚Ì’†‚̃[ƒ‹ƒNƒ‰ƒCƒAƒ“ƒg‚ªÄŠJ‚³‚ê‚Ü‚µ‚½B
</label>
        </body>
    </xsp:page>
     
    The displayed HTML works fine and the characters are shown properly, but the source of
    the HTML shows:
    <html>
        <body>
        &#12354;&#12394;&#12383;&#12398;PC&#12398;&#20013;&#12398;&#12513;&#12540;&#12523;&#12463;&#12521;&#12452;&#12450;&#12531;&#12488;&#12364;&#20877;&#38283;&#12373;&#12428;&#12414;&#12375;&#12383;&#12290;

        </body>
    </html>
    <!-- This page was served in 2278 milliseconds by Cocoon 1.8.2 -->
     
    But why are the characters converted into numbers? The problem here is that this consumes
    more bytes, so if the device limits the size of the page source, it is a problem. If the
    characters were left as they are, the source page would consume fewer bytes.
     
     
    The second problem: when I dynamically include XML in my XSP page it does not work, but
    the same string works fine when hard-coded in the XSP page.
    <?xml version="1.0" encoding="Shift_JIS"?>
    <?cocoon-process type="xsp"?>
    <?cocoon-process type="xslt"?>
    <?xml-stylesheet href="xsl/viewMail-to-html.xsl" type="text/xsl" ?>
    <xsp:page
      language="java"
      encoding="Shift_JIS"
      xmlns:xsp="http://www.apache.org/1999/XSP/Core"
      xmlns:request="http://www.apache.org/1999/XSP/Request"
      xmlns:util="http://www.apache.org/1999/XSP/Util" 
     >
    <page>
       <title>melpo View Mail</title>
      <body>
            <xsp:logic>
                 String xml = (String) request.getAttribute(xml);
                <xsp:content>
                   <util:include-expr><util:expr>xml</util:expr></util:include-expr>
 // this will append an xml string like <label>‚ ‚È‚½‚ÌPC‚Ì’†‚̃[J‚³‚ê‚Ü‚µ‚½B
</label>          
                </xsp:content>
            </xsp:logic>
        </body>
    </xsp:page>
     
    I am getting this error:
     
    org.xml.sax.SAXException: An invalid XML character (Unicode: 0x13) was found in the element
content of the document. [FATAL ERROR] [File: "null" Line: 1 Column: 109] (nested exception:
org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x13) was found in the element
content of the document.)
    But the same string works fine if hard-coded, because when I hard-code it, the XSP page,
    when compiled, converts all the characters to those numbers. Whenever the string is
    dynamically included, it does not work.
     
    The third problem: when I load a string into a DOM and then get the string back, the
    encoding information is gone. The characters displayed are ???????????
          String fullXml = "<?xml version=\"1.0\" encoding=\"Shift_JIS\"?><Response><Message>"
              + "Mail Client in your PC has been ログアウト Restarted エキサイト : 翻訳＞利用規約 xxx "
              + "</Message></Response>";
     
          DOMParser parser = new DOMParser();
          InputStream is = new ByteArrayInputStream(fullXml.getBytes());
          InputSource isource=new InputSource(is);
          parser.parse(isource);
          Document xmlDoc= parser.getDocument();       //created an dom
     ------------ doing some manipulation ------------------
          OutputFormat    format  = new OutputFormat( xmlDoc );   //Serialize DOM
          StringWriter  stringOut = new StringWriter();           //Writer will be a String
          XMLSerializer    serial = new XMLSerializer( stringOut, format );
          serial.asDOMSerializer();                               // As a DOM Serializer
          serial.serialize( xmlDoc.getDocumentElement() );
          String returnXML = stringOut.toString();  // got back the xml as String.
     
    Now if I display the string returnXML, all the Japanese characters are gone. The output
    is only "???????????"
     
    Can any of you please give a solution for these problems, as it is very urgent for me?
    I have been trying to solve these issues for the past 2 days and have searched the mail
    archives, but I was not able to find a solution.

    Thanks in advance
     
    regards,
    Arun.N,
