santuario-dev mailing list archives

From "jason marshall" <jdmarsh...@gmail.com>
Subject Re: Signed document can be corrupted in certain circumstances
Date Fri, 23 Feb 2007 18:18:09 GMT
Raul,

I'm not sure I can be as helpful as Yvan, having a more modest and
polite test suite, but I have a bit of Unicode and specifically UTF-8
en/decoding experience, and I might be able to make a few
observations.  I'm curious about your comments about how some Unicode
characters are not being handled properly.  Which ones are you having
trouble with?  The new 32 bit characters, 0, something else?

You say in your comments that the problem is fixed in HEAD, but I'm
looking at HEAD

http://svn.apache.org/viewvc/xml/security/trunk/src/org/apache/xml/security/c14n/implementations/CanonicalizerBase.java?view=markup

And the code still seems to be using 8th bit checks throughout.
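
To illustrate what an 8-bit assumption does to a character such as
U+263A, here is a minimal sketch.  It is not the CanonicalizerBase
code: the class and method names are made up for the example, and it
assumes the damage comes from handing a char straight to
OutputStream.write(int), which keeps only the low 8 bits.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; nothing here is taken from the library.
class TruncationDemo {
    // Assumed failure mode: each char is written as a single byte,
    // so everything above the low 8 bits is silently dropped.
    static byte[] writeLowByteOnly(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            out.write(s.charAt(i)); // write(int) keeps only the low byte
        }
        return out.toByteArray();
    }

    // Lower-case hex dump, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String smiley = "\u263A";
        // prints 3a
        System.out.println(toHex(writeLowByteOnly(smiley)));
        // prints e298ba
        System.out.println(toHex(smiley.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The first line matches the ":" (3a) in the corrupted output quoted
below, and the second is the three-byte sequence a UTF-8 encoder
should produce.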

I think you would be much better off removing the special casing you
added to speed up this class.  Now maybe it's because I'm not encoding
too many really big documents, or maybe it's because I'm fixated on
MessageDigest issues, but I'm not seeing this as a critical
performance problem to begin with.  However even if it were, this is
not a great way to achieve your goal.

If you want to make this code go faster, your best bet is to split
up the methods in UTFHelpper so that Hotspot can inline the fast path
into the callers.  That'll get you the same effect with saner
code.  For example:

    final static void writeCharToUtf8(final char c, final OutputStream out)
        throws IOException
    {
        if (c < 0x80) {
            out.write(c);
        } else {
            writeMultiByteCharToUtf8(c, out);
        }
    }

    final static protected void writeMultiByteCharToUtf8(final char c, final OutputStream out)
        throws IOException
    {
        if ((c >= 0xD800 && c <= 0xDBFF) || (c >= 0xDC00 && c <= 0xDFFF)) {
            // No surrogates in Sun Java
            ...


I'm pretty sure that even the 1.3 Hotspot will be happy with this
code, but I haven't tested it (I'm having some trouble building the
code from the source release, and work doesn't allow svn access
through the firewall, for various reasons, a couple of which are
understandable).
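
Spelled out as a self-contained sketch, the split might look like the
following.  The class name Utf8SplitSketch and the encode helper are
hypothetical, and replacing a surrogate with '?' is only a placeholder
assumption, not necessarily what the library does; supplementary
characters would need surrogate-pair handling that is left out here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of the fast-path/slow-path split.
final class Utf8SplitSketch {
    // Small method holding only the ASCII fast path, so the JIT has a
    // good chance of inlining it into callers.
    static void writeCharToUtf8(final char c, final OutputStream out) throws IOException {
        if (c < 0x80) {
            out.write(c);
        } else {
            writeMultiByteCharToUtf8(c, out);
        }
    }

    // Slow path: two- and three-byte sequences for BMP characters.
    static void writeMultiByteCharToUtf8(final char c, final OutputStream out) throws IOException {
        if (c >= 0xD800 && c <= 0xDFFF) {
            // Assumption: surrogates are not paired here, just replaced.
            out.write('?');
        } else if (c < 0x800) {
            out.write(0xC0 | (c >> 6));
            out.write(0x80 | (c & 0x3F));
        } else {
            out.write(0xE0 | (c >> 12));
            out.write(0x80 | ((c >> 6) & 0x3F));
            out.write(0x80 | (c & 0x3F));
        }
    }

    // Convenience wrapper for testing; encodes a whole string in memory.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            for (int i = 0; i < s.length(); i++) {
                writeCharToUtf8(s.charAt(i), out);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory stream
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String s = "h\u00e9h\u00e9 \u263A";
        // Compare against the JDK's own encoder; prints true
        System.out.println(Arrays.equals(encode(s), s.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Whether Hotspot really inlines the small method is something to
measure rather than assume.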


Good luck, and keep us posted on your ETA for a 1.4.1 release.

Thanks,
Jason


On 2/13/07, Hess Yvan <Yvan.Hess@imtf.ch> wrote:
>
> Hi Raul,
>
> Let me know when you have a pre-release of version 1.4.1, or send it to me by email;
> I will then run all my JUnit test cases and give you feedback. We use a lot of the functionality
> of the XML encryption and signature syntax, and for this reason we have interesting test cases
> that can help you in the release process of the XML security library. I don't have much time
> to follow what happens with the project, but as I said in a previous email, I can try to run
> my test cases before you plan to release a new version, to give you a second opinion on
> the robustness of the library: 4 eyes are better than 2 :-)
>
> Regards. Yvan
>
>
> -----Original Message-----
> From: raul.benito.garcia@gmail.com [mailto:raul.benito.garcia@gmail.com] On Behalf Of Raul Benito
> Sent: Tuesday, 13 February 2007 12:18
> To: security-dev@xml.apache.org
> Subject: Re: Signed document can be corrupted in certain circumstances
>
> Hi Hess,
>
> It is my fault; we have a critical bug,
> http://issues.apache.org/bugzilla/show_bug.cgi?id=41462 . The problem is that I was thinking
> in 8 bits instead of 32 bits. It is now largely fixed in HEAD, but we are having a problem with
> some parts of Unicode. I think I will do a 1.4.1 release with fixes for this bug and several others.
> And we have to reconsider my release strategy, as it seems that few people, if any,
> test the release candidates :(.
>
>
> On 2/13/07, Hess Yvan <Yvan.Hess@imtf.ch> wrote:
> >
> >
> > Hi everybody,
> >
> > I think I found a critical bug in XML security V1.4.0 (Java). An XML
> > document signed with Apache XML security can be corrupted in certain
> > circumstances.
> >
> > Here are the start conditions and the results I have:
> >
> > 1. XML document encoded in "UTF-8" containing the Unicode character "\u263A"
> > 2. The document is signed with Apache XML security ---> OK
> > 3. The document is verified with Apache XML security ---> OK
> > 4. The document is verified with IBM toolkit (XSS4J) ---> NOT OK
> >
> > Doing some investigation, I think I isolated the problem. It seems
> > that the error is due to the Canonicalizer class. This class doesn't
> > correctly handle UTF-8 characters encoded on three bytes. Here is a
> > test I did to confirm the problem:
> >
> >       // XML character \u263A => &#x0263A; => smiley
> >       String xmlString = "<document>Humour document (héhé \u263A)</document>";
> >       byte[] xml = xmlString.getBytes("UTF-8");
> >       String xmlHex = HexadecimalConvertor.toHex(xml);
> >
> >       System.out.println(xmlString);
> >       System.out.println("Hexadecimal value: " + xmlHex);
> >
> >       // Get the DOM document
> >       Document document = new XMLParser().parseXMLDocument(new ByteArrayInputStream(xml));
> >
> >       // Canonical
> >       byte[] canonicalXML = Canonicalizer.getInstance(Canonicalizer.ALGO_ID_C14N_WITH_COMMENTS)
> >               .canonicalizeSubtree(document);
> >       String canonicalXMLHex = HexadecimalConvertor.toHex(canonicalXML);
> >       String canonicalXMLString = new String(canonicalXML, "UTF-8");
> >
> >       System.out.println("Hexadecimal value: " + canonicalXMLHex);
> >       System.out.println(canonicalXMLString);
> >
> > and here is the result
> >
> > <document>Humour document (héhé ☺)</document>
> > Hexadecimal value:
> > 3c646f63756d656e743e48756d6f757220646f63756d656e74202868c3a968c3a920 e298ba 293c2f646f63756d656e743e
> > Hexadecimal value:
> > 3c646f63756d656e743e48756d6f757220646f63756d656e74202868c3a968c3a920 3a     293c2f646f63756d656e743e
> > <document>Humour document (héhé :)</document>
> >
> > The Canonicalizer class correctly handles the character "é" (E9),
> > converted to UTF-8 as "c3a9". BUT the Unicode character "☺" (263A) is
> > converted to ":" (3a) when it should be (e298ba); this is wrong. It
> > seems that the Canonicalizer class doesn't correctly handle UTF-8
> > characters encoded on three bytes!
> >
> > Does anybody have an idea? Can somebody help me? This occurs in the
> > context of our application, and we now have a lot of problems because of it.
> >
> > Thanks in advance.
> >
> > Regards. Yvan Hess
> >
> >
>
>
> --
> http://r-bg.com
>


-- 
- Jason