santuario-dev mailing list archives

From "jason marshall" <jdmarsh...@gmail.com>
Subject Re: Signed document can be corrupted in certain circumstances
Date Tue, 27 Feb 2007 23:35:19 GMT
On 2/27/07, Raul Benito <raul@apache.org> wrote:
> Hi Jason,
>
> Sorry for the delay.
> See my comments inline
>
> On 2/23/07, jason marshall <jdmarshall@gmail.com> wrote:
> > Raul,
> >
> > I'm not sure I can be as helpful as Yvan, having a more modest and
> > polite test suite, but I have a bit of Unicode and specifically UTF-8
> > en/decoding experience, and I might be able to make a few
> > observations.  I'm curious about your comments about how some Unicode
> > characters are not being handled properly.  Which ones are you having
> > trouble with?  The new 32 bit characters, 0, something else?
>
> Great, your help is really appreciated. I have just created a test that checks
> my encoding against the String.getBytes("UTF-8") implementation for the
> first 2**16 chars, and they are all equal except for character 0xd8ff.
>

Okay, you're a bit outside of my realm here, but have you considered
the possibility that the error is Sun's, and not yours?  0xd800-0xdbff
don't appear to be characters, per se; they're high surrogates, the
first 16-bit half of a UTF-16 pair that becomes a 4-byte UTF-8
sequence.  You might need to look to a third party for confirmation of
the right encoding for that character (one page suggests
0xED 0xA3 0xBF is correct).
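
For what it's worth, here is a minimal sketch of where 0xED 0xA3 0xBF
comes from.  It is not the library's code; the class and method names
are made up for illustration.  It pushes 0xd8ff through the ordinary
3-byte UTF-8 bit layout with no surrogate handling, and prints whatever
the JDK produces for the same lone char so the two can be compared.  My
understanding is that the JDK substitutes a replacement byte for an
unpaired surrogate rather than encoding it, but treat that as an
assumption to verify.

```java
class SurrogateEncodingSketch {

    // Naive UTF-8 encoding of one 16-bit char. Surrogates get no special
    // treatment here; they fall through to the 3-byte branch.
    static byte[] encodeNaive(char c) {
        if (c < 0x80) {
            return new byte[] { (byte) c };
        }
        if (c < 0x800) {
            return new byte[] {
                (byte) (0xC0 | (c >> 6)),
                (byte) (0x80 | (c & 0x3F)) };
        }
        return new byte[] {
            (byte) (0xE0 | (c >> 12)),
            (byte) (0x80 | ((c >> 6) & 0x3F)),
            (byte) (0x80 | (c & 0x3F)) };
    }

    // Formats bytes as "0xED 0xA3 0xBF" for easy comparison.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        char lone = (char) 0xD8FF;
        System.out.println("naive: " + hex(encodeNaive(lone)));
        // Whatever the running JDK emits for an unpaired surrogate.
        System.out.println("JDK:   " + hex(String.valueOf(lone).getBytes("UTF-8")));
    }
}
```

If the two lines differ only for chars in the surrogate range, that
would point at a difference in surrogate policy rather than a bug in
the bit-twiddling.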

> > You say in your comments that the problem is fixed in HEAD, but I'm
> > looking at HEAD
> >
> >
> http://svn.apache.org/viewvc/xml/security/trunk/src/org/apache/xml/security/c14n/implementations/CanonicalizerBase.java?view=markup
> >
> > And the code still seems to be using 8th bit checks throughout.
>
>  Can you point me where do you think is incorrect? or give me a test case? I
> will really appreciated it.

Just search for "0x80".  On the revision I reviewed, there were a
number of them in that file, and almost all of them were wrapped
around a call to UtfHelpper.
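
To make the alternative concrete, here is a rough sketch of keeping the
8th-bit check inside a small helper instead of repeating it at every
call site.  All names below are hypothetical; this is not the actual
CanonicalizerBase or UtfHelpper code, and it deliberately ignores
surrogate pairs.  The idea is only that a tiny fast-path method is a
good inlining candidate, so callers can stay free of "0x80" tests.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class Utf8WriterSketch {

    // Fast path: small enough that a JIT can reasonably inline it.
    static void writeChar(char c, OutputStream out) throws IOException {
        if (c < 0x80) {
            out.write(c);
            return;
        }
        writeMultiByte(c, out);
    }

    // Slow path: 2- and 3-byte sequences (no surrogate-pair handling).
    private static void writeMultiByte(char c, OutputStream out) throws IOException {
        if (c < 0x800) {
            out.write(0xC0 | (c >> 6));
        } else {
            out.write(0xE0 | (c >> 12));
            out.write(0x80 | ((c >> 6) & 0x3F));
        }
        out.write(0x80 | (c & 0x3F));
    }

    // Convenience wrapper; callers never test the 8th bit themselves.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            for (int i = 0; i < s.length(); i++) {
                writeChar(s.charAt(i), out);
            }
        } catch (IOException e) {
            // ByteArrayOutputStream does not throw; kept for completeness.
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }
}
```

Whether a given Hotspot version actually inlines writeChar is something
to measure rather than assume.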

>
> > If you want to make this code go faster, your better bet is to split
> > up the methods in UTFHelpper so that Hotspot can inline the fast-path
> > into the callers.  That'll get you the same effect with saner
> > code.  For example:
> >
>
> Great idea, I will try to do it.
>
> > I'm pretty sure that even the 1.3 Hotspot will be happy with this
> > code, but I haven't tested it (I'm having some trouble building the
> > code from the source release, and work doesn't allow svn access
> > through the firewall, for various reasons, a couple of which are
> > understandable).
>
> We should create some nightly build & publish mechanism. I will try to see
> how other projects handle this.
>
>
> > Good luck, and keep us posted on your ETA for a 1.4.1 release.
>
>
> Thanks,
> I will try to see how many bug reports we get (the MS-Office bug looks
> promising)
>
> Regards,
>
> Raul
>
> > Thanks,
> > Jason
> >
> >

-- 
- Jason
