santuario-dev mailing list archives

From Raul Benito <r...@apache.org>
Subject Re: Status of == vs equals() RESULTS
Date Tue, 10 Aug 2010 08:46:19 GMT
As the original author of the change from equals() to == for interned
namespaces, I can say that originally, on JDK 1.4 and 1.5 and with my data
(verification of a SAML/Liberty AuthnReq in a multi-threaded test, using the
old Juice JCE provider), the change was 10% to 20% faster.
SAML is one of the real-world examples of signing, and it has URLs with
common prefixes and URLs of the same length.
The Juice provider also helps to get rid of the signing/digest cost (a
verification is two c14n passes, one of the signed part and one of the
signature), but I think a c14n alone is a good way to measure it.
Also take into account that the == vs equals debate is more a memory
workload and cache problem: if we have to iterate over every char again and
again just to find that the strings are not equal, we thrash the cache.
(That's why I used the multi-threaded test, to simulate a server decoding
requests with more or less the same code, but at different times and under
different "workloads".)
Nevertheless, if you have tested with a more modern JRE and .equals behaves
better there, just go ahead and kiss the == goodbye.
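
For readers following the thread, a minimal sketch of why == is only safe on interned strings. The class and method names are hypothetical, the XML Signature namespace is used purely as sample data, and `new String(...)` stands in for a parser that returns a non-interned copy; this is not the Santuario code.

```java
// Illustrative sketch only; class and method names are hypothetical.
final class InternCompareSketch {
    // The XML Signature namespace, used here purely as sample data.
    static final String DSIG_NS = "http://www.w3.org/2000/09/xmldsig#";

    /** Identity test: one reference comparison, valid only if both strings are interned. */
    static boolean sameByIdentity(String a, String b) {
        return a == b;
    }

    /** Value test: always correct, but may have to walk the characters. */
    static boolean sameByValue(String a, String b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        // new String(...) simulates a parser that hands back a non-interned copy.
        String fromParser = new String(DSIG_NS);

        System.out.println(sameByIdentity(DSIG_NS, fromParser));          // false
        System.out.println(sameByIdentity(DSIG_NS, fromParser.intern())); // true
        System.out.println(sameByValue(DSIG_NS, fromParser));             // true
    }
}
```

The first line of output is the loophole: the identity test silently reports a mismatch whenever one side was not interned.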

Clive, using .hashCode for strings is not a big speed-up in this case, as
it has to go through all the chars of the string, thrashing the cache again,
and multiply and add each one into an integer, instead of failing at the
first different char or just reducing the result to a boolean.
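
A minimal sketch of the hash-guarded comparison under discussion, with hypothetical class and method names and two SAML namespace URIs as sample data. The cost described above is in the first `hashCode()` call on each string, which reads every character; in the Sun/OpenJDK implementation the result is then cached in the String object, so whether the guard pays off depends on how often the same instances are compared.

```java
// Illustrative sketch only; class and method names are hypothetical.
final class HashGuardSketch {
    /**
     * Hash-guarded equality. The first hashCode() call on a String walks all of
     * its characters; equals() is only reached when the two hashes agree.
     */
    static boolean guardedEquals(String s1, String s2) {
        return s1.hashCode() == s2.hashCode() && s1.equals(s2);
    }

    public static void main(String[] args) {
        // Same length, shared prefix: the case where a plain equals() scans furthest.
        String a = "urn:oasis:names:tc:SAML:1.0:assertion";
        String b = "urn:oasis:names:tc:SAML:2.0:assertion";

        System.out.println(guardedEquals(a, b));             // false
        System.out.println(guardedEquals(a, new String(a))); // true
    }
}
```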

Regards,


On Tue, Aug 10, 2010 at 2:37 AM, Clive Brettingham-Moore <
xmlsec@brettingham-moore.net> wrote:

> Have to agree .equals is the way to go, since correctness of == is too
> reliant on what must be considered implementation optimisations in the
> parser.
>
> Benchmarking in JVM is notoriously difficult, but it does look like
> there is no gross difference, which should kill any objections to doing
> it correctly.
>
> Since I recently spent far too long researching this for an unrelated
> problem I'll add my 10c to the detailed discussion.
>
> On 10/08/10 01:23, Chad La Joie wrote:
>
> > Not necessarily, there are a number of not equal checks in there that
> > should, in theory, perform better if you only use ==.  In such a
> > case, the use of != will just be a single check while !equals() will
> > result in a char-by-char comparison.
>
> Actually, the next thing String.equals tests is length equality - so
> character comparison will only be reached if the strings are the same
> length.
>
> Since the char by char comparison returns on the first mismatch, then
> only same length strings with shared prefixes will show the expected
> slowness. (namespace URIs are likely to share prefixes, but I think are
> not particularly likely to be the same length, unless actually equal)...
> thus String.equals is only likely to be slow when comparing long
> strings that are distinct objects but equal in content (so intern or
> alternative string pooling techniques needed for == also benefit
> .equals, without all the nasty loopholes: even if .equals is
> occasionally slow, at least it is always right).
>
> In circumstances where repeated tests hit many length and prefix
> matches, adding a hash code inequality test ((s1.hashCode()==
> s2.hashCode())&&s1.equals(s2)) could prevent practically all
> char-by-char checks for !equal cases (but if the same strings are never
> reused, the hash code calculation could be an issue; NB intern
> results in hash calculation for all strings anyway)... pooling is still
> needed to speed up matches for equality though.
>
> Re VM options I would feel -server is definitely the right test bed,
> both because of the more aggressive JIT, and also because the code is
> likely to see heaviest real world cases in -server VMs.
>
>
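
The order of checks described in the quoted message (identity, then length, then characters up to the first mismatch) can be sketched as below. This is a simplified re-implementation for illustration, not the JDK source; the class and method names are hypothetical, and the method counts character comparisons rather than returning a boolean.

```java
// Illustrative sketch only; not the JDK implementation of String.equals.
final class EqualsOrderSketch {
    /** Returns how many character pairs are compared before the outcome is known. */
    static int charsCompared(String a, String b) {
        if (a == b) {
            return 0; // same object: decided by identity alone
        }
        if (a.length() != b.length()) {
            return 0; // lengths differ: decided without reading any chars
        }
        int compared = 0;
        for (int i = 0; i < a.length(); i++) {
            compared++;
            if (a.charAt(i) != b.charAt(i)) {
                break; // first mismatch ends the scan
            }
        }
        return compared;
    }

    public static void main(String[] args) {
        // Shared prefix but different lengths: no characters are compared.
        System.out.println(charsCompared(
                "http://www.w3.org/2000/09/xmldsig#",
                "http://www.w3.org/2001/04/xmlenc#"));

        // Same length and shared prefix: the scan runs up to the differing version digit.
        System.out.println(charsCompared(
                "urn:oasis:names:tc:SAML:1.0:assertion",
                "urn:oasis:names:tc:SAML:2.0:assertion"));

        // Equal content in distinct objects: every character is compared.
        String s = "urn:oasis:names:tc:SAML:2.0:assertion";
        System.out.println(charsCompared(s, new String(s)));
    }
}
```

The last case is the one where pooling helps both approaches: once the two references are the same object, the identity check answers immediately.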
