From Filip Hanik - Dev Lists <devli...@hanik.com>
Subject Re: DTrace analysis of TC-Cluster Multicast
Date Sat, 25 Feb 2006 22:05:45 GMT
Cool Rainer, interesting numbers. Can you run the same numbers on the
new groupcom module?
I have done some optimizations there; for example, the name isn't
broadcast anymore, and I am about to remove the domain and use a default
one instead. Also, I agree, let's not translate the 4-byte IP and the
4-byte port when receiving them; instead, member comparisons can be done
at the byte array level, roughly as sketched below. Excellent suggestion.
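
As a rough illustration only (not the actual groupcom code; how the
address and port bytes are carried in the packet is just an assumption),
a byte-level member key could look something like this:

    import java.util.Arrays;

    public class RawMemberKey {
        // 4-byte address and port bytes exactly as they arrive in the packet
        private final byte[] host;
        private final byte[] port;

        public RawMemberKey(byte[] host, byte[] port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof RawMemberKey)) return false;
            RawMemberKey m = (RawMemberKey) other;
            // pure byte comparison, no InetAddress or String objects created
            return Arrays.equals(host, m.host) && Arrays.equals(port, m.port);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(host) * 31 + Arrays.hashCode(port);
        }
    }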

After that I won't spend much more time on this issue, since the
heartbeat doesn't really need to be optimized. It runs *very*
infrequently and is not a critical piece.

Instead I am gonna start reworking the data senders: there are way too
many synchronized calls locking down the sockets, and the code is
somewhat cluttered. After that I'll move on to serialization. The
current cluster module serializes too much; in the new "ha" module I
have already removed a lot of serialization and will continue stripping
it out. A rough sketch of where I want to take the senders follows.
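
Just a minimal sketch of the idea, not the current code (the class and
field names are made up): serialize the payload once, outside the lock,
and hold the per-socket monitor only for the actual write.

    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.Socket;

    // Hypothetical sender sketch: the payload is serialized once by the
    // caller, and the socket monitor is held only for the write itself.
    public class SimpleDataSender {
        private final Socket socket;
        private final Object socketLock = new Object();

        public SimpleDataSender(Socket socket) {
            this.socket = socket;
        }

        public void send(byte[] payload) throws IOException {
            synchronized (socketLock) {
                OutputStream out = socket.getOutputStream();
                out.write(payload);
                out.flush();
            }
        }
    }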

Filip

Rainer Jung wrote:
> Hi,
>
> I started to trace the TC cluster with DTrace and did a first simple
> examination of the mcast subcomponent. CPU usage and especially the
> elapsed times might be slightly larger than expected, because I used the
> more intrusive extended DTrace probes.
>
> All tests were done with JDK 1.6.0 build 72 on Solaris 10 SPARC, using a
> T2000 system (8 cores with 4 threads each).
>
> Method                          allocation    elapsed      cpu
> mcast/McastServiceImpl.send()        1328B       40ms    6.7ms
> mcast/McastServiceImpl.receive()  744-760B   23-510ms    3.5ms
>
> So this does not look too bad, but I inspected where the cpu time is
> spent and where the objects get allocated:
>
> 1) send cpu time, elapsed time and allocation, per send
>
>   cpu elapsed alloc
> 6.7ms  40.3ms 1328B  McastServiceImpl.send()
>
> 6.3ms  37.6ms 1296B    McastMember.getData()
>
> 1.3ms   7.6ms  136B       ... getName().getBytes();
> 0.9ms   5.3ms  120B       ... getDomain().getBytes();
> 3.6ms  21.2ms  880B       ... java.net.InetAddress.getByName(host)...
>
> It's interesting that the last three statements are responsible for 85%
> of the allocation and 90% of the cpu usage. This seems unnecessary:
> for sending the multicast heartbeat, the name, domain and host do not
> change, so they could easily be cached, along the lines of the sketch
> below.
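>
> A minimal sketch of such caching (encodeMember(), getHost() and
> getPort() are made-up placeholders for whatever the member class really
> uses):
>
>     private volatile byte[] cachedData;
>
>     public byte[] getData() throws java.net.UnknownHostException {
>         byte[] data = cachedData;
>         if (data == null) {
>             // name, domain and host never change for this member, so the
>             // encoded payload can be built once and reused for every heartbeat
>             data = encodeMember(getName().getBytes(),
>                                 getDomain().getBytes(),
>                                 java.net.InetAddress.getByName(getHost()).getAddress(),
>                                 getPort());
>             cachedData = data;  // benign race: at worst we encode twice
>         }
>         return data;
>     }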
>
> 2) receive cpu time, elapsed time and allocation, per receive
>
>   cpu  elapsed    alloc
> 3.5ms 23-510ms 744-760B McastServiceImpl.receive()
>
> 0.1ms  3-485ms       0B   socket.receive()
> 2.8ms   17.9ms     640B   McastMember.getMember()
>
> 1.4ms    8.1ms  160B        new String(named),
> 1.0ms    5.6ms  128B        new String(domaind),
> 0.1ms    0.6ms  112B        addressToString(addr),
>
> 0.3ms    2.7ms   16B      ... membership.memberAlive(m)  [Only 50%]
> 0.2ms    2.3ms   32B      ... membership.expire(timeToExpiration);
> ...
>
> The memberAlive calls are much faster in roughly 50% of the cases.
>
> The most interesting fact here is that the string construction takes
> about 70% of the total cpu time. I inspected why that is. For the next
> trace I had to increase the trace depth, so the elapsed times go up a
> little. The columns are:
>
> - ">" or "<": ">" means start of method, "<" end of method
> - stack depth relative to the receive method
> - cpu time
> - elapsed time
> - allocated bytes
> - class and method
>
> > 3     -      -    - java/lang/String.<init>
> > 4     -      -    - java/lang/String.<init>
> > 5     -      -    - java/lang/StringCoding.decode
> > 6     -      -    - java/lang/StringCoding.decode
> > 7     -      -    - java/lang/StringCoding.deref
> < 7 0.1ms  0.7ms    0 java/lang/StringCoding.deref
> > 7     -      -    - java/lang/StringCoding$StringDecoder.decode
> > 8     -      -    - java/nio/ByteBuffer.wrap
> < 8 0.1ms  0.5ms   48 java/nio/ByteBuffer.wrap
> > 8     -      -    - java/nio/CharBuffer.wrap
> < 8 0.1ms  0.5ms   48 java/nio/CharBuffer.wrap
> > 8     -      -    - java/nio/charset/CharsetDecoder.decode
> < 8 0.7ms  4.4ms    0 java/nio/charset/CharsetDecoder.decode
> < 7 1.1ms  7.8ms  160 java/lang/StringCoding$StringDecoder.decode
> < 6 1.2ms  9.1ms  160 java/lang/StringCoding.decode
> < 5 1.2ms  9.5ms  160 java/lang/StringCoding.decode
> < 4 1.3ms 10.1ms  160 java/lang/String.<init>
> < 3 1.3ms 10.2ms  160 java/lang/String.<init>
>
> So we can see that most of the cpu time is spent inside
>
> - java/nio/charset/CharsetDecoder.decode: 0.7ms
> - java/lang/StringCoding.deref: 0.1ms
> - java/nio/ByteBuffer.wrap: 0.1ms
> - java/nio/CharBuffer.wrap: 0.1ms
>
> Maybe it would help to use a more byte-oriented format in the heartbeat
> messages and to avoid converting from it. String conversion could be
> limited to the cases where we actually log at debug level, roughly as in
> the sketch below.
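>
> A small sketch of that idea (the class and field names are made up):
> keep the received bytes as-is and build the Strings lazily.
>
>     public class ReceivedMember {
>         private final byte[] nameBytes;    // kept exactly as received
>         private final byte[] domainBytes;
>         private String name;               // built lazily
>
>         public ReceivedMember(byte[] nameBytes, byte[] domainBytes) {
>             this.nameBytes = nameBytes;
>             this.domainBytes = domainBytes;
>         }
>
>         public byte[] getNameBytes() {
>             return nameBytes;              // use this for comparisons
>         }
>
>         public String getName() {
>             if (name == null) {
>                 // the decoding cost is only paid when the String is really
>                 // needed, e.g. for a debug log statement
>                 name = new String(nameBytes);
>             }
>             return name;
>         }
>     }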
>
> Cheers,
>
> Rainer
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

