cassandra-commits mailing list archives

From "David Allsopp (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (CASSANDRA-2850) Converting bytes to hex string is unnecessarily slow
Date Mon, 04 Jul 2011 19:47:21 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059589#comment-13059589
] 

David Allsopp edited comment on CASSANDRA-2850 at 7/4/11 7:46 PM:
------------------------------------------------------------------

I think you mean (bytes.remaining() * 2) not (bytes.remaining() / 2) - we need twice as many
chars as bytes.

Also, shouldn't byteToChar[] have length 16, not 256?
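
To make the two points above concrete, here is a minimal sketch. It is not the attached patch: the class name HexSketch, the table name NIBBLE_TO_CHAR, and the choice of lowercase digits are all assumptions for illustration. It allocates remaining() * 2 chars and indexes a 16-entry table by nibble. A 256-entry table could still make sense if it were indexed by the whole byte, but that would be a different design from the one sketched here.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not the attached patch: hex-encodes the remaining bytes of a buffer.
final class HexSketch {
    // One entry per 4-bit nibble value, so 16 entries are enough.
    private static final char[] NIBBLE_TO_CHAR = "0123456789abcdef".toCharArray();

    static String bytesToHex(ByteBuffer bytes) {
        int size = bytes.remaining();
        // Two hex characters per byte: allocate size * 2, not size / 2.
        char[] chars = new char[size * 2];
        int start = bytes.position();
        for (int i = 0; i < size; i++) {
            // Absolute get: the buffer's position is left untouched.
            int b = bytes.get(start + i) & 0xFF;
            chars[2 * i] = NIBBLE_TO_CHAR[b >>> 4];       // high nibble
            chars[2 * i + 1] = NIBBLE_TO_CHAR[b & 0x0F];  // low nibble
        }
        return new String(chars);
    }
}
```

With this sketch, a 4-byte buffer produces an 8-character string, and the buffer can be read again afterwards because its position has not moved.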

Not sure what string creation you are referring to?

I attach two further versions of bytesToHex (in a third benchmark class, BytesToHexBenchmark3.java).
Results are below (I've had to increase the number of repeats so the stats are significant!).

v3 uses 'normal' code and is another 20% faster for large values, and _another_ factor of
2 faster than v2, i.e. 7-10 times faster than the original.

v4 uses nasty reflection to avoid doing an arraycopy on the byte array - this avoids allocating
a large chunk of memory (all the previous solutions end up doing an arraycopy somewhere). This
is now 11-13 times faster than the original.
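
The attachment is not reproduced in this message, so the following is only a guess at the kind of reflection trick meant, not the v4 code itself. It assumes that some older JDKs have a package-private String(int offset, int count, char[] value) constructor that shares the array instead of copying it, and it wraps a char array rather than a byte array. The names StringWrapSketch and wrapCharArray are hypothetical. Where the constructor is missing or inaccessible, the sketch falls back to the ordinary copying constructor.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch; the constructor signature below is an assumption about older JDKs.
final class StringWrapSketch {
    private static final Constructor<String> SHARING_CTOR = findSharingConstructor();

    private static Constructor<String> findSharingConstructor() {
        try {
            // Assumed package-private String(int offset, int count, char[] value).
            Constructor<String> c =
                String.class.getDeclaredConstructor(int.class, int.class, char[].class);
            c.setAccessible(true);
            return c;
        } catch (Exception e) {
            // Not present or not accessible on this JVM: callers fall back to copying.
            return null;
        }
    }

    static String wrapCharArray(char[] chars) {
        if (SHARING_CTOR != null) {
            try {
                // Shares the array, so the caller must not modify it afterwards.
                return SHARING_CTOR.newInstance(0, chars.length, chars);
            } catch (Exception e) {
                // Fall through to the copying constructor.
            }
        }
        return new String(chars); // always correct, costs one array copy
    }
}
```

The fallback means the result is the same on every JVM; only the presence or absence of the extra copy differs.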

20M old: 1482
20M new: 360
20M  v2: 249
20M  v3: 203
20M  v4: 125
----
old: 2137
new: 859
 v2: 718
 v3: 203
 v4: 156
----
old: 2138
new: 843
 v2: 733
 v3: 188
 v4: 156
----



      was (Author: dallsopp):
    I think you mean (bytes.remaining() * 2) not (bytes.remaining() / 2) - we need twice as
many chars as bytes.

Also, shouldn't byteToChar[] have length 16, not 256?

Not sure what string creation you are referring to?

I attach 2 further versions of bytesToHex (as another benchmark class 3). Results are below
(I've had to increase the number of repeats so the stats are significant!).

v3 uses 'normal' code and is another 20% faster for large values, and _another_ factor of
2 faster than v2, i.e. 7-10 times faster than the original.

v4 uses nasty reflection to avoid doing an arraycopy on the byte array - this avoids a large
chunk of memory (all the previous solutions end up doing an arraycopy somewhere). This is
now 11-13 times faster than the original.

20M old: 1482
20M new: 360
20M  v2: 249
20M  v3: 203
20M  v4: 125
----
old: 2137
new: 859
 v2: 718
 v3: 203
 v4: 156
----
old: 2138
new: 843
 v2: 733
 v3: 188
 v4: 156
----


  
> Converting bytes to hex string is unnecessarily slow
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2850
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2850
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.6, 0.8.1
>            Reporter: David Allsopp
>            Priority: Minor
>             Fix For: 0.8.2
>
>         Attachments: 2850-v2.patch, BytesToHexBenchmark.java, BytesToHexBenchmark2.java,
>                      BytesToHexBenchmark3.java, cassandra-2850a.diff
>
>
> ByteBufferUtil.bytesToHex() is unnecessarily slow - it doesn't pre-size the StringBuilder
> (so several re-sizes will be needed behind the scenes) and it makes quite a few method calls
> per byte.
> (OK, this may be a premature optimisation, but I couldn't resist, and it's a small change)
> Will attach patch shortly that speeds it up by about x3, plus benchmarking test.
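
As an illustration of the pre-sizing point in the issue description quoted above, here is a minimal sketch. It is not the original ByteBufferUtil code or the attached patch; the class name PresizeSketch and the use of a plain byte[] are assumptions. Because the output length is known to be two characters per byte, the StringBuilder can be given that capacity up front and never has to grow.

```java
// Hypothetical illustration, not the original ByteBufferUtil code.
final class PresizeSketch {
    static String bytesToHex(byte[] bytes) {
        // Capacity known up front: two characters per byte, so no internal re-sizes.
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte raw : bytes) {
            int b = raw & 0xFF; // treat the byte as unsigned
            sb.append(Character.forDigit(b >>> 4, 16));   // high nibble
            sb.append(Character.forDigit(b & 0x0F, 16));  // low nibble
        }
        return sb.toString();
    }
}
```

This still makes method calls per byte, so it only addresses the re-sizing half of the complaint.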

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
