cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-2850) Converting bytes to hex string is unnecessarily slow
Date Mon, 04 Jul 2011 12:27:22 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2850:
----------------------------------------

    Attachment: 2850-v2.patch

Attaching a v2 patch that avoids creating a String object for each byte by
encoding each char separately. This version shows a >30% speedup on the 10MB
array conversion (and a ~15% speedup on the 1K array conversion) compared to
the previous patch. It also generates less garbage.
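
For illustration only, here is a minimal sketch of the per-char encoding idea.
It is not the contents of 2850-v2.patch; the class name HexEncodeSketch, the
HEX_DIGITS table and the lowercase output are assumptions made for the example.
The output array is sized up front and each byte is written as two chars, so
no intermediate String is created per byte.

```java
final class HexEncodeSketch {
    // Hypothetical lookup table: index 0..15 maps to its hex digit.
    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    // Encodes each byte as two chars written directly into a pre-sized array.
    static String bytesToHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i] & 0xff;               // treat the byte as unsigned
            out[2 * i] = HEX_DIGITS[b >>> 4];      // high nibble
            out[2 * i + 1] = HEX_DIGITS[b & 0x0f]; // low nibble
        }
        return new String(out);                    // single allocation for the result
    }
}
```

The only allocations are the char array and the final String, regardless of
input length.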

I've also broadened the scope of this ticket because hexToBytes also needs some
love (actually even more so), and the v2 patch ships with an improved version of
hexToBytes. As it turns out, hexToBytes was really naive: it called
substring() on every 2 characters, generating a lot of String objects. On a
micro-benchmark converting strings of 1000 characters, the attached version
shows a ~13x (!) speedup. It also generates much less garbage.
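
As a rough sketch of decoding without substring() (again, not the code from
the attached patch; HexDecodeSketch and the use of Character.digit are
assumptions for the example), each pair of chars can be read with charAt()
and combined into one byte:

```java
final class HexDecodeSketch {
    // Decodes a hex string two chars at a time, with no per-pair String objects.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0)
            throw new NumberFormatException("odd number of hex digits");
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Character.digit returns -1 when the char is not a valid base-16 digit.
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0)
                throw new NumberFormatException("non-hex character near index " + (2 * i));
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

A precomputed char-to-value table would be another option instead of
Character.digit; which one is faster would need to be measured.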

To add to what David said, let's note that those methods used to not matter
too much (they were used in non-performance-sensitive places, like debug/error
messages or SSTable2json, though better performance in those tools doesn't
hurt), but are now used by CQL for BytesType.

> Converting bytes to hex string is unnecessarily slow
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2850
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2850
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.6, 0.8.1
>            Reporter: David Allsopp
>            Priority: Minor
>             Fix For: 0.8.2
>
>         Attachments: 2850-v2.patch, BytesToHexBenchmark.java, BytesToHexBenchmark2.java, cassandra-2850a.diff
>
>
> ByteBufferUtil.bytesToHex() is unnecessarily slow - it doesn't pre-size the StringBuilder
> (so several re-sizes will be needed behind the scenes) and it makes quite a few method calls
> per byte.
> (OK, this may be a premature optimisation, but I couldn't resist, and it's a small change)
> Will attach patch shortly that speeds it up by about x3, plus benchmarking test.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
