cassandra-commits mailing list archives

From "Folke Behrens (JIRA)" <j...@apache.org>
Subject [jira] Created: (CASSANDRA-1232) UTF8Type.compare() is slow and dangerous
Date Sat, 26 Jun 2010 17:54:49 GMT
UTF8Type.compare() is slow and dangerous
----------------------------------------

                 Key: CASSANDRA-1232
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1232
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Folke Behrens


UTF8Type.compare() converts both byte arrays into Strings and then compares them. This is
unnecessary and slow because UTF-8 encoded strings are already directly comparable as bytes:
higher code points yield higher initial and subsequent bytes, so an unsigned byte-wise
comparison gives code point order. One can safely use BytesType.compare() for UTF-8.
Maybe UTF8Type should be a subclass of BytesType that only overrides getString().
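
To illustrate the ordering claim above, here is a minimal sketch of an unsigned byte-wise comparison applied to UTF-8 encoded strings. The class Utf8ByteOrder and its methods are hypothetical names chosen for this example; this is not Cassandra's BytesType or UTF8Type code, only a demonstration that comparing the raw bytes (as unsigned values) orders strings by code point without decoding them.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not Cassandra's actual BytesType/UTF8Type.
final class Utf8ByteOrder {
    // Lexicographic comparison that treats every byte as unsigned (0..255).
    // Java bytes are signed, so each one is masked with 0xFF before comparing.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter array sorts first.
        return a.length - b.length;
    }

    // Convenience encoder used only to build inputs for the comparison.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

For example, compareUnsigned(utf8("z"), utf8("\u00E9")) is negative, because U+007A encodes as 7A and U+00E9 as C3 A9. One caveat when swapping implementations: Java's String.compareTo() orders by UTF-16 code units, so for strings containing supplementary characters (above U+FFFF) the byte-wise order can differ from the String-based order.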

BTW, it's also dangerous to ignore invalid byte sequences. By the time compare() is called,
the byte array should already contain valid UTF-8.
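
As a sketch of what rejecting invalid input could look like, the following uses the standard java.nio.charset decoder configured to report errors instead of silently replacing them. The class Utf8Validation and the method isValidUtf8 are hypothetical names for this example, and where such a check would belong in Cassandra (for instance on the write path) is an assumption, not something this report specifies.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical validator for illustration; not part of Cassandra's API.
final class Utf8Validation {
    // Returns true only if the whole array decodes as well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                // REPORT makes the decoder throw rather than substitute U+FFFD.
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

By contrast, new String(bytes, StandardCharsets.UTF_8) replaces malformed sequences with U+FFFD and never fails, which is one way invalid input can go unnoticed.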


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

