incubator-cassandra-user mailing list archives

From Sylvain Lebresne <>
Subject Re: BytesType vs. UTF8Type
Date Wed, 22 Jun 2011 10:34:01 GMT
On Wed, Jun 22, 2011 at 11:19 AM, Jeesoo Shin <> wrote:
> BytesType vs. UTF8Type. which is better in performance?
> I assume Bytes be faster in compare.. but how much faster is it?

They don't differ at all as far as comparison is concerned. They actually use
the exact same function to do the compare. The only thing UTF8Type does on top
of that is validate that the inputs from the client are actual UTF8 strings.
The validation happens during insertion requests, so UTF8Type will use a few
more CPU cycles there, but so few that it doesn't matter.
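
To make the point concrete, here is a minimal sketch in Python. It is not
Cassandra's actual code; the function names (compare_bytes, validate_bytes,
validate_utf8) are made up for illustration, and it assumes the shared
comparison is a plain unsigned byte-by-byte one. Both "types" share the same
comparison, and only the UTF8 variant does extra work when a value is inserted.

```python
def compare_bytes(a: bytes, b: bytes) -> int:
    """Unsigned lexicographic byte comparison: returns -1, 0, or 1.

    Hypothetical stand-in for the single compare function shared by
    both types; Python compares bytes objects by unsigned byte value.
    """
    return (a > b) - (a < b)


def validate_bytes(value: bytes) -> None:
    """BytesType-style validation (illustrative): anything is accepted."""
    return None


def validate_utf8(value: bytes) -> None:
    """UTF8Type-style validation (illustrative): reject malformed UTF-8."""
    try:
        value.decode("utf-8")  # strict decoding raises on bad input
    except UnicodeDecodeError as exc:
        raise ValueError(f"not valid UTF-8: {value!r}") from exc


# The ordering is the same whichever type is declared, because the
# comparison only ever looks at the raw bytes.
names = ["zebra", "apple", "éclair"]
encoded = [n.encode("utf-8") for n in names]
ordered = sorted(encoded)

# The only difference shows up at insertion time.
validate_bytes(b"\xff\xfe")      # accepted
try:
    validate_utf8(b"\xff\xfe")   # rejected: 0xFF never occurs in UTF-8
    rejected = False
except ValueError:
    rejected = True
```

Note that the order is by encoded byte value, so "éclair" (which starts with
byte 0xC3) sorts after "zebra" under either type.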

If you know that everything will be UTF8, then I suggest UTF8Type, if only to
make sure you never screw up and send arbitrary bytes there.

Lastly, note that because the comparison between them is the same, and since
BytesType is more permissive, switching from UTF8Type to BytesType is not
a problem if you ever wish to.


> For large large large data set, will it have significant different?
> I love to use UTF8 and be able to read value from cli.  :-)
> *IF* it doesn't degrade performance too much.
> ps. planning to use 0.8.x
> TIA.
