cassandra-user mailing list archives

From Jonathan Ellis <>
Subject Re: storage space and compaction speed
Date Sat, 19 Nov 2011 20:35:39 GMT
I'm guessing something else is responsible for the compaction
difference you're seeing -- Bytes, UTF8, and Ascii types all use the
same lexical byte comparison code.  The only place you should expect
to lose a small amount of performance by using the latter two is on
insert when it sanity-checks the input.
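
To illustrate the point above, here is a minimal Python sketch (not Cassandra's actual code; the function names `encoded_sizes`, `compare_bytes`, `is_valid_ascii`, and `is_valid_utf8` are made up for this example). It shows that a plain us-ascii string encodes to the same bytes under ASCII and UTF-8, so the stored size and the byte-wise sort order should not differ, and that the only type-specific work is a validity check on the input.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of `text` under ASCII and UTF-8."""
    return {
        "ascii": len(text.encode("ascii")),
        "utf-8": len(text.encode("utf-8")),
    }


def compare_bytes(a: bytes, b: bytes) -> int:
    """Lexical comparison on unsigned byte values: returns -1, 0, or 1."""
    return (a > b) - (a < b)


def is_valid_ascii(raw: bytes) -> bool:
    """Stand-in for an ASCII sanity check: every byte must be below 0x80."""
    return all(byte < 0x80 for byte in raw)


def is_valid_utf8(raw: bytes) -> bool:
    """Stand-in for a UTF-8 sanity check: the bytes must decode cleanly."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


sample = "plain us-ascii string"
sizes = encoded_sizes(sample)
same_bytes = sample.encode("ascii") == sample.encode("utf-8")
```

For pure us-ascii input, `sizes["ascii"]` and `sizes["utf-8"]` are equal and `same_bytes` is true; the validity checks are the only extra cost, and in this sketch they run once per value rather than on every comparison.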

On Sat, Nov 19, 2011 at 12:43 PM, Thorsten von Eicken
<> wrote:
> I recently changed the default_validation_class on a bunch of CFs from
> BytesType to UTF8Type and I observed two things: first I saw a number of
> compactions during the migration that showed ~200% to ~400% of original
> in the log entry. Second, it seems that compaction speed has now halved.
> I'm using v1.0.1, level compaction and compression. Before I create
> tests I thought I'd quickly ask: is there any difference in storage
> efficiency between BytesType, UTF8Type, and AsciiType when storing plain
> us-ascii strings? And is there any expected compaction speed difference?
> (It would be nice to have some docs about the expected storage space
> used for the various data types.)
> Thanks much!
> Thorsten

Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
