incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Why Cassandra is "space inefficient" compared to MySQL?
Date Tue, 25 May 2010 18:27:15 GMT
The only place we use the Java serializer is for the BitSet in bloom filters.
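
To put a number on what default Java serialization costs for a BitSet, the sketch below compares the serialized size with the size of the raw 64-bit words. It is an illustration only and not Cassandra's bloom filter code: the class name `BitSetSerializationSize`, its helper methods, and the bit pattern are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.BitSet;

// Hypothetical helper, not part of Cassandra.
public class BitSetSerializationSize {

    /** Bytes produced by default Java serialization of the given BitSet. */
    public static int javaSerializedSize(BitSet bits) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(bits);
        } catch (IOException e) {
            // Not expected for an in-memory buffer; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    /** Bytes needed for the raw words alone (8 bytes per long in use). */
    public static int rawSize(BitSet bits) {
        return bits.toLongArray().length * Long.BYTES;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        // Arbitrary example pattern: set every 7th bit below 10,000.
        for (int i = 0; i < 10_000; i += 7) {
            bits.set(i);
        }
        System.out.println("java serialization: " + javaSerializedSize(bits) + " bytes");
        System.out.println("raw words:          " + rawSize(bits) + " bytes");
    }
}
```

The difference between the two figures is the stream header and class description that Java serialization adds once per serialized object. It does not grow with the number of bits.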

On Tue, May 25, 2010 at 12:37 PM, Chris Goffinet <goffinet@digg.com> wrote:
> My money is on the fact that the serializer is just horribly verbose. It's
> using a basic set of the java serializer.
> -Chris
>
>
> On Tue, May 25, 2010 at 10:02 AM, Ryan King <ryan@twitter.com> wrote:
>>
>> Also, timestamps for each column.
>>
>> -ryan
>>
>> On Tue, May 25, 2010 at 5:41 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> > That's true.  But fundamentally Cassandra is expected to use more
>> > space than mysql for a few reasons; usually the biggest factor is that
>> > Cassandra has to write out each column name in each row, since column
>> > names are dynamic unlike in mysql where you declare the columns once
>> > for the whole table.
>> >
>> > 2010/5/25 Peter Schüller <scode@spotify.com>:
>> >>> Could you please tell me why?
>> >>
>> >> There might be pending sstable removals on disk, which won't happen
>> >> until GC or restart. If you just did a bulk insert and checked
>> >> diskspace immediately afterwards, I think this is a possible
>> >> explanation.
>> >>
>> >> (See "Write path" on
>> >> http://wiki.apache.org/cassandra/ArchitectureInternals)
>> >>
>> >> --
>> >> / Peter Schuller aka scode
>> >>
>> >
>> >
>> >
>> > --
>> > Jonathan Ellis
>> > Project Chair, Apache Cassandra
>> > co-founder of Riptano, the source for professional Cassandra support
>> > http://riptano.com
>> >
>
>
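
The overhead described in the quoted thread (column names written out in every row, plus a timestamp for each column) can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions, not Cassandra's or MySQL's actual on-disk format: it counts only UTF-8 bytes of names and values, assumes one 8-byte timestamp per column, and ignores length prefixes, indexes, and other metadata. The class `RowSizeModel` and the sample row are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified size model; not a real storage format.
public class RowSizeModel {

    // Assumption: one 64-bit timestamp stored with every column.
    public static final int TIMESTAMP_BYTES = 8;

    /** Bytes if only values are stored (column names declared once per table). */
    public static long valuesOnly(Map<String, String> row) {
        long total = 0;
        for (String value : row.values()) {
            total += utf8Length(value);
        }
        return total;
    }

    /** Bytes if every column also carries its own name and a timestamp. */
    public static long withNamesAndTimestamps(Map<String, String> row) {
        long total = 0;
        for (Map.Entry<String, String> column : row.entrySet()) {
            total += utf8Length(column.getKey())
                   + utf8Length(column.getValue())
                   + TIMESTAMP_BYTES;
        }
        return total;
    }

    private static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("user_id", "42");
        row.put("email", "a@example.com");
        row.put("created_at", "2010-05-25");

        System.out.println("values only:           " + valuesOnly(row) + " bytes");
        System.out.println("names and timestamps:  " + withNamesAndTimestamps(row) + " bytes");
    }
}
```

In this model the gap between the two totals grows with the number of columns and the length of their names, which is why short column names and few small columns per row change the ratio so much.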



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
