cassandra-user mailing list archives

From Romain Hardouin <>
Subject Re: Is a blob storage cost of cassandra is the same as bigint storage cost for long variables?
Date Fri, 09 Sep 2016 15:27:42 GMT
Note that LZ4 compression is used by default. If you want to disable compression you can do
this:

    CREATE/ALTER TABLE ... WITH compression = { 'sstable_compression' : '' };

    On Friday, 9 September 2016 at 8:12, Alexandr Porunov <> wrote:

 Hello Romain,
Thank you very much for the explanation!
I have just run a simple test to compare both situations, using two equivalent VMs.

Machine 1:
CREATE KEYSPACE "test" WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
CREATE TABLE test.simple ( id bigint PRIMARY KEY );

Machine 2:
CREATE KEYSPACE "test" WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
CREATE TABLE test.simple ( id blob PRIMARY KEY );

I then inserted 13421772 primary keys, from 1 to 13421772, on both machines.

Results:
Machine 1: size of the data folder: 495864 bytes
Machine 2: size of the data folder: 495004 bytes
So there is almost no difference between them (it even happened that the blob storage cost 1 MB
I am happy about it because I need to store specially encoded primary keys of 80 bits each.
So I can use a blob as a primary key without hesitation.
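
For illustration, an 80-bit key fits in a 10-byte buffer that could be bound to a blob column. The sketch below is only an assumption about how such a key might be packed: the class name Key80 and the split into a 64-bit part plus a 16-bit part are hypothetical, since the actual encoding is not described in this thread.

```java
import java.nio.ByteBuffer;

public class Key80 {
    // Hypothetical layout: 8 bytes (high part) followed by 2 bytes (low part),
    // big-endian, 10 bytes = 80 bits in total.
    public static ByteBuffer encode(long high64, short low16) {
        ByteBuffer buf = ByteBuffer.allocate(10);
        buf.putLong(high64);
        buf.putShort(low16);
        buf.flip(); // make the 10 written bytes readable from position 0
        return buf;
    }

    // Absolute reads, so the buffer's position is left untouched.
    public static long high(ByteBuffer key) {
        return key.getLong(0);
    }

    public static short low(ByteBuffer key) {
        return key.getShort(8);
    }

    public static void main(String[] args) {
        ByteBuffer key = encode(13421772L, (short) 7);
        System.out.println("key length in bytes: " + key.remaining());
    }
}
```

The resulting ByteBuffer is the kind of value a Java driver would typically accept for a blob column; how it is bound depends on the driver in use.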
Best regards,
Alexandr
On Fri, Sep 9, 2016 at 1:20 AM, Romain Hardouin <> wrote:

Disk-wise it's the same because a bigint is serialized as an 8-byte ByteBuffer, and if you
want to store a Long as bytes in a blob type it will take 8 bytes too, right? The difference
is the validation. The blob ByteBuffer will be stored as is, whereas the bigint will be validated.
So technically the Long is slower, but I guess that's not noticeable.
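
As a rough illustration of the size argument, a Java long written into a ByteBuffer occupies exactly 8 bytes, the same width as a bigint. This is a minimal sketch using only java.nio; the class name LongAsBlob and its helper methods are made up for the example and are not Cassandra or driver APIs.

```java
import java.nio.ByteBuffer;

public class LongAsBlob {
    // Encode a long as 8 big-endian bytes, suitable for binding to a blob column.
    public static ByteBuffer toBlob(long value) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
        buf.putLong(value);
        buf.flip(); // position back to 0, limit at 8
        return buf;
    }

    // Decode with an absolute read so the buffer's position is not changed.
    public static long fromBlob(ByteBuffer blob) {
        return blob.getLong(blob.position());
    }

    public static void main(String[] args) {
        ByteBuffer blob = toBlob(13421772L);
        System.out.println("bytes: " + blob.remaining());
        System.out.println("round trip: " + fromBlob(blob));
    }
}
```

Note that this stores the raw 8 bytes. Storing a hex string of the Long as text bytes would take more space (up to 16 bytes for 16 hex digits).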
Yes you can use a blob as a partition key. I would use the bigint both for validation and

    On Wednesday, 7 September 2016 at 22:54, Alexandr Porunov <> wrote:


I need to store a Java "Long" variable. The question is whether the storage cost is the same
for storing the hex representation of a "Long" variable in a blob as for storing the "Long"
variable in a bigint. Are there any performance pros or cons? Is it OK to use a blob as a primary key?

