cassandra-user mailing list archives

From Alexandr Porunov <alexandr.poru...@gmail.com>
Subject Re: Is a blob storage cost of cassandra is the same as bigint storage cost for long variables?
Date Fri, 09 Sep 2016 06:12:35 GMT
Hello Romain,

Thank you very much for the explanation!

I have just run a simple test to compare both situations,
using two equivalent virtual machines.
Machine 1:
CREATE KEYSPACE "test" WITH REPLICATION = { 'class' : 'SimpleStrategy',
'replication_factor' : 1 };

CREATE TABLE test.simple (
  id bigint PRIMARY KEY
);

Machine 2:
CREATE KEYSPACE "test" WITH REPLICATION = { 'class' : 'SimpleStrategy',
'replication_factor' : 1 };

CREATE TABLE test.simple (
  id blob PRIMARY KEY
);

I then inserted 13421772 primary keys, from 1 to 13421772, on both machines.
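
For illustration, the INSERT statements for the two tables could be
generated along these lines. This is only a sketch: the class and method
names are hypothetical, and it assumes the blob key is the 8-byte
big-endian encoding of the same long value, which the test description
does not state explicitly. A CQL blob constant is written as 0x followed
by hex digits.

```java
// Sketch only. Assumption: the blob key is the 8-byte big-endian form of the long id.
class KeyStatements {
    // INSERT for the bigint table (Machine 1).
    static String bigintInsert(long id) {
        return "INSERT INTO test.simple (id) VALUES (" + id + ");";
    }

    // INSERT for the blob table (Machine 2); 16 hex digits = 8 bytes.
    static String blobInsert(long id) {
        return "INSERT INTO test.simple (id) VALUES ("
                + String.format("0x%016x", id) + ");";
    }

    public static void main(String[] args) {
        // Print the statements for the first few keys of the test range.
        for (long id = 1; id <= 3; id++) {
            System.out.println(bigintInsert(id));
            System.out.println(blobInsert(id));
        }
    }
}
```

Under that assumption both tables receive exactly 8 bytes of key data per
row, which matches the nearly identical folder sizes reported below.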

Results:
Machine 1: size of the data folder: 495864 bytes
Machine 2: size of the data folder: 495004 bytes

So there is almost no difference between them (the blob version even
came out about 1 MB smaller).

I am happy about it because I need to store specially encoded primary keys
of 80 bits each. So I can use a blob as the primary key without hesitation.
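
As a rough sketch of how an 80-bit key might be packed into a 10-byte
blob in Java: the split into a 16-bit high part and a 64-bit low part, and
the names Key80, encode and toBlobLiteral, are assumptions made for
illustration only, not the actual encoding.

```java
import java.nio.ByteBuffer;

// Sketch only. Assumption: the 80-bit key is a 16-bit high part followed by a 64-bit low part.
class Key80 {
    // Pack the two parts into a 10-byte big-endian buffer, ready to bind as a blob.
    static ByteBuffer encode(short high16, long low64) {
        ByteBuffer buf = ByteBuffer.allocate(10);
        buf.putShort(high16);
        buf.putLong(low64);
        buf.flip(); // position 0, limit 10
        return buf;
    }

    // Read the parts back without moving the buffer's position.
    static short high16(ByteBuffer key) {
        return key.getShort(key.position());
    }

    static long low64(ByteBuffer key) {
        return key.getLong(key.position() + 2);
    }

    // Render the key as a CQL blob constant (0x followed by hex digits).
    static String toBlobLiteral(ByteBuffer key) {
        StringBuilder sb = new StringBuilder("0x");
        for (int i = key.position(); i < key.limit(); i++) {
            sb.append(String.format("%02x", key.get(i) & 0xff));
        }
        return sb.toString();
    }
}
```

A key built this way occupies 10 bytes, which a bigint column cannot
hold, so a blob is the natural fit here.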

Best regards,
Alexandr

On Fri, Sep 9, 2016 at 1:20 AM, Romain Hardouin <romainh_ml@yahoo.fr> wrote:

> Hi,
>
> Disk-wise it's the same because a bigint is serialized as an 8-byte
> ByteBuffer, and if you want to store a Long as bytes in a blob type it
> will take 8 bytes too, right?
> The difference is the validation. The blob ByteBuffer will be stored as is
> whereas the bigint will be validated. So technically the Long is slower,
> but I guess that's not noticeable.
>
> Yes you can use a blob as a partition key. I would use the bigint both
> for validation and clarity.
>
> Best,
>
> Romain
>
>
> On Wednesday, 7 September 2016 at 22:54, Alexandr Porunov <
> alexandr.porunov@gmail.com> wrote:
>
>
> Hello,
>
> I need to store a "Long" Java variable.
> The question is: is the storage cost the same whether I store the hex
> representation of the "Long" variable in a blob or store the "Long"
> variable in a bigint?
> Are there any performance pros or cons?
> Is it OK to use blob as primary key?
>
> Sincerely,
> Alexandr
>
>
>
