cassandra-user mailing list archives

From Tom van den Berge <...@drillster.com>
Subject Re: Unexplainably large reported partition sizes
Date Sun, 06 Mar 2016 12:33:30 GMT
No, data is hardly ever deleted from this table. The cfstats confirm this.
The data is also not reinserted.
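
For reference, the arithmetic behind the numbers quoted further down in this thread can be sketched as follows. This is only a rough check, not a measurement: the row count and the 62 bytes per row are the approximate estimates given below, and the constant and function names are made up for illustration.

```python
# Figures copied from the quoted messages below. ROWS and BYTES_PER_ROW
# are rough estimates from the thread, not measured values.
ROWS = 50_000
BYTES_PER_ROW = 62
LOGGED_PARTITION_BYTES = 1_470_058_292       # "Compacting large partition" log line
CFSTATS_MAX_PARTITION_BYTES = 2_395_318_855  # "Compacted partition maximum bytes"
SPACE_USED_LIVE_BYTES = 253_928_111          # "Space used (live)"


def expected_partition_bytes(rows: int, bytes_per_row: int) -> int:
    """Naive uncompressed size estimate: row count times average row size."""
    return rows * bytes_per_row


expected = expected_partition_bytes(ROWS, BYTES_PER_ROW)

# How many times larger the reported sizes are than the estimate.
log_factor = LOGGED_PARTITION_BYTES / expected
cfstats_factor = CFSTATS_MAX_PARTITION_BYTES / expected

# Fraction of the reported size that the estimate represents. Compression
# alone would have to reach this ratio to explain the gap.
required_ratio = expected / CFSTATS_MAX_PARTITION_BYTES

# The reported maximum partition size compared with the live space used
# by the whole table on this node.
partition_vs_table = CFSTATS_MAX_PARTITION_BYTES / SPACE_USED_LIVE_BYTES

print(f"estimated partition size: {expected:,} bytes")
print(f"log line is {log_factor:,.0f}x the estimate")
print(f"cfstats maximum is {cfstats_factor:,.0f}x the estimate")
print(f"required compression ratio: {required_ratio:.3%}")
print(f"cfstats maximum is {partition_vs_table:.1f}x the table's live space")
```

With these inputs the estimate is about 3.1 MB, so the required ratio comes out slightly above the 0.125% mentioned below, which was based on a rounded 3 MB.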
On 5 Mar 2016 6:20 PM, "DuyHai Doan" <doanduyhai@gmail.com> wrote:

> Maybe tombstones? Do you issue a lot of DELETE statements? Or do you
> re-insert in the same partition with different TTL values?
>
> On Sat, Mar 5, 2016 at 7:16 PM, Tom van den Berge <tom@drillster.com>
> wrote:
>
>> I don't think compression can be the cause of the difference, for two
>> reasons:
>>
>> 1) The partition size I calculated myself (3 MB) is the uncompressed
>> size, and so is the reported size (2.3 GB)
>>
>> 2) The difference is simply far too big to be explained by compression,
>> even if the calculated size had been the compressed size. The compressed
>> size would then be 0.125% of the original, which is not realistic. In the
>> logs, I can see that the typical compression achieved for this table is
>> around 80% of the original.
>>
>> Tom
>>
>> On Fri, Mar 4, 2016 at 9:48 PM, Robert Coli <rcoli@eventbrite.com> wrote:
>>
>>> On Fri, Mar 4, 2016 at 5:56 AM, Tom van den Berge <tom@drillster.com>
>>> wrote:
>>>
>>>>  Compacting large partition
>>>> drillster/subscriberstats:rqtPewK-1chi0JSO595u-Q (1,470,058,292 bytes)
>>>>
>>>> This means that this single partition is about 1.4 GB. This is
>>>> much larger than it can possibly be, for two reasons:
>>>>   1) the partition has approx. 50K rows, each roughly 62 bytes = ~3 MB
>>>>   2) the entire table consumes approx. 500 MB of disk space on the node
>>>> containing the partition (including snapshots)
>>>>
>>>> Furthermore, nodetool cfstats tells me this:
>>>> Space used (live): 253,928,111
>>>> Space used (total): 253,928,111
>>>> Compacted partition maximum bytes: 2,395,318,855
>>>> The space used seems to match the actual size (excl. snapshots), but the
>>>> Compacted partition maximum bytes (2.3 GB) seems to be far higher than
>>>> possible. Does anyone know how it is possible that Cassandra reports such
>>>> unlikely sizes?
>>>>
>>>
>>> Compression is enabled by default, and compaction reports the
>>> uncompressed size.
>>>
>>> =Rob
>>>
>>>
>>
>>
>>
>> --
>> Tom van den Berge
>> Lead Software Engineer
>>
>> Middenburcht 136
>> 3452 MT Vleuten
>> Netherlands +31 30 755 53 30
>> www.drillster.com
>>
>
>
