incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: [problem with OOM in nodes]
Date Sun, 23 Sep 2012 18:41:58 GMT
> /var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E
> "[0-9]+ bytes" -o | cut -d " " -f 1 |  awk '{ foo = $1 / 1024 / 1024 ;
> print foo "MB" }'  | sort -nr | head -n 50

> Is this a bad sign?
Sorry, I do not know what this is outputting. 

>> As I can see in cfstats, compacted row maximum size: 386857368 !
Yes. 
Having rows in the 100s of MB will cause problems. Doubly so if they are large super columns.
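
For anyone following along: the quoted pipeline pulls the "<digits> bytes" figure out of every "Compacting large" line in system.log, divides it by 1024 twice, and prints the 50 largest values. A minimal Java sketch of the same calculation is below. The class name LargeRowSizes and the sample log lines are made up for illustration; the exact wording of the log message is an assumption, and only the "Compacting large" and "N bytes" parts come from the thread. By the same arithmetic, the cfstats figure of 386857368 bytes is roughly 369 MB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LargeRowSizes {
    // Same pattern as the grep -E step: a run of digits followed by " bytes".
    private static final Pattern BYTES = Pattern.compile("([0-9]+) bytes");

    // Bytes to MB, dividing by 1024 twice as the awk step does.
    public static double toMb(long bytes) {
        return bytes / 1024.0 / 1024.0;
    }

    // Sizes in MB found on "Compacting large" lines, largest first,
    // truncated to at most `limit` entries (the "sort -nr | head" steps).
    public static List<Double> largestRowsMb(List<String> logLines, int limit) {
        List<Double> sizes = new ArrayList<>();
        for (String line : logLines) {
            if (!line.contains("Compacting large")) {
                continue;
            }
            Matcher m = BYTES.matcher(line);
            while (m.find()) { // grep -o emits every match on the line
                sizes.add(toMb(Long.parseLong(m.group(1))));
            }
        }
        sizes.sort(Collections.reverseOrder());
        return new ArrayList<>(sizes.subList(0, Math.min(limit, sizes.size())));
    }

    public static void main(String[] args) {
        // Hypothetical log lines; the real message text may differ.
        List<String> sample = Arrays.asList(
            "INFO Compacting large row ks/cf:01 (2097152 bytes) incrementally",
            "INFO Unrelated message (999 bytes)",
            "INFO Compacting large row ks/cf:02 (1048576 bytes) incrementally");
        for (double mb : largestRowsMb(sample, 50)) {
            System.out.printf("%.2fMB%n", mb);
        }
        // The "compacted row maximum size" reported by cfstats, in MB.
        System.out.printf("%.2fMB%n", toMb(386857368L));
    }
}
```

Read this way, each number in the list is the size of one row that compaction flagged as large, so the top entries are single rows of several GB.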


Cheers



-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/09/2012, at 5:07 AM, Denis Gabaydulin <gabaden@gmail.com> wrote:

> And some stuff from log:
> 
> 
> /var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E
> "[0-9]+ bytes" -o | cut -d " " -f 1 |  awk '{ foo = $1 / 1024 / 1024 ;
> print foo "MB" }'  | sort -nr | head -n 50
> 3821.55MB
> 3337.85MB
> 1221.64MB
> 1128.67MB
> 930.666MB
> 916.4MB
> 861.114MB
> 843.325MB
> 711.813MB
> 706.992MB
> 674.282MB
> 673.861MB
> 658.305MB
> 557.756MB
> 531.577MB
> 493.112MB
> 492.513MB
> 492.291MB
> 484.484MB
> 479.908MB
> 465.742MB
> 464.015MB
> 459.95MB
> 454.472MB
> 441.248MB
> 428.763MB
> 424.028MB
> 416.663MB
> 416.191MB
> 409.341MB
> 406.895MB
> 397.314MB
> 388.27MB
> 376.714MB
> 371.298MB
> 368.819MB
> 366.92MB
> 361.371MB
> 360.509MB
> 356.168MB
> 355.012MB
> 354.897MB
> 354.759MB
> 347.986MB
> 344.109MB
> 335.546MB
> 329.529MB
> 326.857MB
> 326.252MB
> 326.237MB
> 
> Is this a bad sign?
> 
> On Fri, Sep 21, 2012 at 8:22 PM, Denis Gabaydulin <gabaden@gmail.com> wrote:
>> Found one more interesting fact.
>> As I can see in cfstats, compacted row maximum size: 386857368 !
>> 
>> On Fri, Sep 21, 2012 at 12:50 PM, Denis Gabaydulin <gabaden@gmail.com> wrote:
>>> Reports is a SuperColumnFamily.
>>> 
>>> Each report has a unique identifier (report_id), which is the row key of
>>> the SuperColumnFamily, so each report is saved in a separate row.
>>> 
>>> A report consists of report rows (anywhere between 1 and 500000,
>>> but most are small).
>>> 
>>> Each report row is saved in a separate super column. Hector-based code:
>>> 
>>> superCfMutator.addInsertion(
>>>  report_id,
>>>  "Reports",
>>>  HFactory.createSuperColumn(
>>>    report_row_id,
>>>    mapper.convertObject(object),
>>>    columnDefinition.getTopSerializer(),
>>>    columnDefinition.getSubSerializer(),
>>>    inferringSerializer
>>>  )
>>> );
>>> 
>>> We have two frequent operations:
>>> 
>>> 1. count report rows by report_id (calculate number of super columns
>>> in the row).
>>> 2. get report rows by report_id and range predicate (get super columns
>>> from the row with range predicate).
>>> 
>>> I can't see any big super columns here :-(
>>> 
>>> On Fri, Sep 21, 2012 at 3:10 AM, Tyler Hobbs <tyler@datastax.com> wrote:
>>>> I'm not 100% sure that I understand your data model and read patterns correctly,
>>>> but it sounds like you have large supercolumns and are requesting some of
>>>> the subcolumns from individual super columns.  If that's the case, the issue
>>>> is that Cassandra must deserialize the entire supercolumn in memory whenever
>>>> you read *any* of the subcolumns.  This is one of the reasons why composite
>>>> columns are recommended over supercolumns.
>>>> 
>>>> 
>>>> On Thu, Sep 20, 2012 at 6:45 AM, Denis Gabaydulin <gabaden@gmail.com> wrote:
>>>>> 
>>>>> p.s. Cassandra 1.1.4
>>>>> 
>>>>> On Thu, Sep 20, 2012 at 3:27 PM, Denis Gabaydulin <gabaden@gmail.com>
>>>>> wrote:
>>>>>> Hi, all!
>>>>>> 
>>>>>> We have a cluster of 7 virtual nodes (disk storage is attached to
>>>>>> the nodes over iSCSI). The storage schema is:
>>>>>> 
>>>>>> Reports:{
>>>>>>    1:{
>>>>>>        1:{"value1":"some val", "value2":"some val"},
>>>>>>        2:{"value1":"some val", "value2":"some val"}
>>>>>>        ...
>>>>>>    },
>>>>>>    2:{
>>>>>>        1:{"value1":"some val", "value2":"some val"},
>>>>>>        2:{"value1":"some val", "value2":"some val"}
>>>>>>        ...
>>>>>>    }
>>>>>>    ...
>>>>>> }
>>>>>> 
>>>>>> create keyspace osmp_reports
>>>>>>  with placement_strategy = 'SimpleStrategy'
>>>>>>  and strategy_options = {replication_factor : 4}
>>>>>>  and durable_writes = true;
>>>>>> 
>>>>>> use osmp_reports;
>>>>>> 
>>>>>> create column family QueryReportResult
>>>>>>  with column_type = 'Super'
>>>>>>  and comparator = 'BytesType'
>>>>>>  and subcomparator = 'BytesType'
>>>>>>  and default_validation_class = 'BytesType'
>>>>>>  and key_validation_class = 'BytesType'
>>>>>>  and read_repair_chance = 1.0
>>>>>>  and dclocal_read_repair_chance = 0.0
>>>>>>  and gc_grace = 432000
>>>>>>  and min_compaction_threshold = 4
>>>>>>  and max_compaction_threshold = 32
>>>>>>  and replicate_on_write = true
>>>>>>  and compaction_strategy =
>>>>>> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>>>>>>  and caching = 'KEYS_ONLY';
>>>>>> 
>>>>>> =============================================
>>>>>> 
>>>>>> Read/Write CL: 2
>>>>>> 
>>>>>> Most of the reports are small, but some of them can have half a
>>>>>> million rows (XML). Typical operations on this dataset are:
>>>>>> 
>>>>>> count report rows by report_id (top level id of super column);
>>>>>> get columns (report_rows) by range predicate and limit for given
>>>>>> report_id.
>>>>>> 
>>>>>> Data is written once and is never updated.
>>>>>> 
>>>>>> From time to time a couple of nodes crash with an OOM exception. The
>>>>>> heap dump shows that we have a lot of super columns in memory.
>>>>>> For example, I see that one of the reports is in memory entirely. How
>>>>>> is that possible? We don't load the whole report ourselves; could
>>>>>> Cassandra be doing this for some internal reason?
>>>>>> 
>>>>>> What should we do to avoid OOMs?
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Tyler Hobbs
>>>> DataStax
>>>> 
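
Tyler's suggestion earlier in the thread (composite columns instead of super columns) can be pictured with a plain sorted map. The sketch below is only an in-memory model, not Hector or Cassandra code: the class CompositeNameSketch, the Name type, and the helper methods are all hypothetical. It assumes each column name is the pair (report_row_id, field), kept in sorted order, so the two operations from the thread (count report rows, fetch a range of report rows) become a scan and a range slice over individual columns rather than over whole super columns.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class CompositeNameSketch {
    // Hypothetical composite column name: (report_row_id, field).
    static final class Name implements Comparable<Name> {
        final long rowId;
        final String field;

        Name(long rowId, String field) {
            this.rowId = rowId;
            this.field = field;
        }

        @Override
        public int compareTo(Name other) {
            int c = Long.compare(rowId, other.rowId);
            return c != 0 ? c : field.compareTo(other.field);
        }
    }

    // One report (one storage row) modelled as sorted composite names -> values.
    public static NavigableMap<Name, String> newReport() {
        return new TreeMap<>();
    }

    public static void put(NavigableMap<Name, String> report,
                           long rowId, String field, String value) {
        report.put(new Name(rowId, field), value);
    }

    // Range slice: every column whose report_row_id lies in [fromRow, toRow].
    // The empty field name sorts before any real field, so (toRow + 1, "")
    // is an exclusive upper bound. Assumes toRow < Long.MAX_VALUE.
    public static Map<Name, String> slice(NavigableMap<Name, String> report,
                                          long fromRow, long toRow) {
        return report.subMap(new Name(fromRow, ""), true,
                             new Name(toRow + 1, ""), false);
    }

    // Count distinct report rows by walking the sorted names once.
    public static int countRows(NavigableMap<Name, String> report) {
        int count = 0;
        Long previous = null;
        for (Name name : report.keySet()) {
            if (previous == null || previous != name.rowId) {
                count++;
                previous = name.rowId;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        NavigableMap<Name, String> report = newReport();
        for (long row = 1; row <= 3; row++) {
            put(report, row, "value1", "some val");
            put(report, row, "value2", "some val");
        }
        System.out.println("report rows: " + countRows(report));
        System.out.println("columns in rows 2..3: " + slice(report, 2, 3).size());
    }
}
```

This only addresses the point about deserializing whole super columns. A report with half a million rows would still be one very wide row under this layout, so the large-row concern raised at the top of the thread would remain.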

