incubator-cassandra-user mailing list archives

From Denis Gabaydulin <gaba...@gmail.com>
Subject Re: [problem with OOM in nodes]
Date Thu, 20 Sep 2012 11:45:32 GMT
P.S. The Cassandra version is 1.1.4.

On Thu, Sep 20, 2012 at 3:27 PM, Denis Gabaydulin <gabaden@gmail.com> wrote:
> Hi, all!
>
> We have a cluster of 7 virtual nodes (disk storage is attached to the
> nodes over iSCSI). The storage schema is:
>
> Reports:{
>     1:{
>         1:{"value1":"some val", "value2":"some val"},
>         2:{"value1":"some val", "value2":"some val"}
>         ...
>     },
>     2:{
>         1:{"value1":"some val", "value2":"some val"},
>         2:{"value1":"some val", "value2":"some val"}
>         ...
>     }
>     ...
> }
>
> create keyspace osmp_reports
>   with placement_strategy = 'SimpleStrategy'
>   and strategy_options = {replication_factor : 4}
>   and durable_writes = true;
>
> use osmp_reports;
>
> create column family QueryReportResult
>   with column_type = 'Super'
>   and comparator = 'BytesType'
>   and subcomparator = 'BytesType'
>   and default_validation_class = 'BytesType'
>   and key_validation_class = 'BytesType'
>   and read_repair_chance = 1.0
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 432000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and compaction_strategy =
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY';
>
> =============================================
>
> Read/Write CL: 2
>
> Most of the reports are small, but some of them can have half a
> million rows (xml). Typical operations on this dataset are:
>
> count report rows by report_id (top level id of super column);
> get columns (report_rows) by range predicate and limit for given report_id.
>
> Data is written once and never updated.
>
> So, from time to time a couple of nodes crash with an OOM exception. The
> heap dump shows that we have a lot of super columns in memory.
> For example, I see that one of the reports is held in memory entirely. How
> is that possible? Even if we don't load the whole report, could Cassandra
> do this for some internal reason?
>
> What should we do to avoid OOMs?
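
For illustration, the two operations described above (counting report rows, and getting columns by range predicate and limit) can be sketched as a paged read that holds only one slice in memory at a time. This is a minimal sketch in plain Python against an in-memory stand-in: `get_slice`, `iter_pages`, `count_columns` and `PAGE_SIZE` are hypothetical names, not a Cassandra client API. It shows only the client-side access pattern and says nothing about what Cassandra 1.1.4 does internally on the server.

```python
from bisect import bisect_left

PAGE_SIZE = 3  # hypothetical; kept small here so the paging is visible


def get_slice(names, start, limit):
    """Stand-in for a range-predicate read with a limit.

    Returns up to `limit` names >= `start` from the sorted list `names`;
    start=None means "from the beginning".
    """
    i = 0 if start is None else bisect_left(names, start)
    return names[i:i + limit]


def iter_pages(names, page_size=PAGE_SIZE):
    """Yield successive pages of column names, one page at a time."""
    page = get_slice(names, None, page_size)
    while page:
        yield page
        last = page[-1]
        # The range start is inclusive, so request one extra column
        # and drop the one that was already returned.
        page = get_slice(names, last, page_size + 1)[1:]


def count_columns(names, page_size=PAGE_SIZE):
    """Count columns by summing page lengths instead of loading them all."""
    return sum(len(page) for page in iter_pages(names, page_size))


# Hypothetical report with 10 rows, identified by sorted, unique names.
report_rows = list(range(1, 11))
pages = list(iter_pages(report_rows))
total = count_columns(report_rows)
```

The sketch assumes column names are unique and returned in sorted order, which is what makes restarting each page from the last seen name safe.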
