cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8518) Impose In-Flight Data Limit
Date Sun, 01 Feb 2015 23:36:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300786#comment-14300786 ]

Benedict commented on CASSANDRA-8518:
-------------------------------------

Hi Cheng,

The problem with the approach you suggest is that it could end up killing all queries on the
box if something else is eating up heap (say, compaction is a pain point). You're right that
it would be simpler, but it should be quite feasible to track an approximate count of the heap
used by a query. In fact, we already have many of the necessary facilities to do so, since we
use them to impose limits on how much data can be stored in memtables. The difficulty will be
plugging this into each place we generate data for a query; how challenging that will be
remains to be seen. Ideally we want to move to each class of operation on the system having
an isolated allotment of memory it's permitted to eat up, so that they don't stomp on one
another.
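
As a rough illustration of the isolated-allotment idea, here is a minimal Java sketch in which
each class of operation gets its own byte budget and reserves an approximate size before
generating data. The class name OperationBudget and its methods are hypothetical and are not
part of Cassandra; the byte counts are assumed to be approximate estimates supplied by the caller.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-operation-class memory allotment; NOT an existing Cassandra API. */
final class OperationBudget {
    private final long limitBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    OperationBudget(long limitBytes) {
        if (limitBytes <= 0) {
            throw new IllegalArgumentException("limitBytes must be positive");
        }
        this.limitBytes = limitBytes;
    }

    /** Tries to reserve an approximate number of bytes; returns false if the budget would be exceeded. */
    boolean tryAcquire(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        while (true) {
            long current = usedBytes.get();
            long next = current + bytes;
            if (next > limitBytes) {
                return false; // caller decides whether to reject, queue, or shed the request
            }
            if (usedBytes.compareAndSet(current, next)) {
                return true;
            }
        }
    }

    /** Returns previously reserved bytes to the budget once the data is no longer in flight. */
    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long used() {
        return usedBytes.get();
    }
}
```

With one budget per class of operation (for example, one for reads and one for compaction),
exhausting one budget does not cause requests charged to another budget to be rejected.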

> Impose In-Flight Data Limit
> ---------------------------
>
>                 Key: CASSANDRA-8518
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8518
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Cheng Ren
>              Labels: performance
>
> We have been suffering from Cassandra node crashes due to out-of-memory errors for a long time.
The heap dump from the most recent crash shows 22 native transport request threads, each
consuming 3.3% of the heap, for more than 70% in total.
> Heap dump:
> !https://dl-web.dropbox.com/get/attach1.png?_subject_uid=303980955&w=AAAVOoncBoZ5aOPbDg2TpRkUss7B-2wlrnhUAv19b27OUA|height=400,width=600!
> Expanded view of one thread:
> !https://dl-web.dropbox.com/get/Screen%20Shot%202014-12-18%20at%204.06.29%20PM.png?_subject_uid=303980955&w=AACUO4wrbxheRUxv8fwQ9P52T6gBOm5_g9zeIe8odu3V3w|height=400,width=600!
> The Cassandra version we are using now (2.0.4) uses MemoryAwareThreadPoolExecutor as the
request executor and provides a default request size estimator that always returns 1,
meaning it limits only the number of requests being pushed to the pool. To gain finer-grained
control over request handling and better protect our nodes from OOM issues, we propose implementing
a more precise estimator.
> Here are our two cents:
> For update/delete/insert requests: size could be estimated by adding the sizes of all class
members together.
> For scan queries, the major part of the request is the response, which can be estimated from
historical data. For example, if we receive a scan query on a column family for a certain
token range, we record its response size and use it as the estimated response size for later
scan queries on the same cf.
> For future requests on the same cf, the response size could be calculated as (token range *
recorded size / recorded token range). The request size should then be estimated as (query size +
estimated response size).
> We believe what we're proposing here can be useful for other people in the Cassandra
community as well. Would you mind providing us with feedback? Please let us know if you have any
concerns or suggestions regarding this proposal.
> Thanks,
> Cheng
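
The scan-size estimate described in the quoted proposal could be sketched in Java roughly as
follows. This is only an illustration under stated assumptions: the class ScanSizeEstimator,
its method names, the use of a double for the token range width, and the fallback value used
when no history exists for a column family are all hypothetical and do not appear in the proposal.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical history-based estimator for scan requests; NOT an existing Cassandra API. */
final class ScanSizeEstimator {
    /** Last observed scan for a column family: token range width and response size. */
    private static final class Sample {
        final double tokenRange;
        final long responseBytes;

        Sample(double tokenRange, long responseBytes) {
            this.tokenRange = tokenRange;
            this.responseBytes = responseBytes;
        }
    }

    private final Map<String, Sample> lastByCf = new ConcurrentHashMap<>();

    /** Records the response size observed for a scan over the given token range width. */
    void record(String cf, double tokenRange, long responseBytes) {
        if (tokenRange <= 0) {
            return; // assumption: ignore degenerate ranges rather than divide by zero later
        }
        lastByCf.put(cf, new Sample(tokenRange, responseBytes));
    }

    /**
     * Estimates request size as query size + estimated response size, where the response is
     * scaled as tokenRange * recordedSize / recordedTokenRange. The fallback is an assumption
     * for column families with no recorded history.
     */
    long estimate(String cf, double tokenRange, long querySizeBytes, long fallbackResponseBytes) {
        Sample s = lastByCf.get(cf);
        long responseBytes = (s == null)
                ? fallbackResponseBytes
                : Math.round(tokenRange * s.responseBytes / s.tokenRange);
        return querySizeBytes + responseBytes;
    }
}
```

The linear scaling assumes data is spread roughly evenly across the token range; a skewed
column family would make the estimate less reliable.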



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
