cassandra-commits mailing list archives

From "Chris Lohfink (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (CASSANDRA-7242) More compaction visibility into thread pool and per CF
Date Fri, 16 May 2014 15:05:15 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Lohfink updated CASSANDRA-7242:
-------------------------------------

    Comment: was deleted

(was: patch for cassandra-2.0 branch)

> More compaction visibility into thread pool and per CF
> ------------------------------------------------------
>
>                 Key: CASSANDRA-7242
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7242
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chris Lohfink
>            Assignee: Chris Lohfink
>            Priority: Minor
>         Attachments: 7242_jmxify_compactionpool.txt, 7242_per_cf_compactionstats.txt
>
>
> Two parts to this to help diagnose compaction issues/bottlenecks.  These could be two different
issues, but they are pretty closely related.
> First is adding per column family pending compactions.  When there's a lot of backed-up
compactions but multiple ones currently being compacted, it's hard to identify which CF is causing
the backlog.  The provided patch doesn't cover the compactions in the thread pool's queue
like compactionstats does, but I'm not sure how big that ever gets or if it needs to be... which brings
me to the second idea.
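
A minimal sketch of the per-CF pending count described above. The class and method names (PendingCompactionsByCf, taskQueued, taskFinished, snapshot) and the "ks1.events" style table names are hypothetical; they are not taken from the attached patch or from Cassandra's code. The sketch only illustrates keeping one pending counter per column family so a backlog can be attributed to a single CF.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not Cassandra's API): one pending-task counter per column family.
public class PendingCompactionsByCf {
    private final Map<String, AtomicInteger> pending = new ConcurrentHashMap<>();

    // Call when a compaction task for this column family is scheduled.
    public void taskQueued(String cf) {
        pending.computeIfAbsent(cf, k -> new AtomicInteger()).incrementAndGet();
    }

    // Call when a compaction task for this column family completes or fails.
    // Unknown column families are ignored.
    public void taskFinished(String cf) {
        AtomicInteger n = pending.get(cf);
        if (n != null) {
            n.decrementAndGet();
        }
    }

    // Point-in-time copy of the counters, sorted by column family name.
    public Map<String, Integer> snapshot() {
        Map<String, Integer> out = new TreeMap<>();
        pending.forEach((cf, n) -> out.put(cf, n.get()));
        return out;
    }

    public static void main(String[] args) {
        PendingCompactionsByCf tracker = new PendingCompactionsByCf();
        tracker.taskQueued("ks1.events");
        tracker.taskQueued("ks1.events");
        tracker.taskQueued("ks1.users");
        tracker.taskFinished("ks1.users");
        // The CF with the largest count is the likely source of the backlog.
        System.out.println(tracker.snapshot());
    }
}
```

A snapshot like this could be exposed next to the existing global pending count; whether queued-but-not-started tasks are included depends on where taskQueued is called.
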
> Second is to change compactionExecutor to extend the JMXEnabledThreadPoolExecutor.  The big
difference there would be the blocking rejection handler.  With a 2^31 pending queue, the blocking
becoming an issue is a pretty extreme case in itself, one that would most likely OOM the server.
 So the different rejection policy shouldn't cause much of an issue, but if it does we can always
override it to use the default behavior.  This would help identify scenarios where corrupted sstables,
unhandled exceptions, etc. kill the compactions and lead to a large backlog with nothing actively
working.  It would also add visibility into this from tpstats.
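
To illustrate what a blocking rejection handler means in practice, here is a rough sketch built only on java.util.concurrent. It is not Cassandra's JMXEnabledThreadPoolExecutor; the class name BlockingPoolSketch, the helper names, and the tiny queue capacity are assumptions chosen so the blocking is easy to see. When the queue is full, the submitting thread waits for space instead of getting a RejectedExecutionException, and the active/pending/completed counters printed are the same kind of numbers tpstats reports for a pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: a plain ThreadPoolExecutor with a blocking rejection policy.
public class BlockingPoolSketch {
    // Blocks the submitting thread until the queue has room, instead of throwing.
    public static final RejectedExecutionHandler BLOCK_ON_FULL = (task, pool) -> {
        if (pool.isShutdown()) {
            throw new RejectedExecutionException("pool is shut down");
        }
        try {
            pool.getQueue().put(task);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RejectedExecutionException("interrupted while waiting for queue space", e);
        }
    };

    // Fixed-size pool with a bounded queue so that rejection can actually occur.
    public static ThreadPoolExecutor newPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(threads, threads, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity), BLOCK_ON_FULL);
    }

    // Shuts the pool down, waits for queued work, and returns the completed-task count.
    public static long shutdownAndCount(ThreadPoolExecutor pool) {
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool.getCompletedTaskCount();
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool(1, 2);
        for (int i = 0; i < 10; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(20); // stand-in for compaction work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            System.out.printf("active=%d pending=%d completed=%d%n",
                    pool.getActiveCount(), pool.getQueue().size(), pool.getCompletedTaskCount());
        }
        System.out.println("total completed=" + shutdownAndCount(pool));
    }
}
```

With a queue as large as the one described in the issue, the blocking path would almost never be reached; the bounded queue here exists only to make it observable.
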



--
This message was sent by Atlassian JIRA
(v6.2#6252)