cassandra-commits mailing list archives

From "Cyril Scetbon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7249) Too many threads associated with parallel compaction
Date Fri, 16 May 2014 17:30:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000021#comment-14000021 ]

Cyril Scetbon commented on CASSANDRA-7249:
------------------------------------------

This is already scheduled for next week. Do you mean that multithreaded compaction is buggy
and not recommended with 1.2.x?

> Too many threads associated with parallel compaction
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7249
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7249
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Ubuntu 12.04.3 LTS
> 24 CPUs (hyper threading enabled)
>            Reporter: Cyril Scetbon
>              Labels: compaction, parallel, threads
>
> We have a lot of threads on some nodes, as you can see:
> node001: 560
> node002: 529
> node003: 4350
> node004: 552
> node005: 547
> node006: 554
> node007: 572
> node008: 1444 <==
> node009: 540
> node010: 13691 <==
> node011: 577
> node012: 536
> node013: 448
> node014: 10295 <==
> node015: 452
> node016: 576
> When I check what those threads are, I see a lot of "Deserializer sstables".
> Enabling DEBUG mode shows that many of the actions relate to parallel compaction. What is
> really surprising is that it deserializes each sstable a huge number of times, even
> though we only have 8 files for the column family concerned:
>  512690 /data/ks1/cf1/ks1-cf1-ic-616-Data.db
>  296623 /data/ks1/cf1/ks1-cf1-ic-637-Data.db
>  311904 /data/ks1/cf1/ks1-cf1-ic-642-Data.db
>  127061 /data/ks1/cf1/ks1-cf1-ic-643-Data.db
>  126921 /data/ks1/cf1/ks1-cf1-ic-644-Data.db
>  129815 /data/ks1/cf1/ks1-cf1-ic-645-Data.db
>  127862 /data/ks1/cf1/ks1-cf1-ic-646-Data.db
>  317069 /data/ks1/cf1/ks1-cf1-ic-647-Data.db
> So, in one minute, Cassandra executes the following code about 2 million times:
> {code}
> else
> {
>   logger.debug("parallel eager deserialize from " + iter.getPath());
>   queue.put(new RowContainer(new Row(iter.getKey(),
>     iter.getColumnFamilyWithColumns(ArrayBackedSortedColumns.factory()))));
> }
> {code}
> It seems to be related to [CASSANDRA-5720|https://issues.apache.org/jira/browse/CASSANDRA-5720],
> because we got the same error on the column families concerned before the number of threads
> rose. Upgrading to 2.0 is not an option for now :(



--
This message was sent by Atlassian JIRA
(v6.2#6252)
