cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sylvain Lebresne (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3532) Compaction cleanupIfNecessary costly when many files in data dir
Date Fri, 02 Dec 2011 18:47:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161782#comment-13161782
] 

Sylvain Lebresne commented on CASSANDRA-3532:
---------------------------------------------

* Wrapping exceptions into RuntimeException() blindly will confuse the catcher of UserInterruptedException
in DTPE.logExceptionsAfterExecute. We could make make that catcher unwrap the full exception,
but truth is I'm not a fan of wrapping exception needlessly. Maybe we could just add a silence
method somewhere:
{noformat}
public void silence(Exception e)
{
    if (e instanceof RuntimeException)
        throw (RuntimeException) e;
    else
        throw new RuntimeException(e);
}
{noformat}
and use that (as we already do in WrappedRunnable actually).
* We could remove the bitmap indexes type while were at it (it'd be one less type to check).

                
> Compaction cleanupIfNecessary costly when many files in data dir
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-3532
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3532
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.0
>         Environment: Solaris 10, 1.0.4 release candidate
>            Reporter: Eric Parusel
>            Assignee: Eric Parusel
>              Labels: compaction
>             Fix For: 1.0.6
>
>         Attachments: 3532-v2.txt, 3532.txt
>
>
> From what I can tell SSTableWriter.cleanupIfNecessary seems increasingly costly as the
number of files in the data dir increases.
> It calls SSTable.componentsFor(descriptor, Descriptor.TempState.TEMP) which lists all
files in the data dir to find matching components.
> Am I roughly correct that   (cleanupCost = SSTable count * data dir size)?
> We had been doing write load testing with default compaction throttling (16MB/s) and
LeveledCompaction.
> Unfortunately we haven't been keeping tabs on sstable counts and it grew out of control.
> On a system with 300,000 sstables (!) here is an example of our compaction rate.  Note
that as you're probably aware cleanupIfNecessary is included in the timing:
>  INFO [CompactionExecutor:48] 2011-11-25 22:25:30,353 CompactionTask.java (line 213)
Compacted to [/data1/cassandra/data/MA_DDR/indexes_03-hc-5369-Data.db,].  5,821,590 to 5,306,354
(~91% of original) bytes for 123 keys at 0.163755MB/s.  Time: 30,903ms.
> Here's a slightly larger one:
>  INFO [CompactionExecutor:43] 2011-11-25 22:23:28,956 CompactionTask.java (line 213)
Compacted to [/data1/cassandra/data/MA_DDR/indexes_03-hc-5336-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5337-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5338-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5339-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5340-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5341-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5342-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5343-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5344-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5345-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5346-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5347-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5348-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5349-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5350-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5351-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5352-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5353-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5354-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5355-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5356-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5357-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5358-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5359-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5360-Data.db,/data1/cassandra/data/MA_DDR/indexes_03-hc-5361-Data.db,].
 140,706,512 to 137,990,868 (~98% of original) bytes for 2,181 keys at 0.338627MB/s.  Time:
388,623ms.
> This is with compaction throttling set to 0 (Off).
> So I believe because of this it's going to take a very long time to recover from having
so many small sstables. 
> It might be notable that we're using Solaris 10, possibly listFiles() is faster on other
platforms?
> Is it feasible to keep track of the temp files and just delete them rather than searching
for them for each SSTable using SSTable.componentsFor()?
> Here's the stack trace for the CompactionExecutor:14 thread that appears to be occupying
the majority of the cpu time on this node:
> Name: CompactionExecutor:14
> State: RUNNABLE
> Total blocked: 3  Total waited: 1,610,714
> Stack trace: 
>  java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
> java.io.UnixFileSystem.getBooleanAttributes(Unknown Source)
> java.io.File.isDirectory(Unknown Source)
> org.apache.cassandra.io.sstable.SSTable$3.accept(SSTable.java:204)
> java.io.File.listFiles(Unknown Source)
> org.apache.cassandra.io.sstable.SSTable.componentsFor(SSTable.java:200)
> org.apache.cassandra.io.sstable.SSTableWriter.cleanupIfNecessary(SSTableWriter.java:289)
> org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:189)
> org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:57)
> org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:134)
> org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:114)
> java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
> java.util.concurrent.FutureTask.run(Unknown Source)
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> java.lang.Thread.run(Unknown Source)
> No matter where I click in the busy Compaction thread timeline in YourKit it's in Running
state and showing this above trace, except for short periods of time where it's actually compacting
:)
> Thanks,
> Eric

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

Mime
View raw message