cassandra-commits mailing list archives

From "Schubert Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1041) Skip large size (Configurable) SSTable in minor or/and major compaction
Date Tue, 04 May 2010 01:49:56 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863611#action_12863611
] 

Schubert Zhang commented on CASSANDRA-1041:
-------------------------------------------

@Jonathan,
Yes, we understand that major compactions clean up the tombstones. We added this configuration
for the following reasons:
(1) In some applications, delete operations are absent or very rare.
(2) We still want to run manual major compactions on old data to reduce the number of SSTables (but
we do not need a single big one).
(3) On some filesystems, very large files are handled inefficiently.

For minor compactions, "minimumCompactionThreshold" and "maximumCompactionThreshold" only
count SSTables; they do not consider size. We want to use a size threshold to avoid too many minor
compactions in the background, which cost too much disk IO, CPU, and memory.

The two configuration options are optional for users.

@Stu,
Yes, I also prefer creating multiple SSTables (rather than one) according to a size threshold
(in another project of ours, this works well). But I think "disjoint key ranges" are not really
necessary in Cassandra.
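
A minimal sketch of writing compaction output as several size-bounded SSTables instead of one, assuming the merged rows arrive as (key, size) pairs in key order. This is an illustration only, not code from the attached patches; `split_by_size`, `Row`, and `max_bytes` are hypothetical names.

```python
from typing import Iterable, List, Tuple

# Hypothetical row shape: (row key, serialized size in bytes).
Row = Tuple[str, int]


def split_by_size(rows: Iterable[Row], max_bytes: int) -> List[List[Row]]:
    """Group an already merged row stream into size-bounded output chunks.

    A chunk is closed when adding the next row would push it past max_bytes.
    A single row larger than max_bytes still gets a chunk of its own.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")

    chunks: List[List[Row]] = []
    current: List[Row] = []
    current_bytes = 0

    for key, size in rows:
        # Close the current chunk before it would exceed the threshold.
        if current and current_bytes + size > max_bytes:
            chunks.append(current)
            current, current_bytes = [], 0
        current.append((key, size))
        current_bytes += size

    # Flush the last, possibly partial, chunk.
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk stands for one output SSTable. The threshold is a soft limit here: an oversized row is never split across chunks.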

> Skip large size (Configurable) SSTable in minor or/and major compaction
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-1041
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1041
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Schubert Zhang
>            Priority: Minor
>         Attachments: CASSANDRA-1041-0.6.1.patch, CASSANDRA-1041-0.6.patch
>
>
> When the SSTable files are large enough, such as 100GB, the cost of compaction (both minor
and major) is high (disk IO, CPU, memory, etc.).
> In some applications, we accept not compacting all SSTables into the final very large ones.

> This feature provides two optional configurable attributes, MinorCompactSkipInGB and MajorCompactSkipInGB,
for each ColumnFamily.
> The optional MinorCompactSkipInGB attribute specifies the maximum size of SSTables that
will be compacted in a minor compaction. SSTables larger than MinorCompactSkipInGB will
be skipped. The optional MajorCompactSkipInGB attribute does the same for major compaction.
> The default for these attributes is 0, which means do not skip, the same behavior as the current 0.6.1.
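
The skip rule described in the issue can be sketched as a simple filter over SSTable sizes. This is a hedged illustration of the stated semantics (0 disables skipping; SSTables strictly larger than the limit are left out), not the logic of the attached patches; `sstables_to_compact` and its parameters are hypothetical names.

```python
from typing import List, Sequence


def sstables_to_compact(sizes_gb: Sequence[float], skip_in_gb: float = 0) -> List[float]:
    """Return the sizes of the SSTables that remain eligible for compaction.

    skip_in_gb plays the role of MinorCompactSkipInGB or MajorCompactSkipInGB:
    0 means "do not skip anything"; otherwise every SSTable larger than
    the limit is excluded from the compaction.
    """
    if skip_in_gb < 0:
        raise ValueError("skip_in_gb must be 0 (disabled) or positive")
    if skip_in_gb == 0:
        # Default: behave as if the attribute did not exist.
        return list(sizes_gb)
    return [size for size in sizes_gb if size <= skip_in_gb]


if __name__ == "__main__":
    sizes = [1, 5, 100, 120]
    print(sstables_to_compact(sizes))                 # limit disabled
    print(sstables_to_compact(sizes, skip_in_gb=50))  # large files skipped
```

In a real compaction path, the count-based thresholds would then be applied to the SSTables that remain after this filter.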

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

