cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-1610) Pluggable Compaction
Date Fri, 13 May 2011 02:09:47 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032796#comment-13032796 ]

Stu Hood commented on CASSANDRA-1610:
-------------------------------------

* Have the AbstractCompactionStrategy class return the default strategy for use in CFMetaData
* createCompactionStrategyInstance should use FBUtilities.construct(class, readable)
* Unnecessary method renames in CFMetaData
* CompactionStrategy instantiation in DatabaseDescriptor duplicates the instantiation in CFMetaData:
see what could be put into FBUtilities
* Whitespace changes in db.ColumnFamily
* Unnecessary ByteBufferUtil import in ColumnIndexer
* Are you sure we can remove the major compaction file size threshold?
* Need to special case the 'expired' directory in SSTable.tryComponentFromFilename
* handleInsufficientSpaceForCompaction should move inside getBuckets (as mentioned in your
TODOs): it would be best if the strategy logged at info/warn for files that don't fit into
a bucket that matches the parameters
* re: the TODO in doExpireCompaction: For correctness' sake, we'll need to invalidate row
cache entries that match the expired files, but I would be fine doing that in a separate ticket,
because it'll be a little bit involved
* Try to remove TODOs that are speculative: if there are tasks that are blockers for this
ticket, list them here. If they aren't blockers for this ticket, but are worthy tasks, they
should be moved into tickets before this is committed
* Please parse the options for TimestampBucketedCompactionStrategy in the constructor
* One or two comments explaining the bucketing strategy for TimestampBucketed.getBuckets would
be helpful
* Methods that are public only for testing should be package-private (cf. getBuckets)
* Seconds would make a better unit for expiration than days
* See if you can find a way to remove some of the duplication between selectFor(Minor|Major)
* The AbstractCompactedRow sstableStats reference should move into SSTableWriter... to collect
information about a row as it is appended to the writer, you'll probably want to pass it to
AbstractCompactedRow.write(file, <stats>). There is an example approach on 2319
* useOldStatsFile should be descriptive
* Rename SSTableStats to SSTableMetadata
* Unnecessary imports in ByteBufferUtil
* Regarding the disabled test in DatabaseDescriptorTest: it's probably because Maps of CharSequence
will not equal one another if one contains UTF8s and the other contains Strings: see if we
have another round trip test, and then consider removing that one. It's not the first time
it's come up
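
To illustrate the last point, here is a minimal sketch of why such a round trip test can fail. It is not taken from the patch: Utf8Like is a hypothetical stand-in for a UTF-8 wrapper type (for example Avro's Utf8), normalize is an illustrative helper, and the "min_threshold" key is just sample data. The CharSequence interface does not define equals or hashCode across implementations, so a String never equals a wrapper with the same characters.

```java
import java.util.HashMap;
import java.util.Map;

public class CharSequenceMapEquality {
    /** Hypothetical stand-in for a UTF-8 string wrapper such as Avro's Utf8. */
    static final class Utf8Like implements CharSequence {
        private final String value;

        Utf8Like(String value) { this.value = value; }

        @Override public int length() { return value.length(); }
        @Override public char charAt(int index) { return value.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return value.subSequence(start, end);
        }
        @Override public String toString() { return value; }

        // Equal only to other Utf8Like instances, never to a String.
        @Override public boolean equals(Object o) {
            return o instanceof Utf8Like && ((Utf8Like) o).value.equals(value);
        }
        @Override public int hashCode() { return value.hashCode(); }
    }

    /** Illustrative helper: copies a map, converting every key and value to String. */
    static Map<String, String> normalize(Map<? extends CharSequence, ? extends CharSequence> in) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<? extends CharSequence, ? extends CharSequence> e : in.entrySet())
            out.put(e.getKey().toString(), e.getValue().toString());
        return out;
    }

    public static void main(String[] args) {
        Map<CharSequence, CharSequence> strings = new HashMap<>();
        strings.put("min_threshold", "4");

        Map<CharSequence, CharSequence> utf8s = new HashMap<>();
        utf8s.put(new Utf8Like("min_threshold"), new Utf8Like("4"));

        // Same characters, different CharSequence implementations.
        System.out.println(strings.equals(utf8s));                       // false
        System.out.println(normalize(strings).equals(normalize(utf8s))); // true
    }
}
```

If the test is kept rather than removed, comparing normalized copies like this is one possible way to make it independent of which CharSequence implementation deserialization produces.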

Awesome work Alan!

> Pluggable Compaction
> --------------------
>
>                 Key: CASSANDRA-1610
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1610
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chris Goffinet
>            Assignee: Alan Liang
>            Priority: Minor
>              Labels: compaction
>             Fix For: 1.0
>
>         Attachments: 0001-move-compaction-code-into-own-package.patch, 0002-Pluggable-Compaction-and-Expiration.patch
>
>
> In CASSANDRA-1608, I proposed some changes on how compaction works. I think it also makes
> sense to allow the ability to have pluggable compaction per CF. There could be many types
> of workloads where this makes sense. One example we had at Digg was to completely throw away
> certain SSTables after N days.
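
The expiration use case described here, combined with the review suggestions to parse options in the constructor and to express expiration in seconds, could be sketched roughly as follows. Everything in this block is an assumption made for illustration: the class name ExpirationSketch, the option key "expiration_seconds", and the representation of SSTables as a map from file name to newest write time in seconds. None of it reflects the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ExpirationSketch {
    /** Hypothetical option name; the patch may use a different key. */
    static final String EXPIRATION_KEY = "expiration_seconds";

    private final long expirationSeconds;

    /** Parses and validates options once, up front, so bad config fails early. */
    ExpirationSketch(Map<String, String> options) {
        String raw = options.get(EXPIRATION_KEY);
        if (raw == null)
            throw new IllegalArgumentException("missing option: " + EXPIRATION_KEY);
        long parsed = Long.parseLong(raw); // throws NumberFormatException on bad input
        if (parsed <= 0)
            throw new IllegalArgumentException(EXPIRATION_KEY + " must be positive");
        this.expirationSeconds = parsed;
    }

    /**
     * Returns the names of files whose newest write is older than the
     * expiration window, measured from nowSeconds.
     */
    List<String> selectExpired(Map<String, Long> maxTimestampSecondsByFile, long nowSeconds) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : maxTimestampSecondsByFile.entrySet())
            if (nowSeconds - e.getValue() > expirationSeconds)
                expired.add(e.getKey());
        return expired;
    }
}
```

With "expiration_seconds" set to "86400", a file whose newest write is more than one day older than nowSeconds would be selected, while anything newer is left for normal compaction.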

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
