cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5722) Cleanup should skip sstables that don't contain data outside a nodes ranges
Date Tue, 16 Jul 2013 18:48:49 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710074#comment-13710074 ]

Jonathan Ellis commented on CASSANDRA-5722:
-------------------------------------------

bq. is the cost of decorating index keys so high that it outweighs the savings from exiting
the loop earlier when a greater key is found?

That's exactly what we want to do, and that's what the
{{indexDecoratedKey.compareTo(position) > 0}} check does for us.  The part I removed lets us
skip this check when we haven't found a key greater than the position by the time we finish
the block where such a key would exist, i.e., it saves us exactly one iteration of the loop
(since the first key of the next block is guaranteed to be greater).

I thought that fell into the category of premature optimization and took it out so it was
more clear what we're doing.  Did I miss something?
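
To make the trade-off concrete, here is a minimal sketch, not the actual patch or Cassandra code.
It assumes the index is a sorted list of keys split into fixed-size blocks, and that the scan
starts at the block whose first key is the last sampled key <= position, so the first key of the
next block is always greater.  The names IndexScanSketch, scanSimple, scanBounded and blockSize
are hypothetical.  Both methods return {index of first greater key or -1, comparisons made}.

```java
import java.util.Arrays;
import java.util.List;

class IndexScanSketch {
    // Simple version: compare every key from `start` and exit at the first one
    // greater than `position` (the compareTo(position) > 0 check).
    static int[] scanSimple(List<String> keys, int start, String position) {
        int comparisons = 0;
        for (int i = start; i < keys.size(); i++) {
            comparisons++;
            if (keys.get(i).compareTo(position) > 0)
                return new int[] { i, comparisons };
        }
        return new int[] { -1, comparisons };
    }

    // Bounded version: compare only inside the current block. If nothing in the
    // block is greater, return the first key of the next block without comparing
    // it, relying on the assumption that it must be greater.
    static int[] scanBounded(List<String> keys, int start, String position, int blockSize) {
        int comparisons = 0;
        int end = Math.min(start + blockSize, keys.size());
        for (int i = start; i < end; i++) {
            comparisons++;
            if (keys.get(i).compareTo(position) > 0)
                return new int[] { i, comparisons };
        }
        return new int[] { end < keys.size() ? end : -1, comparisons };
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f");
        // Blocks of 3: [a, b, c] and [d, e, f]; position "c" falls in the first block.
        System.out.println(Arrays.toString(scanSimple(keys, 0, "c")));
        System.out.println(Arrays.toString(scanBounded(keys, 0, "c", 3)));
    }
}
```

Under these assumptions both versions find the same key; the bounded one makes at most one
fewer comparison, and only when the position is at or beyond the last key of its block.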
                
> Cleanup should skip sstables that don't contain data outside a nodes ranges
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5722
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5722
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Nick Bailey
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.1
>
>         Attachments: 0001-Skip-cleanup-when-unneeded.patch
>
>
> Right now cleanup is optimized to simply delete sstables that *only* contain data that
> doesn't belong on the node; for all other sstables, though, it will read them, check each row,
> and write out new sstables.
> Cleanup could be optimized to look at an sstable and determine that all data within the
> sstable does belong on a node, and therefore skip re-writing that sstable. This would make
> cleanup essentially a noop in the case where all data on a node belongs on that node.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
