cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a "decent" sstable leveling
Date Wed, 26 Nov 2014 19:44:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226708#comment-14226708 ]

Marcus Eriksson commented on CASSANDRA-8301:
--------------------------------------------

Cool, what is your heuristic for finding the level?

I thought a bit about it and figured that we could probably estimate the level by ordering
sstables by the number of other sstables they overlap, then putting the ones that overlap
the most in the lowest levels.

I.e., an sstable in L1 is bound to overlap ~10 in L2, ~100 in L3, etc., meaning it would
overlap ~110 sstables if we only have 3 levels. An sstable in L2 would overlap ~10 in L3 and
only one in L1, total 11, and sstables in the top level would only overlap one in L2 and one
in L1. This assumes L0 was empty when bootstrapping, which is most often wrong, and I haven't
given much thought to how to fix that.
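
A minimal sketch of the idea above, not the tool's actual implementation. It assumes each sstable can be reduced to an inclusive [first, last] token range, and it takes the fanout of 10 from the "~10 in L2" figure; SSTable, expected_overlaps and order_by_overlap are hypothetical names used only for illustration.

```python
from typing import Dict, List, NamedTuple


class SSTable(NamedTuple):
    # Hypothetical stand-in for an sstable: a name plus the inclusive
    # token range it covers.
    name: str
    first: int
    last: int


def expected_overlaps(level: int, num_levels: int, fanout: int = 10) -> int:
    """Overlaps expected for one sstable in `level`, assuming L0 is empty."""
    # Each level further down holds roughly `fanout` times more sstables
    # over the same range.
    below = sum(fanout ** (j - level) for j in range(level + 1, num_levels + 1))
    # Each level further up contributes roughly one overlapping sstable.
    above = level - 1
    return below + above


def overlaps(a: SSTable, b: SSTable) -> bool:
    # Two inclusive ranges overlap unless one ends before the other starts.
    return a.first <= b.last and b.first <= a.last


def overlap_counts(sstables: List[SSTable]) -> Dict[str, int]:
    # Quadratic pairwise comparison; fine for a sketch.
    return {
        s.name: sum(1 for o in sstables if o is not s and overlaps(s, o))
        for s in sstables
    }


def order_by_overlap(sstables: List[SSTable]) -> List[SSTable]:
    # Most-overlapping first: these are the candidates for the lowest levels.
    counts = overlap_counts(sstables)
    return sorted(sstables, key=lambda s: counts[s.name], reverse=True)


tables = [
    SSTable("wide", 0, 100),
    SSTable("b", 0, 9),
    SSTable("c", 10, 19),
    SSTable("d", 20, 29),
]
counts = overlap_counts(tables)
ordered = order_by_overlap(tables)
```

With 3 levels, expected_overlaps gives 110, 11 and 2 for L1, L2 and L3, matching the numbers above. Turning an observed overlap count back into a level, and handling a non-empty L0, is left open here just as it is in the comment.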

> Create a tool that given a bunch of sstables creates a "decent" sstable leveling
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8301
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>
> In old versions of Cassandra (i.e. not trunk/3.0), when bootstrapping a new node, you
> will end up with a ton of files in L0, and it might be extremely painful to get LCS to
> compact them into a new leveling.
> We could probably exploit the fact that we have many non-overlapping sstables in L0,
> and offline-bump those sstables into higher levels. It does not need to be perfect, just
> get the majority of the data into L1+ without creating overlaps.
> So, the suggestion is to create an offline tool that looks at the range each sstable
> covers and tries to bump it as high as possible in the leveling.
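
One way the suggestion in the description could be sketched is a greedy pass: visit sstables in token order and put each one in the highest level where it overlaps nothing already placed, leaving it in L0 otherwise. This is an illustration under simplifying assumptions, not the tool itself: ranges are inclusive (name, first, last) tuples, per-level size targets are ignored, and assign_levels and max_level are hypothetical names.

```python
from typing import Dict, List, Tuple

Range = Tuple[str, int, int]  # (name, first token, last token), inclusive


def assign_levels(ranges: List[Range], max_level: int = 3) -> Dict[int, List[Range]]:
    """Greedily bump each sstable to the highest level without overlaps."""
    levels: Dict[int, List[Range]] = {lvl: [] for lvl in range(max_level + 1)}
    for name, first, last in sorted(ranges, key=lambda r: (r[1], r[2])):
        # Try the top level first and work down to L1.
        for lvl in range(max_level, 0, -1):
            # A level is usable if the new range is disjoint from every
            # range already placed there.
            if all(last < f or first > l for _, f, l in levels[lvl]):
                levels[lvl].append((name, first, last))
                break
        else:
            # No level in L1+ can take it without an overlap, so it stays in L0.
            levels[0].append((name, first, last))
    return levels


example = [("a", 0, 9), ("b", 10, 19), ("c", 5, 15), ("d", 0, 19)]
result = assign_levels(example, max_level=2)
```

In this example "a" and "b" land in L2, "d" in L1, and "c" stays in L0 because it overlaps something in both higher levels. A real tool would presumably also need to respect how much data each level is meant to hold.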



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
