cassandra-commits mailing list archives

From "Marcus Olsson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11390) Too big MerkleTrees allocated during repair
Date Tue, 22 Mar 2016 16:27:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206714#comment-15206714
] 

Marcus Olsson commented on CASSANDRA-11390:
-------------------------------------------

bq. I imagine this is what was always intended - perhaps we should open a new ticket to investigate
if we should increase it
It would probably be good to test whether it's a reasonable limit, but that is likely not a high
priority unless we see a lot of over-streaming with the current one.

bq. Note that we don't care about the ranges when we calculate this, so we have to assume
that gain within a range is the same as the total gain. Biggest problem is how to test this,
will try to figure something out.
If it gets too complex to test, it might not be worth having the compaction gain as part
of the calculation. It would most probably reduce the MerkleTrees' sizes, which is good unless
the compaction gain comes from data that is not part of the repair. Capping the MerkleTrees'
total size might be good enough on its own, since the only thing the duplicate partitions should
cause is unnecessarily high resolution, not the memory problems. Investigating whether there
is a benefit from using the compaction gain in the calculation could be a separate ticket.
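
As a rough illustration of the capping idea (a sketch only, not the patch under review): one
way to bound the total size is to give all trees in a validation a shared leaf budget and lower
the per-tree depth as the number of ranges grows. The class name, the method names and the
2^20 budget below are all hypothetical and chosen purely for the example.

```java
final class TreeDepthSketch {
    // Hypothetical budget: at most 2^20 leaves across all trees of one validation.
    static final int MAX_TOTAL_DEPTH = 20;

    /** Depth cap per tree so that rangeCount trees together stay within the budget. */
    static int maxDepthPerTree(int rangeCount) {
        if (rangeCount <= 1)
            return MAX_TOTAL_DEPTH;
        // ceil(log2(rangeCount)) for rangeCount >= 2
        int bitsForRanges = 32 - Integer.numberOfLeadingZeros(rangeCount - 1);
        return Math.max(0, MAX_TOTAL_DEPTH - bitsForRanges);
    }

    /** Depth for one tree: enough for the estimated partitions, but never above the cap. */
    static int depthFor(long estimatedPartitions, int rangeCount) {
        // floor(log2(estimatedPartitions)), or 0 when there is at most one partition
        int wanted = estimatedPartitions <= 1
                   ? 0
                   : 63 - Long.numberOfLeadingZeros(estimatedPartitions);
        return Math.min(wanted, maxDepthPerTree(rangeCount));
    }

    public static void main(String[] args) {
        System.out.println(depthFor(1_000_000L, 1));   // prints 19
        System.out.println(depthFor(1_000_000L, 256)); // prints 12
    }
}
```

With 256 ranges each tree is limited to depth 12, so the trees together hold at most
256 * 2^12 = 2^20 leaves regardless of how many partitions the node has.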

---

For:
{code}
logger.trace("Created {} merkle trees, {} partitions, {} bytes", tree.size(), allPartitions,
MerkleTrees.serializer.serializedSize(tree, 0));
{code}
The {{MerkleTrees.size()}} method returns the sum of {{MerkleTree.size()}} over all the
MerkleTrees, and each of those calls returns {{2^d}}, so it is not the number of trees. To get
the number of merkle trees we could either create a new method in {{MerkleTrees}} (treeCount()?)
or use {{MerkleTrees.ranges().size()}}. It would probably be good to have both the number of
trees and the output from {{MerkleTrees.size()}} in the log output.
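
To make the difference concrete, here is a small self-contained sketch. The classes are
simplified stand-ins, not the real Cassandra {{MerkleTree}} / {{MerkleTrees}}; {{treeCount()}} is
the hypothetical method suggested above, and {{d}} is assumed to be the tree depth.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MerkleSizeSketch {
    /** Stand-in for a single tree: size() is 2^depth (assumption: d is the depth). */
    static final class Tree {
        final int depth;
        Tree(int depth) { this.depth = depth; }
        long size() { return 1L << depth; }
    }

    /** Stand-in for the per-range collection of trees. */
    static final class Trees {
        private final Map<String, Tree> byRange = new LinkedHashMap<>();

        void add(String range, int depth) { byRange.put(range, new Tree(depth)); }

        /** Combined size of all trees, not the number of trees. */
        long size() {
            long total = 0;
            for (Tree t : byRange.values())
                total += t.size();
            return total;
        }

        /** Hypothetical helper: the number of trees, one per range. */
        int treeCount() { return byRange.size(); }
    }

    public static void main(String[] args) {
        Trees trees = new Trees();
        trees.add("(0,100]", 10);
        trees.add("(100,200]", 10);
        trees.add("(200,300]", 10);
        // prints: Created 3 merkle trees (total size 3072)
        System.out.printf("Created %d merkle trees (total size %d)%n",
                          trees.treeCount(), trees.size());
    }
}
```

Logging only {{size()}} here would report 3072 "merkle trees" when there are in fact 3.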

Other than that LGTM. :)

> Too big MerkleTrees allocated during repair
> -------------------------------------------
>
>                 Key: CASSANDRA-11390
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11390
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 3.0.x, 3.x
>
>
> Since CASSANDRA-5220 we create one merkle tree per range, but each of those trees is
allocated to hold all the keys on the node, taking up too much memory



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
