cassandra-commits mailing list archives

From "Yuki Morishita (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9792) Reduce Merkle tree serialized size
Date Tue, 14 Jul 2015 22:39:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627189#comment-14627189 ]

Yuki Morishita commented on CASSANDRA-9792:
-------------------------------------------

Thanks for the patch.
I think it makes sense to use {{byte}} instead of {{int}} for the hash length.
Since this changes the serialization format, I think it is better for it to go to trunk for the 3.0 release.

I pushed your change on top of trunk to https://github.com/yukim/cassandra/tree/9792.
CI will eventually pick it up for automated testing, and if it looks good, I will commit the change.
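
To illustrate why a narrower length prefix shrinks the output, here is a minimal sketch of the two encodings. It is not Cassandra's MerkleTree serializer: the function names are hypothetical, and the 32-byte hash is only an illustrative assumption. The point is that a 1-byte length prefix saves 3 bytes per serialized hash compared with a 4-byte one, as long as the hash length fits in a signed byte.

```python
import struct

# Hypothetical helpers for illustration only; not Cassandra's actual code.

def serialize_hash_int_prefix(hash_bytes: bytes) -> bytes:
    """Write the hash length as a 4-byte big-endian signed int, then the hash."""
    return struct.pack(">i", len(hash_bytes)) + hash_bytes

def serialize_hash_byte_prefix(hash_bytes: bytes) -> bytes:
    """Write the hash length as a single signed byte, then the hash."""
    if len(hash_bytes) > 127:
        # A signed byte (like Java's byte) cannot hold a larger length.
        raise ValueError("hash too long for a 1-byte length prefix")
    return struct.pack(">b", len(hash_bytes)) + hash_bytes

def deserialize_hash_byte_prefix(buf: bytes) -> bytes:
    """Read back a hash written by serialize_hash_byte_prefix."""
    (length,) = struct.unpack_from(">b", buf, 0)
    return buf[1:1 + length]

if __name__ == "__main__":
    example_hash = bytes(32)  # assumed 32-byte hash, purely illustrative
    with_int = serialize_hash_int_prefix(example_hash)
    with_byte = serialize_hash_byte_prefix(example_hash)
    print(len(with_int), len(with_byte), len(with_int) - len(with_byte))
```

Because the on-wire layout differs, nodes running the old and new encodings would not be able to read each other's trees, which is why the format change is tied to a major release.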

> Reduce Merkle tree serialized size
> ----------------------------------
>
>                 Key: CASSANDRA-9792
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9792
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Bharatendra Boddu
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: MerkleTree.java.patch
>
>
> This patch reduces the serialized size of a Merkle tree by about 10%. With num_tokens set to 256, a 10% reduction in the Merkle tree serialized size for each token range repair reduces the network bandwidth used during repair.
> The table below lists the serialized sizes (in bytes) of Merkle trees of different depths before and after the patch.
> The serialized size of a Merkle tree of a given depth does not depend on the number of keys it represents.
> | Depth | Before patch | After patch |  Diff |
> |-------+--------------+-------------+-------|
> |     5 |         2060 |        1840 |   220 |
> |     6 |         4044 |        3600 |   444 |
> |     7 |         8012 |        7120 |   892 |
> |     8 |        15948 |       14160 |  1788 |
> |     9 |        31820 |       28240 |  3580 |
> |    10 |        63564 |       56400 |  7164 |
> |    11 |       127052 |      112720 | 14332 |
> |    12 |       254028 |      225360 | 28668 |
> |    13 |       507980 |      450640 | 57340 |
> A Merkle tree of depth 15 has a serialized size of ~2MB, and this patch reduces that by ~200KB. Repairing 256 token ranges will save ~50MB in transfer.
> Also, if the token serialize() method used a byte type to represent the token size, the serialized size could be reduced by 30 to 40%.
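
The figures in the quoted table can be checked with a short calculation. The sketch below computes the per-depth reduction and extrapolates to depth 15. The extrapolation formulas are an assumption: they are a linear fit in 2**depth that happens to reproduce every row of the table, not something derived from the Cassandra source.

```python
# Serialized sizes in bytes, copied from the table: depth -> (before, after).
SIZES = {
    5: (2060, 1840),
    6: (4044, 3600),
    7: (8012, 7120),
    8: (15948, 14160),
    9: (31820, 28240),
    10: (63564, 56400),
    11: (127052, 112720),
    12: (254028, 225360),
    13: (507980, 450640),
}

def reduction_percent(before: int, after: int) -> float:
    """Relative size reduction in percent."""
    return 100.0 * (before - after) / before

def extrapolate(depth: int) -> tuple:
    """Assumed linear fit in 2**depth that matches the table rows exactly."""
    before = 62 * 2**depth + 76
    after = 55 * 2**depth + 80
    return before, after

if __name__ == "__main__":
    for depth, (before, after) in sorted(SIZES.items()):
        pct = reduction_percent(before, after)
        print(f"depth {depth:2d}: saves {before - after:6d} bytes ({pct:.1f}%)")

    before15, after15 = extrapolate(15)
    saved = before15 - after15
    print(f"depth 15 (extrapolated): {before15} -> {after15}, saves {saved} bytes")
    print(f"256 token ranges: saves about {256 * saved / 1e6:.1f} MB")
```

Under this fit, the reduction is between roughly 10.7% and 11.3% across the tabulated depths, and the depth-15 numbers land in the same range as the ~2MB, ~200KB, and ~50MB figures quoted above.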



