cassandra-commits mailing list archives

From "Benjamin Lerer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10225) Make compression ratio much more accurate
Date Fri, 02 Oct 2015 20:14:29 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941681#comment-14941681 ]

Benjamin Lerer commented on CASSANDRA-10225:
--------------------------------------------

Computing the compression ratio by summing the {{compressedFileLength}} values and dividing
that by the sum of the {{dataLength}} values does not look like a bad approach to me, but it
seems that the data length might not always be the real length (according to a comment in
{{CompressionMetadata}}).
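
To make the difference concrete, here is a minimal sketch comparing the plain average of per-sstable ratios with the ratio of summed lengths. The {{SSTableSizes}} and {{CompressionRatios}} classes are hypothetical and written only for illustration; they are not Cassandra classes, and this is not the attached patch. The sketch also assumes that {{dataLength}} is the real uncompressed length and is greater than zero, which is exactly the point in question.

```java
import java.util.List;

// Hypothetical holder for the two per-sstable sizes; not a Cassandra class.
final class SSTableSizes {
    final long compressedFileLength; // bytes on disk
    final long dataLength;           // uncompressed bytes, assumed accurate and > 0

    SSTableSizes(long compressedFileLength, long dataLength) {
        this.compressedFileLength = compressedFileLength;
        this.dataLength = dataLength;
    }
}

final class CompressionRatios {
    // Unweighted average of per-sstable ratios: every sstable counts equally,
    // regardless of how much data it holds.
    static double meanOfRatios(List<SSTableSizes> sstables) {
        if (sstables.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (SSTableSizes s : sstables) {
            sum += (double) s.compressedFileLength / s.dataLength;
        }
        return sum / sstables.size();
    }

    // Sum of compressed lengths divided by sum of uncompressed lengths:
    // larger sstables contribute proportionally more.
    static double sizeWeightedRatio(List<SSTableSizes> sstables) {
        long compressed = 0;
        long uncompressed = 0;
        for (SSTableSizes s : sstables) {
            compressed += s.compressedFileLength;
            uncompressed += s.dataLength;
        }
        return uncompressed == 0 ? Double.NaN : (double) compressed / uncompressed;
    }
}
```

With one sstable of 10 compressed / 100 uncompressed bytes and another of 900 / 1000, the plain average is 0.5, while the size-weighted ratio is 910 / 1100, roughly 0.83. The second number reflects what is actually on disk, provided the lengths fed into it are correct.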

[~benedict] I am not too familiar with this part of the code. Is there a risk that computing
the compression ratio this way gives us a wrong result?

> Make compression ratio much more accurate
> -----------------------------------------
>
>                 Key: CASSANDRA-10225
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10225
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Jeremy Hanna
>            Assignee: Brett Snyder
>              Labels: lhf
>             Fix For: 2.1.x
>
>         Attachments: cassandra-2.1-10225.txt
>
>
> Currently in cfstats, it will take an average over the compression ratios of all of the
> sstables without regard to the data sizes.  This can lead to a very inaccurate value.  It
> would be good to factor in the uncompressed and compressed sizes for the sstables to give
> an accurate number.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
