cassandra-commits mailing list archives

From "Matthew F. Dennis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4894) log number of combined/merged rows during a compaction
Date Wed, 19 Dec 2012 10:09:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535835#comment-13535835 ]

Matthew F. Dennis commented on CASSANDRA-4894:
----------------------------------------------

I like the original bracketed format better.  The verbose option is hard to parse at a glance.
Likewise, dropping the zeros would make the counts harder to compare quickly.  The
bracketed syntax is easy to grep/script/compare/parse/etc.

Maybe call it "compaction merge counts" instead of "merged row stats", though, so that if we add
more stats later we don't have to come up with a different name.  Then make sure the docs
are indexable for "compaction merge counts".  The docs also need to be clear that counts[0]
is rows that were just copied, counts[1] is rows merged from two source rows, and so on.
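
To illustrate the point about keeping zeros, here is a rough sketch. The exact log format is not given in this message, so the bracketed layout and the helper names `format_counts` and `parse_counts` are assumptions made for illustration only, and the sketch is in Python rather than Cassandra's Java.

```python
def format_counts(counts):
    """Render counts positionally, zeros included, e.g. [5, 0, 2]."""
    return "[" + ", ".join(str(c) for c in counts) + "]"


def parse_counts(text):
    """Parse the bracketed form back into a list of ints."""
    body = text.strip()[1:-1].strip()
    return [int(part) for part in body.split(",")] if body else []


# Two hypothetical compactions over three input files each.
first = format_counts([5, 0, 2])
second = format_counts([4, 1, 0])

# Because zeros are kept, position i always means "i + 1 rows combined",
# so two lines can be compared element by element.
deltas = [b - a for a, b in zip(parse_counts(first), parse_counts(second))]
```

If the zeros were dropped, the positions would no longer line up between the two lines and each entry would need an explicit label before it could be compared.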
                
> log number of combined/merged rows during a compaction
> ------------------------------------------------------
>
>                 Key: CASSANDRA-4894
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4894
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Matthew F. Dennis
>            Assignee: Yuki Morishita
>            Priority: Minor
>             Fix For: 1.2.1
>
>         Attachments: 4894-1.2.txt
>
>
> We already log some details about compactions, but it would be useful to know how many
> rows were merged (resulting in "useful" work) and how many were unique (representing "wasted
> work").
> The simple approach requires two additional counters (one for unique rows, one for merged
> rows).  As the merge join progresses, if two or more rows are combined, tick the merged
> counter.  If a row is simply copied, tick the unique counter.
> A more complete solution would be to keep a separate count for each number of merges.
> This would require number_of_files_being_merged counters.  If no rows were merged, tick counters[0];
> if two rows were merged, tick counters[1]; and so on.
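
The counting scheme described above can be sketched roughly as follows. This is not the attached patch: it is a minimal Python model that assumes each input file is a sorted list of unique row keys, and the names `merge_counts` and `files` are hypothetical.

```python
import heapq
from itertools import groupby


def merge_counts(sorted_inputs):
    """Merge sorted key lists and tally how many inputs supplied each key.

    counts[0] = keys present in one input (simply copied),
    counts[1] = keys merged from two inputs, and so on.
    """
    counts = [0] * len(sorted_inputs)
    # heapq.merge yields keys from all inputs in sorted order, so equal
    # keys from different inputs arrive adjacently and groupby can count them.
    for _key, group in groupby(heapq.merge(*sorted_inputs)):
        sources = sum(1 for _ in group)
        counts[sources - 1] += 1
    return counts


# Three hypothetical input files, each a sorted list of unique row keys.
files = [["a", "b", "d"], ["b", "c", "d"], ["d", "e"]]
counts = merge_counts(files)

# The simple two-counter approach falls out of the per-count tallies.
unique = counts[0]
merged = sum(counts[1:])
```

Keeping one counter per number of merges costs only as many integers as there are input files, and the two-counter summary can always be derived from it afterwards.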

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
