cassandra-commits mailing list archives

From Piotr Kołaczkowski (JIRA) <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-4803) CFRR progress broken for wide row iterators
Date Thu, 18 Oct 2012 13:52:03 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479005#comment-13479005 ]

Piotr Kołaczkowski edited comment on CASSANDRA-4803 at 10/18/12 1:50 PM:
-------------------------------------------------------------------------

I attach a list of patches affecting the operation of CFRR:

# Fix for an obvious counting bug in the wide row iterator, which was counting columns instead of rows.
# Several fixes in describe_splits:
   - fixed non-uniform splitting caused by integer math round-off errors
   - fixed insane behaviour when the number of splits was higher than the number of key samples
   - added the estimated size of the split to the result, and made use of it in CFIF
# A patch for the broken get_paged_slice; this is addressed in a separate ticket, but I had to include it in order to test my code.
# Fix for creating excessively small splits (and wrong progress reporting) due to range wrap-around.
# get_range_slices now allows (start_key, end_token), exactly the same as get_paged_slice.
# I tried to de-spaghettize the CFRR code a little. This also fixes a bug that accidentally slipped in with the previous patches.
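
To illustrate the describe_splits items in the list above, here is a minimal sketch of how split boundaries over a list of key samples could be computed without accumulating integer round-off, and without producing more splits than there are samples. This is not the code from the attached patches; the class name SplitSketch, the method splitBoundaries and its parameters are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the attached patches.
final class SplitSketch {
    /**
     * Returns the sample indices at which each split starts, followed by a
     * final end index equal to sampleCount.
     */
    static List<Integer> splitBoundaries(int sampleCount, int requestedSplits) {
        if (sampleCount <= 0 || requestedSplits <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        // Assumption: asking for more splits than samples is clamped,
        // so no split ends up empty.
        int splits = Math.min(requestedSplits, sampleCount);
        List<Integer> boundaries = new ArrayList<>();
        for (int i = 0; i <= splits; i++) {
            // Multiply before dividing (in long arithmetic) so the remainder
            // is spread over all splits instead of piling up in the last one.
            boundaries.add((int) ((long) i * sampleCount / splits));
        }
        return boundaries;
    }
}
```

With 10 samples and 4 requested splits, a fixed step of 10 / 4 = 2 would give split sizes 2, 2, 2 and 4, whereas the sketch gives 2, 3, 2 and 3.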
                
> CFRR progress broken for wide row iterators
> -------------------------------------------
>
>                 Key: CASSANDRA-4803
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4803
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.5
>            Reporter: Piotr Kołaczkowski
>            Assignee: Piotr Kołaczkowski
>         Attachments: 0001-Wide-row-iterator-counts-rows-not-columns.patch,
>                      0002-Fixed-bugs-in-describe_splits.-CFRR-uses-row-counts-.patch,
>                      0003-Fixed-get_paged_slice-memtable-and-sstable-column-it.patch,
>                      0004-Better-token-range-wrap-around-handling-in-CFIF-CFRR.patch,
>                      0005-Fixed-handling-of-start_key-end_token-in-get_range_s.patch,
>                      0006-Code-cleanup-refactoring-in-CFRR.-Fixed-bug-with-mis.patch
>
>
> {code}
>     public float getProgress()
>     {
>         // TODO this is totally broken for wide rows
>         // the progress is likely to be reported slightly off the actual but close enough
>         float progress = ((float) iter.rowsRead() / totalRowCount);
>         return progress > 1.0F ? 1.0F : progress;
>     }
> {code}
> The problem is that iter.rowsRead() does not return the number of rows read from the wide row iterator, but returns the number of *columns* (every row is counted multiple times).
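
The row-versus-column counting problem described above can be sketched as follows. This is only an illustration, not the CFRR code or the attached fix: the class WideRowProgressSketch and its onColumn method are hypothetical, and the sketch assumes that all columns of one row are returned consecutively by the iterator.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; the names here do not come from CFRR.
final class WideRowProgressSketch {
    private ByteBuffer lastKey;       // row key of the most recently seen column
    private long rowsRead;            // distinct rows seen so far
    private final long totalRowCount; // estimated number of rows in the split

    WideRowProgressSketch(long totalRowCount) {
        this.totalRowCount = totalRowCount;
    }

    /** Called once for every column returned by the wide row iterator. */
    void onColumn(ByteBuffer rowKey) {
        // Assumption: columns of a row arrive consecutively, so a change of
        // key marks the start of a new row. Counting here, rather than on
        // every column, keeps rowsRead a row count.
        if (lastKey == null || !lastKey.equals(rowKey)) {
            rowsRead++;
            lastKey = rowKey.duplicate();
        }
    }

    float getProgress() {
        if (totalRowCount <= 0) {
            return 0.0F;
        }
        float progress = (float) rowsRead / totalRowCount;
        return progress > 1.0F ? 1.0F : progress;
    }
}
```

Feeding three columns of one row and one column of a second row into a tracker created with totalRowCount = 4 counts two rows, where counting every column would have counted four.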

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
