cassandra-commits mailing list archives

From "Jeremy Hanna (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (CASSANDRA-1042) ColumnFamilyRecordReader returns duplicate rows
Date Thu, 27 May 2010 16:38:44 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872280#action_12872280 ]

Jeremy Hanna edited comment on CASSANDRA-1042 at 5/27/10 12:38 PM:
-------------------------------------------------------------------

Also, I didn't remove StorageService.getStringEndpointMap in the 0.6 branch version because
CassandraServer.get_string_property still calls it.  get_string_property was removed on trunk
as part of CASSANDRA-965.

      was (Author: jeromatron):
    Also - I didn't remove StorageService.getStringEndpointMap in the 0.6 branch version because
CassandraServer.get_string_property still calls it.  It was removed on trunk as part of CASSANDRA-965
  
> ColumnFamilyRecordReader returns duplicate rows
> -----------------------------------------------
>
>                 Key: CASSANDRA-1042
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1042
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.6
>            Reporter: Joost Ouwerkerk
>            Assignee: Jonathan Ellis
>             Fix For: 0.6.3
>
>         Attachments: Cassandra-1042-0_6-branch.patch.txt, CASSANDRA-1042-trunk.patch.txt, cassandra.tar.gz
>
>
> There's a bug in ColumnFamilyRecordReader that appears when processing a single split
> (which happens in most tests that have a small number of rows), and potentially in other
> cases. When the start and end tokens of the split are equal, duplicate rows can be returned.
> Example with 5 rows:
> token (start and end) = 53193025635115934196771903670925341736
> Tokens returned by first get_range_slices iteration (all 5 rows):
>  16955237001963240173058271559858726497
>  40670782773005619916245995581909898190
>  99079589977253916124855502156832923443
>  144992942750327304334463589818972416113
>  166860289390734216023086131251507064403
> Tokens returned by next iteration (first token is last token from
> previous, end token is unchanged)
>  16955237001963240173058271559858726497
>  40670782773005619916245995581909898190
> Tokens returned by final iteration  (first token is last token from
> previous, end token is unchanged)
>  [] (empty)
> In this example, the mapper has processed 7 rows in total, 2 of which
> were duplicates.
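
The iteration described above can be reproduced with a small simulation. This is only
a sketch, not the ColumnFamilyRecordReader code: range_slice and read_split are
hypothetical names, and the range semantics (exclusive start, inclusive end, wrapping
around the ring when start >= end, results sorted by token) are assumptions chosen to
match the tokens listed in the report.

```python
# Tokens of the 5 rows from the report, in ascending order.
TOKENS = [
    16955237001963240173058271559858726497,
    40670782773005619916245995581909898190,
    99079589977253916124855502156832923443,
    144992942750327304334463589818972416113,
    166860289390734216023086131251507064403,
]
# Start and end token of the single split (they are equal).
SPLIT_TOKEN = 53193025635115934196771903670925341736


def range_slice(tokens, start, end, limit):
    """Hypothetical stand-in for get_range_slices over the range (start, end].

    Assumption: when start >= end the range wraps around the ring, so
    start == end covers every token.
    """
    if start < end:
        hits = [t for t in tokens if start < t <= end]
    else:
        hits = [t for t in tokens if t > start or t <= end]
    return sorted(hits)[:limit]


def read_split(tokens, start, end, batch_size):
    """Page through the split the way the report describes: the next start
    token is the last token of the previous batch, the end token is unchanged."""
    rows = []
    while True:
        batch = range_slice(tokens, start, end, batch_size)
        if not batch:
            break
        rows.extend(batch)
        start = batch[-1]
    return rows


rows = read_split(TOKENS, SPLIT_TOKEN, SPLIT_TOKEN, batch_size=5)
print(len(rows), len(set(rows)))  # 7 rows processed, 5 distinct
```

Under these assumptions the second iteration queries a wrapping range from the largest
token back to the split token, which re-selects the two tokens below the split token.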

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

