Can you create a ticket?

On Fri, Apr 30, 2010 at 4:55 PM, Joost Ouwerkerk wrote:
> There's a bug in ColumnFamilyRecordReader that appears when processing
> a single split. When the start and end tokens of the split are equal,
> duplicate rows can be returned.
>
> Example with 5 rows:
> token (start and end) = 53193025635115934196771903670925341736
>
> Tokens returned by first get_range_slices iteration:
>  16955237001963240173058271559858726497
>  40670782773005619916245995581909898190
>  99079589977253916124855502156832923443
>  144992942750327304334463589818972416113
>  166860289390734216023086131251507064403
>
> Tokens returned by next iteration (first token is last token from
> previous, end token is unchanged)
>  16955237001963240173058271559858726497
>  40670782773005619916245995581909898190
>
> Tokens returned by final iteration (first token is last token from
> previous, end token is unchanged)
>  [] (empty)
>
> In this example, the mapper has processed 7 rows in total, 2 of which
> were duplicates.
>
> Joost.
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
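
For the ticket, the iteration pattern Joost describes can be reproduced with a small stand-alone model. This is only a sketch, not Cassandra's code: `get_range_slices` here is a hypothetical stand-in that assumes a range (start, end] wraps around the ring when start >= end and that rows come back in ascending token order, which is what the token lists in the report suggest. `read_split` and the `count` parameter are likewise illustrative names.

```python
# Simplified model of the reported behaviour; NOT Cassandra's actual code.
# Tokens and the split token are copied from the bug report.
RING = [
    16955237001963240173058271559858726497,
    40670782773005619916245995581909898190,
    99079589977253916124855502156832923443,
    144992942750327304334463589818972416113,
    166860289390734216023086131251507064403,
]
SPLIT_TOKEN = 53193025635115934196771903670925341736


def get_range_slices(start, end, count=100):
    """Hypothetical stand-in: tokens in (start, end], wrapping when start >= end."""
    if start < end:
        hits = [t for t in RING if start < t <= end]
    else:
        # Wrapping range; start == end therefore covers the whole ring.
        hits = [t for t in RING if t > start or t <= end]
    # Assumption: results are returned in ascending token order.
    return sorted(hits)[:count]


def read_split(start, end):
    """Page through the split the way the report describes: the next start
    token is the last token of the previous batch, the end token is unchanged."""
    seen = []
    while True:
        batch = get_range_slices(start, end)
        if not batch:
            break
        seen.extend(batch)
        start = batch[-1]
    return seen


rows = read_split(SPLIT_TOKEN, SPLIT_TOKEN)
duplicates = len(rows) - len(set(rows))
print(len(rows), duplicates)  # prints "7 2"
```

Under these assumptions the second call asks for (166860..., 53193...], which still wraps and so picks up the two tokens below the split token again. That matches the 7 rows and 2 duplicates in the report.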