incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Possible bug in Cassandra MapReduce
Date Sat, 19 Jun 2010 03:21:19 GMT
Fixed for 0.6.3: https://issues.apache.org/jira/browse/CASSANDRA-1042

On Fri, Jun 18, 2010 at 2:49 PM, Corey Hulen <cj@earnstone.com> wrote:
>
> We are using MapReduce to periodically verify and rebuild our secondary
> indexes, along with counting total records.  We started to notice double
> counting of unique keys in single-machine standalone tests. We were finally
> able to reproduce the problem using
> the apache-cassandra-0.6.2-src/contrib/word_count example and just
> re-running it multiple times.  We are hoping someone can verify the bug.
> Re-run the tests and the word count for /tmp/word_count3/part-r-00000 will
> be 1000 + ~200 and will change if you blow the data away and re-run.  Notice
> that the setup script loops and only inserts 1000 records, so we expect the
> count to be 1000.  Once the data is generated, re-running the setup script
> and/or mapreduce doesn't change the number (still off).  The key is to blow
> all the data away and start over, which will cause it to change.
> Can someone please verify this behavior?
> -Corey
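
For anyone re-running the reproduction above, the check on the reducer output can be scripted. The following is only a minimal sketch, not part of the word_count example: it assumes the output file uses Hadoop's default text layout of one `key<TAB>count` pair per line, and the helper names `total_count` and `check_output` are hypothetical.

```python
from pathlib import Path


def total_count(lines):
    """Sum the counts from lines shaped like 'key<TAB>count' (assumed layout)."""
    total = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on the last tab so that keys containing tabs still parse.
        _key, _sep, value = line.rpartition("\t")
        total += int(value)
    return total


def check_output(path, expected=1000):
    """Return (total, matches_expected) for one reducer output file.

    'expected' defaults to the 1000 records the setup script is said to insert.
    """
    with Path(path).open() as fh:
        total = total_count(fh)
    return total, total == expected
```

Calling `check_output("/tmp/word_count3/part-r-00000")` after each clean re-run would show whether the total stays at 1000 or drifts as described.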



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
