cassandra-user mailing list archives

From Jonathan Ellis <>
Subject Re: Possible bug in Cassandra MapReduce
Date Sat, 19 Jun 2010 03:21:19 GMT
Fixed for 0.6.3:

On Fri, Jun 18, 2010 at 2:49 PM, Corey Hulen <> wrote:
> We are using MapReduce to periodically verify and rebuild our secondary
> indexes along with counting total records.  We started to notice double
> counting of unique keys on single-machine standalone tests. We were finally
> able to reproduce the problem using
> the apache-cassandra-0.6.2-src/contrib/word_count example and just
> re-running it multiple times.  We are hoping someone can verify the bug.
> Re-run the tests and the word count for /tmp/word_count3/part-r-00000 will
> be 1000 + ~200 and will change if you blow the data away and re-run.  Notice
> the setup script loops and only inserts 1000 records so we expect count to
> be 1000.  Once the data is generated then re-running the setup script and/or
> mapreduce doesn't change the number (still off).  The key is to blow all the
> data away and start over which will cause it to change.
> Can someone please verify this behavior?
> -Corey
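
For anyone reproducing the report, the check described above (comparing the total in the reducer output with the 1000 records the setup script inserts) could be sketched roughly as follows. This assumes the part file contains tab-separated `key<TAB>count` lines, which is Hadoop's default text output layout; the function names `total_count` and `check` and the default of 1000 are illustrative, not part of the word_count example.

```python
from pathlib import Path


def total_count(part_file):
    """Sum the counts in a reducer output file of `key<TAB>count` lines."""
    total = 0
    for line in Path(part_file).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        # The count is assumed to be the last tab-separated field.
        _key, _sep, count = line.rpartition("\t")
        total += int(count)
    return total


def check(part_file, expected=1000):
    """Return (actual total, surplus over the expected record count)."""
    actual = total_count(part_file)
    return actual, actual - expected
```

Running `check("/tmp/word_count3/part-r-00000")` after each fresh setup would show whether the surplus is nonzero and whether it changes between runs.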

Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
