cassandra-user mailing list archives

From Brian Jeltema <>
Subject Re: inconsistent hadoop/cassandra results
Date Wed, 09 Jan 2013 12:24:16 GMT
Sorry if this is a duplicate - I was having mailer problems last night:

> Assuming there were no further writes, running repair or using CL ALL should have fixed it.
> Can you describe the inconsistency between runs?

Sure. The job output is generated by a single reducer and consists of a list of
key/value pairs where the key is the row key of the original table, and the value is
the total count of all columns in the row. Each run produces a file with a different
size, and running a diff against various output file pairs displays rows that only
appear in one file, or rows with the same key but different counts. 
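To make the comparison concrete, here is a minimal sketch of the check that the diff performs, written against in-memory maps rather than the actual output files. The class name `RunDiff`, the method `diff`, and the sample row keys are hypothetical and only for illustration; the real job output would first have to be parsed into key/count pairs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares the row-key -> column-count output of two runs.
final class RunDiff {

    // Returns one line per discrepancy, ordered by row key.
    static List<String> diff(Map<String, Long> runA, Map<String, Long> runB) {
        // Union of all row keys seen in either run, sorted for stable output.
        Set<String> keys = new TreeSet<>(runA.keySet());
        keys.addAll(runB.keySet());

        List<String> out = new ArrayList<>();
        for (String key : keys) {
            Long a = runA.get(key);
            Long b = runB.get(key);
            if (a == null) {
                out.add("only in B: " + key + " (" + b + ")");
            } else if (b == null) {
                out.add("only in A: " + key + " (" + a + ")");
            } else if (!a.equals(b)) {
                out.add("count differs: " + key + " (" + a + " vs " + b + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from the real job.
        Map<String, Long> runA = new HashMap<>();
        runA.put("row1", 10L);
        runA.put("row2", 5L);
        Map<String, Long> runB = new HashMap<>();
        runB.put("row1", 10L);
        runB.put("row2", 6L);
        runB.put("row4", 1L);

        for (String line : diff(runA, runB)) {
            System.out.println(line);
        }
    }
}
```

Both kinds of discrepancy described above (rows present in only one file, and rows with the same key but different counts) show up as separate line types, which makes it easier to see whether whole rows or individual columns are going missing.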

What seems particularly hard to explain is the behavior after setting CL to ALL,
where the results eventually become reproducible (making it hard to place the
blame on my trivial mapper/reducer implementations) but only after about half a
dozen runs. And once this state is reached, setting CL to QUORUM again produces
inconsistent results.
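For reference, the expectation that CL ALL should be deterministic follows from the usual replica-overlap rule: a read is guaranteed to include a replica that took the write only when read replicas + write replicas > RF. The sketch below just evaluates that arithmetic for RF = 3 and writes at CL ONE, as in this setup. The class and method names are hypothetical and it does not touch Cassandra at all.

```java
// Hypothetical helper: evaluates the replica-overlap rule R + W > RF.
final class ReplicaOverlap {

    // Number of replicas that make up a quorum for a given replication factor.
    static int quorum(int rf) {
        return rf / 2 + 1;
    }

    // True when every read set must intersect every write set.
    static boolean readOverlapsWrite(int readReplicas, int writeReplicas, int rf) {
        return readReplicas + writeReplicas > rf;
    }

    public static void main(String[] args) {
        int rf = 3;         // replication factor of the keyspace
        int written = 1;    // data was written at CL ONE

        System.out.println("read ONE:    " + readOverlapsWrite(1, written, rf));
        System.out.println("read QUORUM: " + readOverlapsWrite(quorum(rf), written, rf));
        System.out.println("read ALL:    " + readOverlapsWrite(rf, written, rf));
    }
}
```

By this rule, only reads at ALL are guaranteed to overlap data written at ONE with RF = 3, so unrepaired replicas could explain varying results at ONE or QUORUM. It does not explain results that vary at ALL, or at QUORUM after a successful repair, which is the puzzling part.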

I can say with certainty that there were no other writes. I'm the sole developer working
with the CF in question. I haven't seen behavior like this before, though I don't have
a tremendous amount of experience. But this is the first time I've tried to use the
wide-row support, which makes me a little suspicious. The wide-row support is not
very well documented, so maybe I'm doing something wrong there in ignorance.


> Cheers
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> @aaronmorton
> On 8/01/2013, at 2:16 AM, Brian Jeltema <> wrote:
>> I need some help understanding unexpected behavior I saw in some recent experiments
>> with Cassandra 1.1.5 and Hadoop 1.0.3:
>>
>> I've written a small map/reduce job that simply counts the number of columns in each
>> row of a static CF (call it Foo) and generates a list of every row and column count.
>> A relatively small fraction of the rows have a large number of columns; worst case is
>> approximately 36 million. So when I set up the job, I used wide-row support:
>>
>>     ConfigHelper.setInputColumnFamily(job.getConfiguration(), "fooKS", "Foo", WIDE_ROWS); // where WIDE_ROWS == true
>>
>> When I ran this job using the default CL (1) I noticed that the results varied from
>> run to run, which I attributed to inconsistent replicas, since Foo was generated with
>> CL == 1 and the RF == 3.
>>
>> So I ran repair for that CF on every node. The cassandra log on every node contains
>> lines similar to:
>>
>>   INFO [AntiEntropyStage:1] 2013-01-05 20:38:48,605 (line 778) [repair #e4a1d7f0-579d-11e2-0000-d64e0a75e6df] Foo is fully synced
>>
>> However, repeated runs were still inconsistent. Then I set CL to ALL, which I presumed
>> would always result in identical output, but repeated runs initially continued to be
>> inconsistent. However, I noticed that the results seemed to be converging, and after
>> several runs (somewhere between 4 and 6) I finally was producing identical results on
>> every run.
>>
>> Then I set CL to QUORUM, and again generated inconsistent results.
>>
>> Does this behavior make sense?
>>
>> Brian
