From: aaron morton <aaron@thelastpickle.com>
Subject: Re: inconsistent hadoop/cassandra results
Date: Fri, 11 Jan 2013 16:45:44 +1300
To: user@cassandra.apache.org

> But this is the first time I've tried to use the wide-row support, which makes me
> a little suspicious. The wide-row support is not very well documented, so maybe
> I'm doing something wrong there in ignorance.

This was the area I was thinking about.

Can you drill in and see a pattern? Are the differences in rows that would be paged by wide rows? Could it be an off-by-one error in the wide-row paging?

It all sounds strange. So I would make sure what your job is outputting matches what it is reading from C*. Maybe add some logging in there.
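Something along these lines on the reduce side, say (a rough sketch only, I'm guessing at your class name and key/value types, so adjust to match your actual job):

    import java.io.IOException;

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // Hypothetical reducer shape: sums the per-row column counts the mappers emit
    // and logs how many map outputs contributed to each row key. If the wide-row
    // paging were dropping or double-counting pages, the chunk count and the total
    // would drift between runs for the same key.
    public class ColumnCountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

        private static final Log LOG = LogFactory.getLog(ColumnCountReducer.class);

        @Override
        protected void reduce(Text rowKey, Iterable<LongWritable> partialCounts, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            int chunks = 0; // number of map outputs that contributed to this row
            for (LongWritable count : partialCounts) {
                total += count.get();
                chunks++;
            }
            LOG.info("row=" + rowKey + " chunks=" + chunks + " total=" + total);
            context.write(rowKey, new LongWritable(total));
        }
    }

A similar log line on the map side (row key plus the number of columns the mapper actually saw) would tell you whether the drift happens while reading from C* or later in the job.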
Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 10/01/2013, at 1:24 AM, Brian Jeltema <brian.jeltema@digitalenvoy.net> wrote:

> Sorry if this is a duplicate - I was having mailer problems last night:
>
>> Assuming there were no further writes, running repair or using CL ALL should have fixed it.
>>
>> Can you describe the inconsistency between runs?
>
> Sure. The job output is generated by a single reducer and consists of a list of
> key/value pairs where the key is the row key of the original table, and the value is
> the total count of all columns in the row. Each run produces a file with a different
> size, and running a diff against various output file pairs shows rows that only
> appear in one file, or rows with the same key but different counts.
>
> What seems particularly hard to explain is the behavior after setting CL to ALL,
> where the results eventually become reproducible (making it hard to place the
> blame on my trivial mapper/reducer implementations) but only after about half a
> dozen runs. And once reaching this state, setting CL to QUORUM results in
> additional inconsistent results.
>
> I can say with certainty that there were no other writes. I'm the sole developer working
> with the CF in question. I haven't seen behavior like this before, though I don't have
> a tremendous amount of experience. But this is the first time I've tried to use the
> wide-row support, which makes me a little suspicious. The wide-row support is not
> very well documented, so maybe I'm doing something wrong there in ignorance.
>
> Brian
>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 8/01/2013, at 2:16 AM, Brian Jeltema <brian.jeltema@digitalenvoy.net> wrote:
>>
>>> I need some help understanding unexpected behavior I saw in some recent experiments with Cassandra 1.1.5 and Hadoop 1.0.3:
>>>
>>> I've written a small map/reduce job that simply counts the number of columns in each row of a static CF (call it Foo)
>>> and generates a list of every row and column count. A relatively small fraction of the rows have a large number
>>> of columns; the worst case is approximately 36 million. So when I set up the job, I used wide-row support:
>>>
>>>     ConfigHelper.setInputColumnFamily(job.getConfiguration(), "fooKS", "Foo", WIDE_ROWS); // where WIDE_ROWS == true
>>>
>>> When I ran this job using the default CL (1) I noticed that the results varied from run to run, which I attributed to
>>> inconsistent replicas, since Foo was generated with CL == 1 and RF == 3.
>>>
>>> So I ran repair for that CF on every node. The Cassandra log on every node contains lines similar to:
>>>
>>>     INFO [AntiEntropyStage:1] 2013-01-05 20:38:48,605 AntiEntropyService.java (line 778) [repair #e4a1d7f0-579d-11e2-0000-d64e0a75e6df] Foo is fully synced
>>>
>>> However, repeated runs were still inconsistent. Then I set CL to ALL, which I presumed would always result in identical
>>> output, but repeated runs initially continued to be inconsistent. However, I noticed that the results seemed to
>>> be converging, and after several runs (somewhere between 4 and 6) I finally was producing identical results on every run.
>>> Then I set CL to QUORUM, and again generated inconsistent results.
>>>
>>> Does this behavior make sense?
>>>
>>> Brian