Subject: Re: Thrift message length exceeded
From: Lanny Ripple
Date: Wed, 10 Apr 2013 18:29:28 -0500
To: user@cassandra.apache.org

We are using Astyanax in production but I cut back to just Hadoop and Cassandra to confirm it's a Cassandra (or our use of Cassandra) problem.

We do have some extremely large rows but we went from everything working with 1.1.5 to almost everything carping with 1.2.3. Something has changed. Perhaps we were doing something wrong earlier that 1.2.3 exposed but surprises are never welcome in production.

On Apr 10, 2013, at 8:10 AM, wrote:

> I also saw this when upgrading from C* 1.0 to 1.2.2, and from hector 0.6 to 0.8
> Turns out the Thrift message really was too long.
> The mystery to me: Why no complaints in previous versions? Were some checks added in Thrift or Hector?
>
> -----Original Message-----
> From: Lanny Ripple [mailto:lanny@spotright.com]
> Sent: Tuesday, April 09, 2013 6:17 PM
> To: user@cassandra.apache.org
> Subject: Thrift message length exceeded
>
> Hello,
>
> We have recently upgraded to Cass 1.2.3 from Cass 1.1.5. We ran sstableupgrades and got the ring on its feet and we are now seeing a new issue.
>
> When we run MapReduce jobs against practically any table we find the following errors:
>
> 2013-04-09 09:58:47,746 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2013-04-09 09:58:47,899 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
> 2013-04-09 09:58:48,021 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2013-04-09 09:58:48,024 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a48edb5
> 2013-04-09 09:58:50,475 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2013-04-09 09:58:50,477 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.RuntimeException: org.apache.thrift.TException: Message length exceeded: 106
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:444)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:460)
>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>     at org.apache.hadoop.mapred.Child.main(Child.java:260)
> Caused by: org.apache.thrift.TException: Message length exceeded: 106
>     at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
>     at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
>     at org.apache.cassandra.thrift.Column.read(Column.java:528)
>     at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
>     at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
>     ... 16 more
> 2013-04-09 09:58:50,481 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>
> The message length listed on each failed job differs (not always 106).
> Jobs that used to run fine now fail with code compiled against cass 1.2.3 (and work fine if compiled against 1.1.5 and run against the 1.2.3 servers in production). I'm using the following setup to configure the job:
>
>   def cassConfig(job: Job) {
>     val conf = job.getConfiguration()
>
>     ConfigHelper.setInputRpcPort(conf, "" + 9160)
>     ConfigHelper.setInputInitialAddress(conf, Config.hostip)
>
>     ConfigHelper.setInputPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner")
>     ConfigHelper.setInputColumnFamily(conf, Config.keyspace, Config.cfname)
>
>     val pred = {
>       val range = new SliceRange()
>         .setStart("".getBytes("UTF-8"))
>         .setFinish("".getBytes("UTF-8"))
>         .setReversed(false)
>         .setCount(4096 * 1000)
>
>       new SlicePredicate().setSlice_range(range)
>     }
>
>     ConfigHelper.setInputSlicePredicate(conf, pred)
>   }
>
> The job consists only of a mapper that increments counters for each row and associated columns so all I'm really doing is exercising ColumnFamilyRecordReader.
>
> Has anyone else seen this? Is there a workaround/fix to get our jobs running?
>
> Thanks