incubator-cassandra-user mailing list archives

From Lanny Ripple <la...@spotright.com>
Subject Re: Thrift message length exceeded
Date Mon, 15 Apr 2013 22:17:38 GMT
A bump to say I found this

  http://stackoverflow.com/questions/15487540/pig-cassandra-message-length-exceeded

so others are seeing similar behavior.

From what I can see of org.apache.cassandra.hadoop, nothing has changed since 1.1.5, when we didn't see such things, but it sure looks like a bug has slipped in (or been uncovered) somewhere.  I'll try to narrow this down to a dataset and code that can reproduce it.
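
For reference, the reproduction will be roughly this shape (a sketch with hypothetical names: cassConfig is the helper quoted below in the thread, and CountingMapper stands in for the counting mapper described at the bottom):

import org.apache.cassandra.hadoop.ColumnFamilyInputFormat
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat

object Repro {
  def main(args: Array[String]) {
    val job = new Job(new Configuration(), "cfrr-repro")
    job.setJarByClass(getClass)
    job.setMapperClass(classOf[CountingMapper])     // hypothetical mapper, sketched below
    job.setNumReduceTasks(0)                        // map-only; we only read rows
    job.setInputFormatClass(classOf[ColumnFamilyInputFormat])
    job.setOutputFormatClass(classOf[NullOutputFormat[Any, Any]])
    cassConfig(job)                                 // the ConfigHelper setup quoted below
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}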

On Apr 10, 2013, at 6:29 PM, Lanny Ripple <lanny@spotright.com> wrote:

> We are using Astyanax in production but I cut back to just Hadoop and Cassandra to confirm it's a Cassandra (or our use of Cassandra) problem.
> 
> We do have some extremely large rows, but we went from everything working with 1.1.5 to almost everything carping with 1.2.3.  Something has changed.  Perhaps we were doing something wrong earlier that 1.2.3 exposed, but surprises are never welcome in production.
> 
> On Apr 10, 2013, at 8:10 AM, <moshe.kranc@barclays.com> wrote:
> 
>> I also saw this when upgrading from C* 1.0 to 1.2.2, and from Hector 0.6 to 0.8.
>> Turns out the Thrift message really was too long.
>> The mystery to me: why no complaints in previous versions? Were some checks added in Thrift or Hector?
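>>
>> For the curious, the throw comes from TBinaryProtocol's read-length check. In rough Scala paraphrase (not the actual libthrift source), the protocol is handed a byte budget and each binary read draws it down:
>>
>> import org.apache.thrift.TException
>>
>> // Sketch of TBinaryProtocol's guard: once one decoded message has consumed
>> // more bytes than the configured budget, it throws the "Message length
>> // exceeded" TException seen in the stack trace below.
>> class ReadLengthGuard(private var remaining: Int) {
>>   def checkReadLength(length: Int) {
>>     remaining -= length
>>     if (remaining < 0)
>>       throw new TException("Message length exceeded: " + length)
>>   }
>> }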
>> 
>> -----Original Message-----
>> From: Lanny Ripple [mailto:lanny@spotright.com] 
>> Sent: Tuesday, April 09, 2013 6:17 PM
>> To: user@cassandra.apache.org
>> Subject: Thrift message length exceeded
>> 
>> Hello,
>> 
>> We have recently upgraded to Cass 1.2.3 from Cass 1.1.5.  We ran nodetool upgradesstables, got the ring on its feet, and we are now seeing a new issue.
>> 
>> When we run MapReduce jobs against practically any table we find the following errors:
>> 
>> 2013-04-09 09:58:47,746 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>> 2013-04-09 09:58:47,899 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
>> 2013-04-09 09:58:48,021 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>> 2013-04-09 09:58:48,024 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a48edb5
>> 2013-04-09 09:58:50,475 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> 2013-04-09 09:58:50,477 WARN org.apache.hadoop.mapred.Child: Error running child
>> java.lang.RuntimeException: org.apache.thrift.TException: Message length exceeded: 106
>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
>> 	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>> 	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
>> 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:444)
>> 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:460)
>> 	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>> 	at java.security.AccessController.doPrivileged(Native Method)
>> 	at javax.security.auth.Subject.doAs(Subject.java:396)
>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>> 	at org.apache.hadoop.mapred.Child.main(Child.java:260)
>> Caused by: org.apache.thrift.TException: Message length exceeded: 106
>> 	at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
>> 	at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
>> 	at org.apache.cassandra.thrift.Column.read(Column.java:528)
>> 	at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
>> 	at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
>> 	at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
>> 	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>> 	at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
>> 	at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
>> 	... 16 more
>> 2013-04-09 09:58:50,481 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>> 
>> The message length listed on each failed job differs (it's not always 106).  Jobs that used to run fine now fail with code compiled against Cass 1.2.3 (and work fine if compiled against 1.1.5 and run against the 1.2.3 servers in production).  I'm using the following setup to configure the job:
>> 
>> import org.apache.cassandra.hadoop.ConfigHelper
>> import org.apache.cassandra.thrift.{SlicePredicate, SliceRange}
>> import org.apache.hadoop.mapreduce.Job
>>
>> def cassConfig(job: Job) {
>>   val conf = job.getConfiguration()
>>
>>   // Point the input format at the cluster's Thrift endpoint.
>>   ConfigHelper.setInputRpcPort(conf, "9160")
>>   ConfigHelper.setInputInitialAddress(conf, Config.hostip)
>>
>>   ConfigHelper.setInputPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner")
>>   ConfigHelper.setInputColumnFamily(conf, Config.keyspace, Config.cfname)
>>
>>   // Slice the full column range of every row, up to ~4M columns at a time.
>>   val pred = {
>>     val range = new SliceRange()
>>       .setStart("".getBytes("UTF-8"))
>>       .setFinish("".getBytes("UTF-8"))
>>       .setReversed(false)
>>       .setCount(4096 * 1000)
>>
>>     new SlicePredicate().setSlice_range(range)
>>   }
>>
>>   ConfigHelper.setInputSlicePredicate(conf, pred)
>> }
>> 
>> The job consists only of a mapper that increments counters for each row and its associated columns (a quick sketch follows), so all I'm really doing is exercising ColumnFamilyRecordReader.
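>>
>> Roughly this (a sketch with illustrative names, not our production code):
>>
>> import java.nio.ByteBuffer
>> import java.util.SortedMap
>> import org.apache.cassandra.db.IColumn
>> import org.apache.hadoop.io.NullWritable
>> import org.apache.hadoop.mapreduce.Mapper
>>
>> // ColumnFamilyInputFormat hands the mapper each row key plus that row's columns.
>> class CountingMapper
>>     extends Mapper[ByteBuffer, SortedMap[ByteBuffer, IColumn], NullWritable, NullWritable] {
>>   type Ctx = Mapper[ByteBuffer, SortedMap[ByteBuffer, IColumn], NullWritable, NullWritable]#Context
>>
>>   override def map(key: ByteBuffer, columns: SortedMap[ByteBuffer, IColumn], context: Ctx) {
>>     context.getCounter("cassandra", "rows").increment(1)
>>     context.getCounter("cassandra", "columns").increment(columns.size)
>>   }
>> }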
>> 
>> Has anyone else seen this?  Is there a workaround/fix to get our jobs running?
>> 
>> Thanks
> 

