incubator-cassandra-user mailing list archives

From Oleksandr Petrov <oleksandr.pet...@gmail.com>
Subject Re: Thrift message length exceeded
Date Sat, 20 Apr 2013 13:05:55 GMT
I tried to isolate the issue in a test environment. Here is what I
currently have.

Setup for the test:
CREATE KEYSPACE cascading_cassandra
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE cascading_cassandra;
CREATE TABLE libraries (
  emitted_at timestamp,
  additional_info varchar,
  environment varchar,
  application varchar,
  type varchar,
  PRIMARY KEY (application, environment, type, emitted_at)
) WITH COMPACT STORAGE;

Next, insert some test data:

(just for example)
[INSERT INTO libraries (application, environment, type, additional_info,
emitted_at) VALUES (?, ?, ?, ?, ?); [app env type 0 #inst
"2013-04-20T13:01:04.935-00:00"]]

If the keys (e.g. "app", "env", "type") are all the same across the dataset,
it works correctly.
As soon as I start varying the keys, e.g. "app1", "app2", "app3" or others, I
get the "Message length exceeded" error.
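
To make the two cases concrete, here is a minimal sketch in Java that builds the bind values for the INSERT above, either with identical keys on every row or with a numeric suffix on each key. The class and method names are hypothetical, and suffixing all three key columns is an assumption; the description above only says that the keys were varied.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for building test rows; not part of any Cassandra API. */
public class LibrariesTestData {

    /**
     * Returns bind values in the INSERT's column order:
     * application, environment, type, additional_info, emitted_at.
     * With varyKeys == false every row shares the keys "app", "env", "type";
     * with varyKeys == true row i gets "app<i+1>", "env<i+1>", "type<i+1>".
     */
    public static List<Object[]> rows(int count, boolean varyKeys, Instant start) {
        List<Object[]> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String suffix = varyKeys ? String.valueOf(i + 1) : "";
            out.add(new Object[] {
                "app" + suffix,
                "env" + suffix,
                "type" + suffix,
                String.valueOf(i),     // additional_info
                start.plusMillis(i)    // emitted_at, unique per row
            });
        }
        return out;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2013-04-20T13:01:04.935Z");
        System.out.println(rows(3, false, start).size() + " rows with identical keys");
        System.out.println(rows(3, true, start).size() + " rows with varying keys");
    }
}
```

Feeding each `Object[]` to a prepared INSERT would give one dataset per case, so the same job can be run against both.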

Does anyone have any ideas?
Thanks for the help!
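
A note on reading the "Message length exceeded: N" text in the stack traces quoted below: the frames point at TBinaryProtocol.checkReadLength. Here is a minimal sketch of how such a check can report a small N, assuming it works as a running byte budget that each binary read is charged against. That is an assumption about libthrift 0.7, not something confirmed in this thread. The class below is a hypothetical model, not the real Thrift class, and whether or when the real budget is reset is not established here.

```java
/** Hypothetical model of a running read budget; not the real libthrift class. */
public class ReadBudgetSketch {
    private long remaining;

    public ReadBudgetSketch(long budgetBytes) {
        this.remaining = budgetBytes;
    }

    /** Charges one field's length against the budget and fails once it goes negative. */
    public void checkReadLength(int length) {
        if (length < 0) {
            throw new IllegalStateException("Negative length: " + length);
        }
        remaining -= length;
        if (remaining < 0) {
            // The number reported is the length of the field being read,
            // not the total number of bytes read so far.
            throw new IllegalStateException("Message length exceeded: " + length);
        }
    }

    /** Charges each field length in order; returns the error text, or null if all fit. */
    public static String firstError(long budgetBytes, int[] fieldLengths) {
        ReadBudgetSketch budget = new ReadBudgetSketch(budgetBytes);
        try {
            for (int len : fieldLengths) {
                budget.checkReadLength(len);
            }
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // 100-byte budget; fields of 60, 30, 8 and 8 bytes. The fourth field trips the check.
        System.out.println(firstError(100, new int[] {60, 30, 8, 8}));
        // prints "Message length exceeded: 8"
    }
}
```

Under that assumption, a figure such as 8 or 106 would be the size of the last field read rather than the size of the whole response, which would fit the observation that the number differs from job to job.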


On Sat, Apr 20, 2013 at 1:56 PM, Oleksandr Petrov <
oleksandr.petrov@gmail.com> wrote:

> I can confirm I'm running into the same problem.
>
> I tried ConfigHelper.setThriftMaxMessageLengthInMb(), tuning the server
> side, and reducing/increasing the batch size. None of it helped.
>
> Here's the stack trace from Hadoop/Cassandra; maybe it gives a hint:
>
> Caused by: org.apache.thrift.protocol.TProtocolException: Message length exceeded: 8
>   at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
>   at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
>   at org.apache.cassandra.thrift.Column.read(Column.java:528)
>   at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
>   at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
>   at org.apache.cassandra.thrift.Cassandra$get_paged_slice_result.read(Cassandra.java:14157)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>   at org.apache.cassandra.thrift.Cassandra$Client.recv_get_paged_slice(Cassandra.java:769)
>   at org.apache.cassandra.thrift.Cassandra$Client.get_paged_slice(Cassandra.java:753)
>   at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$WideRowIterator.maybeInit(ColumnFamilyRecordReader.java:438)
>
>
> On Thu, Apr 18, 2013 at 12:34 AM, Lanny Ripple <lanny@spotright.com> wrote:
>
>> It's slow going finding the time to do so but I'm working on that.
>>
>> We do have another table that has one or sometimes two columns per row.
>>  We can run jobs on it without issue.  I looked through
>> org.apache.cassandra.hadoop code and don't see anything that's really
>> changed since 1.1.5 (which was also using thrift-0.7) so something of a
>> puzzler about what's going on.
>>
>>
>> On Apr 17, 2013, at 2:47 PM, aaron morton <aaron@thelastpickle.com>
>> wrote:
>>
>> > Can you reproduce this in a simple way ?
>> >
>> > Cheers
>> >
>> > -----------------
>> > Aaron Morton
>> > Freelance Cassandra Consultant
>> > New Zealand
>> >
>> > @aaronmorton
>> > http://www.thelastpickle.com
>> >
>> > On 18/04/2013, at 5:50 AM, Lanny Ripple <lanny@spotright.com> wrote:
>> >
>> >> That was our first thought.  Using maven's dependency tree info we
>> verified that we're using the expected (cass 1.2.3) jars
>> >>
>> >> $ mvn dependency:tree | grep thrift
>> >> [INFO] |  +- org.apache.thrift:libthrift:jar:0.7.0:compile
>> >> [INFO] |  \- org.apache.cassandra:cassandra-thrift:jar:1.2.3:compile
>> >>
>> >> I've also dumped the final command run by the hadoop we use (CDH3u5)
>> and verified it's not sneaking thrift in on us.
>> >>
>> >>
>> >> On Tue, Apr 16, 2013 at 4:36 PM, aaron morton <aaron@thelastpickle.com>
>> wrote:
>> >> Can you confirm that you are using the same Thrift version that ships
>> with 1.2.3?
>> >>
>> >> Cheers
>> >>
>> >> -----------------
>> >> Aaron Morton
>> >> Freelance Cassandra Consultant
>> >> New Zealand
>> >>
>> >> @aaronmorton
>> >> http://www.thelastpickle.com
>> >>
>> >> On 16/04/2013, at 10:17 AM, Lanny Ripple <lanny@spotright.com> wrote:
>> >>
>> >>> A bump to say I found this
>> >>>
>> >>>
>> http://stackoverflow.com/questions/15487540/pig-cassandra-message-length-exceeded
>> >>>
>> >>> so others are seeing similar behavior.
>> >>>
>> >>> From what I can see of org.apache.cassandra.hadoop nothing has
>> changed since 1.1.5 when we didn't see such things but sure looks like
>> there's a bug that's slipped in (or been uncovered) somewhere.  I'll try to
>> narrow down to a dataset and code that can reproduce.
>> >>>
>> >>> On Apr 10, 2013, at 6:29 PM, Lanny Ripple <lanny@spotright.com>
>> wrote:
>> >>>
>> >>>> We are using Astyanax in production but I cut back to just Hadoop
>> and Cassandra to confirm it's a Cassandra (or our use of Cassandra) problem.
>> >>>>
>> >>>> We do have some extremely large rows but we went from everything
>> working with 1.1.5 to almost everything carping with 1.2.3.  Something has
>> changed.  Perhaps we were doing something wrong earlier that 1.2.3 exposed
>> but surprises are never welcome in production.
>> >>>>
>> >>>> On Apr 10, 2013, at 8:10 AM, <moshe.kranc@barclays.com> wrote:
>> >>>>
>> >>>>> I also saw this when upgrading from C* 1.0 to 1.2.2, and from
>> hector 0.6 to 0.8
>> >>>>> Turns out the Thrift message really was too long.
>> >>>>> The mystery to me: Why no complaints in previous versions? Were
>> some checks added in Thrift or Hector?
>> >>>>>
>> >>>>> -----Original Message-----
>> >>>>> From: Lanny Ripple [mailto:lanny@spotright.com]
>> >>>>> Sent: Tuesday, April 09, 2013 6:17 PM
>> >>>>> To: user@cassandra.apache.org
>> >>>>> Subject: Thrift message length exceeded
>> >>>>>
>> >>>>> Hello,
>> >>>>>
>> >>>>> We have recently upgraded to Cass 1.2.3 from Cass 1.1.5.  We ran
>> sstableupgrades and got the ring on its feet and we are now seeing a new
>> issue.
>> >>>>>
>> >>>>> When we run MapReduce jobs against practically any table we find
>> the following errors:
>> >>>>>
>> >>>>> 2013-04-09 09:58:47,746 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>> >>>>> 2013-04-09 09:58:47,899 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
>> >>>>> 2013-04-09 09:58:48,021 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>> >>>>> 2013-04-09 09:58:48,024 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a48edb5
>> >>>>> 2013-04-09 09:58:50,475 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> >>>>> 2013-04-09 09:58:50,477 WARN org.apache.hadoop.mapred.Child: Error running child
>> >>>>> java.lang.RuntimeException: org.apache.thrift.TException: Message length exceeded: 106
>> >>>>>   at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
>> >>>>>   at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
>> >>>>>   at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
>> >>>>>   at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>> >>>>>   at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>> >>>>>   at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
>> >>>>>   at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:444)
>> >>>>>   at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:460)
>> >>>>>   at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>> >>>>>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>> >>>>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>> >>>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>> >>>>>   at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>> >>>>>   at java.security.AccessController.doPrivileged(Native Method)
>> >>>>>   at javax.security.auth.Subject.doAs(Subject.java:396)
>> >>>>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>> >>>>>   at org.apache.hadoop.mapred.Child.main(Child.java:260)
>> >>>>> Caused by: org.apache.thrift.TException: Message length exceeded: 106
>> >>>>>   at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
>> >>>>>   at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
>> >>>>>   at org.apache.cassandra.thrift.Column.read(Column.java:528)
>> >>>>>   at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
>> >>>>>   at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
>> >>>>>   at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
>> >>>>>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>> >>>>>   at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
>> >>>>>   at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
>> >>>>>   at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
>> >>>>>   ... 16 more
>> >>>>> 2013-04-09 09:58:50,481 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>> >>>>>
>> >>>>> The message length listed on each failed job differs (not always
>> 106).  Jobs that used to run fine now fail with code compiled against cass
>> 1.2.3 (and work fine if compiled against 1.1.5 and run against the 1.2.3
>> servers in production).  I'm using the following setup to configure the job:
>> >>>>>
>> >>>>> import org.apache.cassandra.hadoop.ConfigHelper
>> >>>>> import org.apache.cassandra.thrift.{SlicePredicate, SliceRange}
>> >>>>> import org.apache.hadoop.mapreduce.Job
>> >>>>>
>> >>>>> // Config: job-specific settings object (not shown).
>> >>>>> def cassConfig(job: Job) {
>> >>>>>  val conf = job.getConfiguration()
>> >>>>>
>> >>>>>  ConfigHelper.setInputRpcPort(conf, "" + 9160)
>> >>>>>  ConfigHelper.setInputInitialAddress(conf, Config.hostip)
>> >>>>>
>> >>>>>  ConfigHelper.setInputPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner")
>> >>>>>  ConfigHelper.setInputColumnFamily(conf, Config.keyspace, Config.cfname)
>> >>>>>
>> >>>>>  // Unbounded slice: empty start/finish, up to 4096 * 1000 columns per row.
>> >>>>>  val pred = {
>> >>>>>    val range = new SliceRange()
>> >>>>>      .setStart("".getBytes("UTF-8"))
>> >>>>>      .setFinish("".getBytes("UTF-8"))
>> >>>>>      .setReversed(false)
>> >>>>>      .setCount(4096 * 1000)
>> >>>>>
>> >>>>>    new SlicePredicate().setSlice_range(range)
>> >>>>>  }
>> >>>>>
>> >>>>>  ConfigHelper.setInputSlicePredicate(conf, pred)
>> >>>>> }
>> >>>>>
>> >>>>> The job consists only of a mapper that increments counters for each
>> row and associated columns so all I'm really doing is exercising
>> ColumnFamilyRecordReader.
>> >>>>>
>> >>>>> Has anyone else seen this?  Is there a workaround/fix to get our
>> jobs running?
>> >>>>>
>> >>>>> Thanks
>> >>>>
>> >>>
>> >>
>> >>
>> >
>>
>>
>
>
> --
> alex p
>



-- 
alex p
