cassandra-user mailing list archives

From Vasileios Vlachos <vasileiosvlac...@gmail.com>
Subject Re: Thrift version and OOM errors
Date Wed, 04 Jul 2012 12:02:06 GMT
We also get negative message lengths occasionally... Please see below:

ERROR 12:49:00,777 Thrift error occurred during processing of message.
org.apache.thrift.TException: Negative length: -2147483634
        at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:388)
        at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
        at org.apache.cassandra.thrift.Column.read(Column.java:528)
        at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
        at org.apache.cassandra.thrift.Mutation.read(Mutation.java:353)
        at org.apache.cassandra.thrift.Cassandra$batch_mutate_args.read(Cassandra.java:18966)
        at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3441)
        at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Any ideas what could be causing strange message lengths?
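
One way to look at these lengths is as the four raw bytes that the Thrift binary protocol reads as a signed, big-endian 32-bit integer. The sketch below is only illustrative: `as_wire_bytes` and `from_wire_bytes` are hypothetical helpers, not part of Thrift or Cassandra, and they simply re-encode the numbers reported in this thread.

```python
import struct


def as_wire_bytes(length: int) -> bytes:
    """Return the 4 big-endian bytes that decode to `length` as a signed i32."""
    return struct.pack(">i", length)


def from_wire_bytes(raw: bytes) -> int:
    """Decode 4 big-endian bytes as a signed 32-bit integer."""
    return struct.unpack(">i", raw)[0]


# Lengths and string indexes reported in the error messages of this thread.
reported = [-2147483634, 1970238464, -2147418111, 218104076]

for value in reported:
    # Show each reported number next to the bytes the server must have read.
    print(f"{value:>12} -> 0x{as_wire_bytes(value).hex()}")
```

Any 4-byte value with the top bit set decodes as a negative number, so a negative length means the server read bytes that were never meant to be a length. Likewise, 1970238464 is 0x756f7400, whose first three bytes are the ASCII characters "uot", and -2147418111 is 0x80010001, which matches the binary protocol's strict version marker (0x80010000) combined with message type 1 (CALL). That pattern is consistent with the reader being out of step with the byte stream rather than with genuinely huge messages, though it does not by itself identify the cause.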

Thanks,

Vasilis




On Wed, Jul 4, 2012 at 12:55 PM, Vasileios Vlachos <
vasileiosvlachos@gmail.com> wrote:

> Hello Aaron, thanks for your email.
>
> - That's pretty small, try m1.xlarge.
>
> Yes, this is small. We are aware of that, but it doesn't seem to be the
> actual problem, and we cannot see any reason why it shouldn't work as a
> test environment. Once we have a fair understanding we are going to invest
> in proper hardware.
>
> - 1.0.7 ships with thrift 0.6
> - What client are you using? If you have rolled your own client try using one of
>   the pre-built ones to rule out errors in your code.
>
> So I guess we are now using the right thrift version, unless there are
> significant changes between 0.6.1 and 0.6. But if that's the case, why are
> we still getting 'old-client' errors?
>
> At the moment we use thrift directly. We might start developing our own
> client using C#.
>
> - mmm 1.83 GB message size. Something is not right there.
>
> Do you have any ideas what could be causing that? We are definitely not
> trying to store such a large message.
>
> - 208 MB message size which is too big (max is 16MB) followed by out of memory.
>
> We cannot figure out why messages appear to be so large. We are aware of
> the 16MB limit and we are not even close to that limit. What could be
> causing such a large message size?
>
> - Do you get these errors with a stock 1.0.X install and a pre-built client?
>
> We have not tested it with a higher level client yet. Do you think we
> should not be using thrift alone? Could that be what causes all these
> errors?
>
> Thanks in advance for your help,
>
> Regards,
>
> Vasilis
>
>
>
> On Wed, Jul 4, 2012 at 11:54 AM, aaron morton <aaron@thelastpickle.com> wrote:
>
>> We are using Cassandra 1.0.7 on AWS on mediums (that is 3.8G RAM, 1 Core),
>>
>> That's pretty small, try m1.xlarge.
>>
>> We are still not sure what version of thrift to use with Cassandra 1.0.7
>> (we are still getting the same message regarding the 'old client').
>>
>> 1.0.7 ships with thrift 0.6
>> What client are you using? If you have rolled your own client try using
>> one of the pre-built ones to rule out errors in your code.
>>
>> org.apache.thrift.TException: Message length exceeded: 1970238464
>>
>> mmm 1.83 GB message size. Something is not right there.
>>
>>
>> org.apache.thrift.TException: Message length exceeded: 218104076
>>
>> 208 MB message size which is too big (max is 16MB) followed by out of
>> memory.
>>
>> Do you get these errors with a stock 1.0.X install and a pre-built client?
>>
>> Cheers
>>
>>
>>   -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 3/07/2012, at 9:57 AM, Vasileios Vlachos wrote:
>>
>> Hello All,
>>
>> We are using Cassandra 1.0.7 on AWS on mediums (that is 3.8G RAM, 1
>> Core), running Ubuntu 12.04. We have three nodes in the cluster and we hit
>> only one node from our application. Thrift version is 0.6.1 (we changed
>> from 0.8 because we thought there was a compatibility problem between
>> thrift and Cassandra: 'old client' according to the output.log). We are
>> still not sure what version of thrift to use with Cassandra 1.0.7 (we are
>> still getting the same message regarding the 'old client'). I would
>> appreciate any help on that please.
>>
>> Below, I am sharing the errors we are getting from the output.log file.
>> The first three errors are not responsible for the crash, only the OOM
>> error is, but something seems to be really wrong there...
>>
>> Error #1
>>
>> ERROR 14:00:12,057 Thrift error occurred during processing of message.
>> org.apache.thrift.TException: Message length exceeded: 1970238464
>> at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
>> at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
>> at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:102)
>> at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:112)
>> at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:112)
>> at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:112)
>> at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:121)
>> at org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60)
>> at org.apache.cassandra.thrift.Mutation.read(Mutation.java:355)
>> at org.apache.cassandra.thrift.Cassandra$batch_mutate_args.read(Cassandra.java:18966)
>> at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3441)
>> at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
>> at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>>
>> Error #2
>>
>> ERROR 14:03:48,004 Error occurred during processing of message.
>> java.lang.StringIndexOutOfBoundsException: String index out of range: -2147418111
>> at java.lang.String.checkBounds(String.java:397)
>> at java.lang.String.<init>(String.java:442)
>> at org.apache.thrift.protocol.TBinaryProtocol.readString(TBinaryProtocol.java:339)
>> at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:210)
>> at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
>> at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>>
>> Error #3
>>
>> ERROR 14:07:24,415 Thrift error occurred during processing of message.
>> org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
>> at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213)
>> at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
>> at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>>
>> Error #4
>>
>> ERROR 16:07:10,168 Thrift error occurred during processing of message.
>> org.apache.thrift.TException: Message length exceeded: 218104076
>> at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
>> at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:352)
>> at org.apache.thrift.protocol.TBinaryProtocol.readString(TBinaryProtocol.java:347)
>> at org.apache.cassandra.thrift.Cassandra$batch_mutate_args.read(Cassandra.java:18958)
>> at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3441)
>> at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
>> at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>> java.lang.OutOfMemoryError: Java heap space
>> Dumping heap to /var/lib/cassandra/java_1341224307.hprof ...
>> INFO 16:07:18,882 GC for Copy: 886 ms for 1 collections, 2242700896 used; max is 2670985216
>> Java HotSpot(TM) 64-Bit Server VM warning: record is too large
>> Heap dump file created [4429997807 bytes in 95.755 secs]
>> INFO 16:08:54,749 GC for ConcurrentMarkSweep: 1157 ms for 4 collections, 2246857528 used; max is 2670985216
>> WARN 16:08:54,761 Heap is 0.8412092715978552 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
>> ERROR 16:08:54,761 Fatal exception in thread Thread[Thrift:446,5,main]
>> java.lang.OutOfMemoryError: Java heap space
>> at java.util.HashMap.<init>(HashMap.java:187)
>> at java.util.HashMap.<init>(HashMap.java:199)
>> at org.apache.cassandra.thrift.Cassandra$batch_mutate_args.read(Cassandra.java:18953)
>> at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3441)
>> at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
>> at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>> INFO 16:08:54,760 InetAddress /10.128.16.110 is now dead.
>> INFO 16:08:54,764 InetAddress /10.128.16.112 is now dead.
>>
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> The first three errors appear many times before error #4, which actually
>> causes the crash. 10.128.16.110 is the node our application hits. Although
>> the log suggests that 10.128.16.112 died, it did not. We ran 'nodetool
>> ring' on 10.128.16.112 and only 10.128.16.110 appeared to be down.
>>
>> Proper hardware might solve some of our problems, but we need a fair
>> understanding before we move on. At the moment we cannot get a stable
>> cluster for more than 12 hours. After that, 10.128.16.110 dies and the
>> output.log has the same errors.
>>
>> Any help would be much appreciated. Please, let me know if you need more
>> information in order to figure out what is going on.
>>
>> Thank you in advance.
>>
>> --
>> Kind Regards,
>>
>> Vasilis
>>
>>
>>
>
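
The size figures quoted in this thread can be checked with a little arithmetic. The sketch below assumes binary units (1 MB = 2^20 bytes, 1 GB = 2^30 bytes) and takes the 16MB limit from Aaron's reply; the names `describe` and `heap_fraction` are made up for illustration and do not come from Cassandra or Thrift.

```python
MB = 2 ** 20
GB = 2 ** 30
MAX_MESSAGE_BYTES = 16 * MB  # the 16MB limit mentioned in the thread


def describe(length: int) -> str:
    """Summarise a reported message length relative to the 16 MB limit."""
    ratio = length / MAX_MESSAGE_BYTES
    return (f"{length} bytes = {length / MB:.2f} MB = "
            f"{length / GB:.2f} GB ({ratio:.1f}x the limit)")


def heap_fraction(used: int, maximum: int) -> float:
    """Fraction of the JVM heap in use, as in the 'Heap is ... full' warning."""
    return used / maximum


print(describe(1970238464))  # length from Error #1
print(describe(218104076))   # length from Error #4
print(heap_fraction(2246857528, 2670985216))  # values from the last GC line
```

Both lengths come out far above the 16MB limit (roughly 1.83 GB and 208 MB), which agrees with the figures in Aaron's reply, and the heap fraction agrees with the value in the WARN line just before the fatal OutOfMemoryError.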
