cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9341) IndexOutOfBoundsException on server when unlogged batch write times out
Date Tue, 12 May 2015 20:49:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540681#comment-14540681 ]

Tyler Hobbs commented on CASSANDRA-9341:
----------------------------------------

With the way that message decoding is implemented, it's difficult to distinguish between internal
errors (i.e. C* bugs) and malformed messages.  When we don't know what the problem is, we
default to error code 0 (internal server error).

Now, we _could_ assume that Cassandra's message decoding is bug-free, and always return a
ProtocolError to the client if there is a problem decoding a message.  However, we would need
to add a lot more fine-grained error handling to the decoding logic to make the error messages
useful at all.  Given that very few people are writing drivers, I'm not sure that's worth
the effort right now.
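
To illustrate what "fine-grained error handling" could look like for a single read, here is a minimal sketch that validates a length-prefixed value before slicing it. It is not Cassandra's actual decoding code: it uses java.nio.ByteBuffer instead of Netty's ByteBuf, and MalformedMessageException and ValueReader are hypothetical names. The only facts taken from the native protocol are that a [bytes] value is a 4-byte length followed by that many bytes (a negative length meaning null) and that 0x000A is the protocol error code.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a protocol-level error; not Cassandra's actual class.
final class MalformedMessageException extends RuntimeException {
    // Native protocol error code for "Protocol error".
    static final int PROTOCOL_ERROR = 0x000A;

    MalformedMessageException(String message) {
        super(message);
    }
}

// Hypothetical helper showing an explicit bounds check before slicing.
final class ValueReader {
    private ValueReader() {}

    // Reads a [bytes] value: a 4-byte signed length n followed by n bytes.
    // A negative length denotes a null value.
    static ByteBuffer readValue(ByteBuffer body) {
        if (body.remaining() < 4) {
            throw new MalformedMessageException(
                "truncated value length: " + body.remaining() + " byte(s) left");
        }
        int length = body.getInt();
        if (length < 0) {
            return null;
        }
        if (length > body.remaining()) {
            // Reported as a malformed message instead of surfacing as an
            // IndexOutOfBoundsException from the buffer implementation.
            throw new MalformedMessageException(
                "value length " + length + " exceeds the " + body.remaining()
                + " byte(s) left in the frame body");
        }
        ByteBuffer value = body.slice();
        value.limit(length);
        body.position(body.position() + length);
        return value;
    }
}
```

With a check like this, a length such as the 1375797264 seen in this ticket would be rejected with a message that names the offending field. Doing the same for every read in the decoding logic is the extra effort described above.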

bq.  I'm not sure if the database should throw an IndexOutOfBoundsException back to the client
(is this a security issue?)

I don't believe there are any security concerns here.  However, just for the sake of message
cleanliness, we could potentially return a generic message like "there was an error decoding
the message, check your server logs" and simply log the exception details whenever we catch
an unexpected exception.
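
As a rough sketch of that idea (not the actual patch), the handler below logs the full exception on the server and hands the client only a generic message with error code 0. ErrorReply and DecodeErrorMapper are hypothetical names, and java.util.logging stands in for whatever logging framework the server uses.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical, simplified view of an ERROR response: a code plus a message.
final class ErrorReply {
    // Native protocol error code for "Server error".
    static final int SERVER_ERROR = 0x0000;

    final int code;
    final String message;

    ErrorReply(int code, String message) {
        this.code = code;
        this.message = message;
    }
}

// Hypothetical mapper from an unexpected exception to a client-facing reply.
final class DecodeErrorMapper {
    private static final Logger logger = Logger.getLogger("DecodeErrorMapper");

    static final String GENERIC_MESSAGE =
        "There was an error decoding the message; check your server logs";

    private DecodeErrorMapper() {}

    static ErrorReply toReply(Throwable unexpected, String channel) {
        // The exception class, message, and stack trace stay in the server log.
        logger.log(Level.SEVERE,
                   "Unexpected exception during request; channel = " + channel,
                   unexpected);
        // The client sees only the generic text.
        return new ErrorReply(ErrorReply.SERVER_ERROR, GENERIC_MESSAGE);
    }
}
```

The trade-off is that driver authors lose the exception text in the response and have to correlate with the server log instead.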

> IndexOutOfBoundsException on server when unlogged batch write times out
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-9341
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9341
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Ubuntu 14.04 LTS 64bit
> Cassandra 2.1.5
>            Reporter: Nimi Wariboko Jr.
>            Assignee: Tyler Hobbs
>            Priority: Minor
>             Fix For: 2.1.x
>
>
> In our application (golang) we were debugging an issue that caused our entire app to lock up (I think this is community-driver related, and has little to do with the server).
> What caused this issue is that we were rapidly sending large batches, and (pretty rarely) one of these write requests would time out. I think what may have happened is that we ended up writing incomplete data to the server.
> When this happens we get this response frame from the server (this is with native protocol version 2):
> {code}
>  flags=0x0 
> stream=9 
> op=ERROR 
> length=107
> Error Code: 0
> Message: java.lang.IndexOutOfBoundsException: index: 1408818, length: 1375797264 (expected: range(0, 1506453))
> {code}
> And in the Cassandra logs on that node:
> {code}
> ERROR [SharedPool-Worker-28] 2015-05-10 22:32:15,242 Message.java:538 - Unexpected exception during request; channel = [id: 0x68d4acfb, /10.129.196.41:33549 => /10.129.196.24:9042]
> java.lang.IndexOutOfBoundsException: index: 1408818, length: 1375797264 (expected: range(0, 1506453))
> 	at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1143) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.buffer.SlicedByteBuf.slice(SlicedByteBuf.java:155) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.buffer.AbstractByteBuf.readSlice(AbstractByteBuf.java:669) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at org.apache.cassandra.transport.CBUtil.readValue(CBUtil.java:336) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.CBUtil.readValueList(CBUtil.java:386) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.messages.BatchMessage$1.decode(BatchMessage.java:64)
~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.messages.BatchMessage$1.decode(BatchMessage.java:45)
~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:247) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at org.apache.cassandra.transport.Message$ProtocolDecoder.decode(Message.java:235) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76]
> ERROR [SharedPool-Worker-28] 2015-05-10 22:32:15,248 Message.java:538 - Unexpected exception
during request; channel = [id: 0x68d4acfb, /10.129.196.41:33549 => /10.129.196.24:9042]
> io.netty.handler.codec.DecoderException: org.apache.cassandra.transport.ProtocolException:
Invalid or unsupported protocol version: 110
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:280)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76]
> Caused by: org.apache.cassandra.transport.ProtocolException: Invalid or unsupported protocol
version: 110
> 	at org.apache.cassandra.transport.Frame$Decoder.decode(Frame.java:184) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:249)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	... 10 common frames omitted
> ERROR [SharedPool-Worker-22] 2015-05-10 22:32:15,260 Message.java:538 - Unexpected exception
during request; channel = [id: 0x68d4acfb, /10.129.196.41:33549 => /10.129.196.24:9042]
> io.netty.handler.codec.DecoderException: org.apache.cassandra.transport.ProtocolException:
Invalid or unsupported protocol version: 110
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:280)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76]
> Caused by: org.apache.cassandra.transport.ProtocolException: Invalid or unsupported protocol
version: 110
> 	at org.apache.cassandra.transport.Frame$Decoder.decode(Frame.java:184) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:249)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	... 10 common frames omitted
> ERROR [SharedPool-Worker-19] 2015-05-10 22:32:15,260 Message.java:538 - Unexpected exception
during request; channel = [id: 0x68d4acfb, /10.129.196.41:33549 => /10.129.196.24:9042]
> io.netty.handler.codec.DecoderException: org.apache.cassandra.transport.ProtocolException:
Invalid or unsupported protocol version: 110
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:280)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:149)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76]
> Caused by: org.apache.cassandra.transport.ProtocolException: Invalid or unsupported protocol
version: 110
> 	at org.apache.cassandra.transport.Frame$Decoder.decode(Frame.java:184) ~[apache-cassandra-2.1.5.jar:2.1.5]
> 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:249)
~[netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	... 10 common frames omitted
> ... repeated a couple more times ... 
> {code}
> I'm ultimately unfamiliar with what should happen here, but I'm not sure if the database should throw an IndexOutOfBoundsException back to the client (is this a security issue?). In any case I wanted to bring up this issue in case this exception is something that shouldn't happen in normal operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
