cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-13619) java.nio.BufferOverflowException: null while flushing hints
Date Tue, 12 Sep 2017 10:48:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162780#comment-16162780
] 

Marcus Eriksson edited comment on CASSANDRA-13619 at 9/12/17 10:47 AM:
-----------------------------------------------------------------------

In {{PartitionUpdate}}, {{isBuilt}} is non-volatile, and it is set once the {{Holder}} ref
has been updated. {{build()}} is synchronized, but {{maybeBuild()}}, where we check {{isBuilt}},
is not. That means this is basically double-checked locking, and since {{isBuilt}} is not
volatile, the assignments in {{build()}} can be reordered, so another thread can see
{{isBuilt}} as true before {{holder}} is assigned.

It stops reproducing if I set {{isBuilt = this.holder != null}} instead of {{isBuilt = true}}
to make sure that {{holder}} is set before {{isBuilt}}, but making {{isBuilt}} volatile should
be the correct solution.
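
For illustration, here is a minimal sketch of the pattern with the volatile flag. It is a simplified stand-in, not the actual {{PartitionUpdate}} code: the class name {{LazyHolder}}, the {{Supplier}} argument and the {{get()}} accessor are made up for the example, and only {{isBuilt}}, {{holder}}, {{build()}} and {{maybeBuild()}} mirror the names above.

```java
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for the pattern described above;
// this is NOT the real Cassandra PartitionUpdate class.
final class LazyHolder<T> {
    private final Supplier<T> builder; // assumed: whatever produces the holder
    private T holder;                  // plain field, published through isBuilt

    // volatile: a reader that sees isBuilt == true is guaranteed to also see
    // the write to holder that preceded it (happens-before ordering).
    private volatile boolean isBuilt;

    LazyHolder(Supplier<T> builder) {
        this.builder = builder;
    }

    T get() {
        maybeBuild();
        return holder;
    }

    // Unsynchronized fast path: the first check of double-checked locking.
    private void maybeBuild() {
        if (isBuilt) {
            return;
        }
        build();
    }

    // Slow path: the second check, done under the lock.
    private synchronized void build() {
        if (isBuilt) {
            return;
        }
        holder = builder.get();
        isBuilt = true; // volatile write; must come after holder is assigned
    }
}
```

Without {{volatile}} on the flag, the Java memory model allows a thread on the unsynchronized path to observe {{isBuilt == true}} while still reading a stale {{holder}}, which matches the failure described here.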


was (Author: krummas):
In {{PartitionUpdate}}, {{isBuilt}} is non-volatile, and it is set once the {{Holder}} ref
has been updated. {{build()}} is synchronized but {{maybeBuild()}} where we check {{isBuilt}}
is not. That means this is basically double checked locking, but since {{isBuilt}} is not
volatile, the assignments in {{build()}} can be reordered, making {{isBuilt}} set before {{holder}}
is assigned.

It stops reproducing if I set {{isBuilt = this.holder != null}} instead of {{isBuilt = true}}
to make sure that {{holder}} is set before {{isBuilt}} but making {{isBuilt}} volatile should
be the correct solution.

> java.nio.BufferOverflowException: null while flushing hints
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-13619
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13619
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination, Core
>            Reporter: Milan Milosevic
>            Assignee: Marcus Eriksson
>
> I'm seeing the following exception running Cassandra 3.0.11 on a 21-node cluster in two
> AWS regions when half of the nodes in one region go down and the load is high on the rest
> of the nodes:
> {code}
> WARN  [SharedPool-Worker-10] 2017-06-14 12:57:15,017 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-10,5,main]: {}
> java.lang.RuntimeException: java.nio.BufferOverflowException
>         at org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:2549) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0-zing_17.03.1.0]
>         at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.11.jar:3.0.11]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0-zing_17.03.1.0]
> Caused by: java.nio.BufferOverflowException: null
>         at org.apache.cassandra.io.util.DataOutputBufferFixed.doFlush(DataOutputBufferFixed.java:52) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:195) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.writeUnsignedVInt(BufferedDataOutputStreamPlus.java:258) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:296) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.db.Columns$Serializer.serialize(Columns.java:405) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.db.SerializationHeader$Serializer.serializeForMessaging(SerializationHeader.java:407) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:120) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serialize(PartitionUpdate.java:625) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:305) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.hints.Hint$Serializer.serialize(Hint.java:141) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.hints.HintsBuffer$Allocation.write(HintsBuffer.java:251) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.hints.HintsBuffer$Allocation.write(HintsBuffer.java:230) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.hints.HintsBufferPool.write(HintsBufferPool.java:61) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.hints.HintsService.write(HintsService.java:154) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.service.StorageProxy$11.runMayThrow(StorageProxy.java:2627) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         at org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:2545) ~[apache-cassandra-3.0.11.jar:3.0.11]
>         ... 5 common frames omitted
> {code}
> Relevant configurations from cassandra.yaml:
> {code}
> -cassandra_hinted_handoff_throttle_in_kb: 1024
>  cassandra_max_hints_delivery_threads: 4
> -cassandra_hints_flush_period_in_ms: 10000
> -cassandra_max_hints_file_size_in_mb: 512
> {code}
> When I reduce cassandra_hints_flush_period_in_ms from 10000 to 5000, the number of exceptions
> drops significantly, but they still occur.



