cassandra-commits mailing list archives

From "Hao Bryan Cheng (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10477) java.lang.AssertionError in StorageProxy.submitHint
Date Wed, 11 Nov 2015 06:37:11 GMT


Hao Bryan Cheng commented on CASSANDRA-10477:

A few additional details:

Unfortunately, I didn't capture any data while the issue was happening. Afterwards, netstat,
nodetool status, etc. all looked nominal.

While this node was experiencing difficulty, no other nodes reported any unhealthy hosts.
However, we do have phi_convict_threshold raised from 8 to 10, since we run on AWS.

This event was localized to one node out of 12. Keyspace RF ranges from 3 to 5. Queries at
LOCAL_QUORUM were timing out because too few replicas responded.
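
For context, a quorum is floor(RF / 2) + 1 replicas, so with RF between 3 and 5 a LOCAL_QUORUM
request needs 2 or 3 replicas in the local datacenter to respond in time. A minimal sketch of
that arithmetic, assuming the RF values above are per-datacenter (the function name is
illustrative, not a Cassandra API):

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


# Show how many replicas each RF in the 3-5 range needs, and how many
# unresponsive replicas a quorum request can tolerate.
for rf in (3, 4, 5):
    needed = quorum(rf)
    print(f"RF={rf}: quorum={needed}, tolerates {rf - needed} unresponsive replica(s)")
```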

> java.lang.AssertionError in StorageProxy.submitHint
> ---------------------------------------------------
>                 Key: CASSANDRA-10477
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: CentOS 6, Oracle JVM 1.8.45
>            Reporter: Severin Leonhardt
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x
> A few days after updating from 2.0.15 to 2.1.9 we have the following log entry on 2 of 5 machines:
> {noformat}
> ERROR [EXPIRING-MAP-REAPER:1] 2015-10-07 17:01:08,041 - Exception in thread Thread[EXPIRING-MAP-REAPER:1,5,main]
> java.lang.AssertionError: /
>         at org.apache.cassandra.service.StorageProxy.submitHint(
>         at$5.apply(
>         at$5.apply(
>         at org.apache.cassandra.utils.ExpiringMap$ ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$
>         at java.util.concurrent.Executors$ [na:1.8.0_45]
>         at java.util.concurrent.FutureTask.runAndReset( [na:1.8.0_45]
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(
>         at java.util.concurrent.ScheduledThreadPoolExecutor$
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(
>         at java.util.concurrent.ThreadPoolExecutor$
>         at [na:1.8.0_45]
> {noformat}
> The address in the AssertionError message is the broadcast address of the local machine.
> When this is logged the read request latency of the whole cluster becomes very bad, from
> 6 ms/op to more than 100 ms/op according to OpsCenter. Clients get a lot of timeouts. We need
> to restart the affected Cassandra node to get back normal read latencies. It seems write latency
> is not affected.
> Disabling hinted handoff using {{nodetool disablehandoff}} only prevents the assert from
> being logged. At some point the read latency becomes bad again. Restarting the node where
> hinted handoff was disabled results in the read latency being better again.
