cassandra-commits mailing list archives

From "Mikhail Stepura (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6255) Exception count not incremented on OutOfMemoryError (HSHA)
Date Fri, 17 Jan 2014 05:16:21 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874439#comment-13874439 ]

Mikhail Stepura edited comment on CASSANDRA-6255 at 1/17/14 5:16 AM:
---------------------------------------------------------------------

The problem is that {{CustomTHsHaServer.SelectorThread}} swallows all Throwables (including
{{Error}}) in its {{run}} method. That was probably done for a reason.
I guess it would be reasonable to increment the exceptions counter in that {{catch}} block,
since the exception is swallowed there and never reaches the global {{UncaughtExceptionHandler}}.
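
A minimal sketch of that idea, not the actual patch. The class name {{SelectorLoopSketch}}, the {{EXCEPTIONS}} counter and the {{runSwallowing}} helper are hypothetical stand-ins; the real {{CustomTHsHaServer}} code and Cassandra's own exception counter are not shown in this thread. It only illustrates that a {{Throwable}} caught inside {{run}} never reaches the thread's {{UncaughtExceptionHandler}}, so the count has to be incremented in the {{catch}} block itself:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SelectorLoopSketch {
    // Hypothetical stand-in for the node-wide exception count shown by nodetool info.
    static final AtomicInteger EXCEPTIONS = new AtomicInteger();

    /** Runs one unit of work the way a swallowing run() loop would: nothing propagates. */
    static void runSwallowing(Runnable selectOnce) {
        try {
            selectOnce.run();
        } catch (Throwable t) {
            // Throwable also covers Error subclasses such as OutOfMemoryError.
            // Nothing is rethrown, so no UncaughtExceptionHandler will fire:
            // count the failure here, then log it.
            EXCEPTIONS.incrementAndGet();
            System.err.println("Uncaught Exception: " + t);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger handlerCalls = new AtomicInteger();
        Thread selector = new Thread(() -> runSwallowing(() -> {
            // Simulated failure; no real thread exhaustion happens here.
            throw new OutOfMemoryError("unable to create new native thread (simulated)");
        }));
        // This handler is only invoked for throwables that escape run().
        selector.setUncaughtExceptionHandler((thread, e) -> handlerCalls.incrementAndGet());
        selector.start();
        selector.join();
        System.out.println("counted=" + EXCEPTIONS.get() + " handlerCalls=" + handlerCalls.get());
    }
}
```

In this sketch the handler is never called because the error does not escape {{run}}, while the counter incremented in the {{catch}} block still records it.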



was (Author: mishail):
The problem is that {{CustomTHsHaServer.SelectorThread}} swallows all Throwables (including
{{Error}} ) in its {{run}} method

> Exception count not incremented on OutOfMemoryError (HSHA)
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-6255
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6255
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Oracle java version "1.7.0_15"
> rpc_server_type: hsha
>            Reporter: Dan Hendry
>            Assignee: Mikhail Stepura
>            Priority: Minor
>             Fix For: 1.2.14
>
>
> One of our nodes decided to stop listening on 9160 (netstat -l was showing nothing and telnet was reporting connection refused). Nodetool status showed no hosts down, and on the offending node nodetool info gave the following:
> {noformat}
> nodetool info
> Token            : (invoke with -T/--tokens to see all 256 tokens)
> ID               : (removed)
> Gossip active    : true
> Thrift active    : true
> Native Transport active: false
> Load             : 2.05 TB
> Generation No    : 1382536528
> Uptime (seconds) : 432970
> Heap Memory (MB) : 8098.05 / 14131.25
> Data Center      : DC1
> Rack             : RAC2
> Exceptions       : 0
> Key Cache        : size 536854996 (bytes), capacity 536870912 (bytes), 41383646 hits, 1710831591 requests, 0.024 recent hit rate, 0 save period in seconds
> Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
> {noformat}
> After looking at the cassandra log, I saw a bunch of the following:
> {noformat}
> ERROR [Selector-Thread-16] 2013-10-27 17:36:00,370 CustomTHsHaServer.java (line 187) Uncaught Exception: 
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:691)
>         at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1371)
>         at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.execute(DebuggableThreadPoolExecutor.java:145)
>         at org.apache.cassandra.thrift.CustomTHsHaServer.requestInvoke(CustomTHsHaServer.java:337)
>         at org.apache.cassandra.thrift.CustomTHsHaServer$SelectorThread.handleRead(CustomTHsHaServer.java:281)
>         at org.apache.cassandra.thrift.CustomTHsHaServer$SelectorThread.select(CustomTHsHaServer.java:224)
>         at org.apache.cassandra.thrift.CustomTHsHaServer$SelectorThread.run(CustomTHsHaServer.java:182)
> ERROR [Selector-Thread-7] 2013-10-27 17:36:00,370 CustomTHsHaServer.java (line 187) Uncaught Exception: 
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:691)
>         at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1371)
>         at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.execute(DebuggableThreadPoolExecutor.java:145)
>         at org.apache.cassandra.thrift.CustomTHsHaServer.requestInvoke(CustomTHsHaServer.java:337)
>         at org.apache.cassandra.thrift.CustomTHsHaServer$SelectorThread.handleRead(CustomTHsHaServer.java:281)
>         at org.apache.cassandra.thrift.CustomTHsHaServer$SelectorThread.select(CustomTHsHaServer.java:224)
>         at org.apache.cassandra.thrift.CustomTHsHaServer$SelectorThread.run(CustomTHsHaServer.java:182)
> {noformat}
> There wasn't anything else overtly suspicious in the logs except for the occasional 
> {noformat}
> ERROR [Selector-Thread-0] 2013-10-27 17:35:58,662 TNonblockingServer.java (line 468) Read an invalid frame size of 0. Are you using TFramedTransport on the client side?
> {noformat}
> but that comes up periodically. I have looked into it before, and it has never seemed to have any serious impact.
> This ticket is not about *why* an OutOfMemoryError occurred. That is bad, but I don't think I have enough information to reproduce it or speculate on a cause. This ticket is about the fact that an OutOfMemoryError occurred while nodetool info was reporting Thrift active: true and Exceptions: 0.
> Our monitoring systems and investigation processes are both starting to rely on the exception count. The fact that it was not accurate here is disconcerting.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
