cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11594) Too many open files on directories
Date Wed, 24 Aug 2016 10:40:21 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434675#comment-15434675 ]

Stefania commented on CASSANDRA-11594:
--------------------------------------

I may have found something that could explain this and here is the patch:

||3.0||trunk||
|[patch|https://github.com/stef1927/cassandra/commits/11594-3.0]|[patch|https://github.com/stef1927/cassandra/commits/11594]|
|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11594-3.0-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11594-testall/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11594-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11594-dtest/]|

However, I have not yet managed to reproduce the problem with a [test|https://github.com/stef1927/cassandra-dtest/commit/c148972797c725a37ee8d2d59119039e32149f0f#diff-60812631a43b8e1f0c9fb53d9f7487ebR208].
I think I need more data. I'll continue tomorrow.

> Too many open files on directories
> ----------------------------------
>
>                 Key: CASSANDRA-11594
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11594
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: n0rad
>            Assignee: Stefania
>            Priority: Critical
>         Attachments: Grafana   Cassandra   Cluster.png, openfiles.zip, screenshot.png
>
>
> I have a 6-node cluster in production, spread across 3 racks.
> Each node has:
> - 4 GB of commit logs in 343 files
> - 275 GB of data in 504 files
> On Saturday, one node in each rack crashed with too many open files (it seems to be the similar node in each rack).
> {code}
> lsof -n -p $PID gives me 66899 out of 65826 max
> {code}
> It contains 64527 open directories (2371 unique).
> Part of the list:
> {code}
> java    19076 root 2140r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2141r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2142r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2143r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2144r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2145r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2146r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2147r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2148r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2149r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2150r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2151r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2152r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2153r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2154r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> java    19076 root 2155r      DIR   8,17      143360 4386718705 /opt/stage2/pod-cassandra-aci-cassandra/rootfs/data/keyspaces/email_logs_query/emails-2d4abd00e9ea11e591199d740e07bd95
> {code}
> The other 3 nodes crashed 4 hours later.
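
The directory counts quoted above (total open directories versus unique ones) can be reproduced from saved lsof output. What follows is only a minimal sketch, not part of the patch: it assumes the default `lsof -n -p PID` column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) as seen in the excerpt, and the function name `count_open_dirs` and the sample paths are hypothetical.

```python
from collections import Counter


def count_open_dirs(lsof_lines):
    """Count open directory descriptors per path.

    Assumes the default lsof column layout:
    COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
    Lines that do not have all nine columns are skipped.
    """
    counts = Counter()
    for line in lsof_lines:
        # Split into at most 9 fields so a NAME containing spaces stays whole.
        fields = line.split(None, 8)
        if len(fields) == 9 and fields[4] == "DIR":
            counts[fields[8].strip()] += 1
    return counts


# Hypothetical sample in the same shape as the excerpt above.
sample = [
    "COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME",
    "java    19076 root 2140r      DIR   8,17      143360 4386718705 /data/ks/table-a",
    "java    19076 root 2141r      DIR   8,17      143360 4386718705 /data/ks/table-a",
    "java    19076 root   55u      REG   8,17        4096 12345 /data/ks/table-a/file.db",
]

counts = count_open_dirs(sample)
print("open directories:", sum(counts.values()))
print("unique directories:", len(counts))
for path, n in counts.most_common(5):
    print(n, path)
```

Sorting by count shows which directories are held open most often, which is the pattern visible in the excerpt, where a single table directory appears under many consecutive descriptors.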



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
