cloudstack-issues mailing list archives

From "Caleb Call (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
Date Wed, 22 May 2013 17:49:19 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664327#comment-13664327 ]

Caleb Call commented on CLOUDSTACK-105:
---------------------------------------

I'll be happy to attach the dump, but this isn't something that just happens once; it's
constantly happening.  To keep our servers from crashing, we have to run a cronjob that removes
any of these files older than a couple of days.  I also don't think this is necessarily
a Xenserver bug; it's more likely Xenserver under CloudStack, since without joining Xenserver to
CloudStack this never happens.  Once it's joined, it starts happening.  I also have a suspicion it's
being caused by this script, /etc/xapi.d/plugins/vmops, and in particular this part of that
script (sorry, I'm sure jira is going to munge this output):

def setLinkLocalIP(session, args):
    brName = args['brName']
    try:
        cmd = ["ip", "route", "del", "169.254.0.0/16"]
        txt = util.pread2(cmd)
    except:
        txt = ''
    try:
        cmd = ["ifconfig", brName, "169.254.0.1", "netmask", "255.255.0.0"]
        txt = util.pread2(cmd)
    except:
        try:
            cmd = ['cat', '/etc/xensource/network.conf']
            result = util.pread2(cmd)
        except:
            return 'can not cat network.conf'

        if result.lower() == "bridge":
            try:
                cmd = ["brctl", "addbr", brName]
                txt = util.pread2(cmd)
            except:
                pass

        else:
            try:
                cmd = ["ovs-vsctl", "add-br", brName]
                txt = util.pread2(cmd)
            except:
                pass

        try:
            cmd = ["ifconfig", brName, "169.254.0.1", "netmask", "255.255.0.0"]
            txt = util.pread2(cmd)
        except:
            pass
    try:
        cmd = ["ip", "route", "add", "169.254.0.0/16", "dev", brName, "src", "169.254.0.1"]
        txt = util.pread2(cmd)
    except:
        txt = ''
    txt = 'success'
    return txt
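One thing that stands out above: every util.pread2 call is wrapped in a bare except: that
discards the error, so if a command fails (or leaks a resource) nothing gets reported and the
function still returns 'success'.  As a rough sketch of what a stricter runner could look
like, here subprocess stands in for util.pread2 (which is part of XenServer's xapi plugin
helpers and is assumed to shell out in a similar way; the names below are illustrative):

```python
import logging
import subprocess

log = logging.getLogger("vmops-sketch")

def run(cmd):
    """Run a command and return its stdout as a string.

    Unlike the bare except: pattern in the plugin, a failure is
    logged with its exit code and re-raised, so callers can't
    silently report 'success' after a failed command.
    """
    try:
        return subprocess.check_output(cmd).decode()
    except subprocess.CalledProcessError as e:
        log.warning("command %s failed with rc=%d", cmd, e.returncode)
        raise
```

With something like this, a failing brctl/ovs-vsctl call would at least leave a trace in the
logs instead of being swallowed.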
                
> /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
> ---------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-105
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-105
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Third-Party Bugs
>    Affects Versions: pre-4.0.0
>         Environment: Xenserver 6.0.2
> Cloudstack 3.0.2
>            Reporter: Caleb Call
>            Assignee: Devdeep Singh
>             Fix For: 4.1.0
>
>         Attachments: messages
>
>
> We came across an interesting issue in one of our clusters.  We ran out of inodes on
> all of our cluster members (since when does this happen in 2012?).  When this happened, it
> in turn made the / filesystem read-only, which in turn made all the hosts go into
> emergency maintenance mode and as a result get marked down by Cloudstack.  We found that
> it was caused by hundreds of thousands of stale socket files in /tmp named "stream-unix.####.######".
> To resolve the issue, we had to delete those stale socket files (find /tmp -name "*stream*"
> -mtime +7 -exec rm -v {} \;), then kill and restart xapi, then correct the emergency maintenance
> mode.  These hosts had only been up for 45 days before this issue occurred.
> In our scouring of the interwebs, the only other instance we've been able to find of
> this (or similar) happening is in the same setup we are currently running: Xenserver 6.0.2
> with CS 3.0.2.  Do these stream-unix sockets have anything to do with Cloudstack?  I would
> think if this was a Xenserver issue (bug), there would be a lot more on the internet about
> this happening.  For a temporary workaround, we've added a cronjob to clean up these files,
> but we'd really like to address the actual issue that's causing these sockets to become stale
> and not get cleaned up.
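On why the files linger at all: a Unix-domain socket bound to a filesystem path is not removed
when the socket is closed; the owning process has to unlink() the path itself, so a process
that exits early (or crashes) leaves the entry behind.  Until the root cause is fixed, the
cron workaround quoted above can also be expressed in Python.  This is only a sketch mirroring
the reporter's find command; the default pattern and age threshold are assumptions taken from
the description:

```python
import fnmatch
import os
import time

def clean_stale_sockets(directory="/tmp", pattern="stream-unix.*", max_age_days=7):
    # Mirrors: find /tmp -name "*stream*" -mtime +7 -exec rm -v {} \;
    # Removes non-directory entries matching `pattern` whose mtime is
    # older than `max_age_days`; returns the names it deleted.
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for entry in os.scandir(directory):
        if entry.is_dir(follow_symlinks=False):
            continue
        if not fnmatch.fnmatch(entry.name, pattern):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            os.unlink(entry.path)
            removed.append(entry.name)
    return removed
```

Run from cron the same way as the find one-liner; it deliberately skips anything that doesn't
match the stream-unix naming so live files elsewhere in /tmp are untouched.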

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
