cloudstack-users mailing list archives

From Dave Dunaway <dave.duna...@gmail.com>
Subject Re: [jira] [Comment Edited] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
Date Fri, 15 Mar 2013 14:43:17 GMT
Ooops... somehow I managed to reply to an existing thread about something
else instead of a new email to the list. My apologies. I will file this bug
with my brain and apply the prepared patch 'beer.patch' ASAP!



On Fri, Mar 15, 2013 at 10:41 AM, Dave Dunaway <dave.dunaway@gmail.com> wrote:

>
>
> Hey guys,
>
> We ran across a bug in Cloudplatform from Citrix that may exist in the
> Apache version as well, so here's my attempt to throw this out there for
> someone to verify.
>
> We had someone add a network where he entered the gateway IP as
> 192.168.100.08 ... note the last octet... '.08'. Cloudstack/CloudPortal ate
> the IP OK and processed the request and created an interface on the VR with
> that IP, but hosts on that network, when attempting to get a DHCP lease
> would never get answers as dnsmasq didn't like the IP address
> 192.168.100.08 ... it was freaking out due to syntax failure.
>
> So there are two issues I see that need to be checked:
>
> 1) That Cloudstack validates IPs for correctness (how 90's is that?!
> sigh...)
> 2) Error checking in /root/edithosts.sh needs to be better, as there
> appears to be none.
>
> Anyhoot, I'll skip the QA rant, but hopefully these silly simple bugs don't
> exist in the Apache version.
> But it may be useful to check if they do. I would think there are a lot of
> places where IPs should be validated
> and likely aren't. And I won't get into other data that a user may enter
> (e.g. 'banana; drop database cloud;' :P)
>
> FRIDAY! WOO!
>
> evad
>
>
>
>
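For anyone wanting to check their own input handling against the
'192.168.100.08' case above, here is a minimal sketch of a strict
dotted-quad check. It is not CloudStack code; the function name
is_strict_ipv4 is made up for illustration, and the rule it applies (four
decimal octets, 0-255, no leading zeros) is an assumption about what a
gateway field should accept.

```python
def is_strict_ipv4(address: str) -> bool:
    """Return True only for a canonical dotted-quad IPv4 address.

    Hypothetical helper: rejects octets with leading zeros such as '08',
    which some tools refuse or read as octal.
    """
    parts = address.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        # Only 1 to 3 plain ASCII digits; rejects '', '+1', ' 1', '0x8'.
        if not (1 <= len(part) <= 3 and part.isascii() and part.isdigit()):
            return False
        # A multi-digit octet must not start with '0' ('08', '010').
        if len(part) > 1 and part[0] == "0":
            return False
        if int(part) > 255:
            return False
    return True


if __name__ == "__main__":
    for candidate in ("192.168.100.8", "192.168.100.08"):
        print(candidate, is_strict_ipv4(candidate))
```

Rejecting the value up front, rather than silently rewriting '08' to '8',
keeps the stored address identical to what the user typed and what gets
handed to downstream tools.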
> On Fri, Mar 15, 2013 at 12:49 AM, Jason Davis <scr512@gmail.com> wrote:
>
>> Bumping this thread. Adding in users to see if anyone else has seen this.
>>
>> I am running into the exact same issue. XenServer 6.0.2, Basic networking
>> with bridging with CSP installed. However I am using CS 4.0.1.
>>
>> My issues arose after I rebooted my XS host outside of CS (I believe this
>> was due to inode exhaustion, although I didn't realize this until later).
>> Upon start, CS seemingly connects to the host for an indefinite amount of
>> time. Unfortunately, nothing in the logs (CS management-server.log)
>> explains why it's behaving like this.
>>
>> So far I've tried setting the networking from bridge->ovs->bridge and
>> reinstalling the CSP with no success.
>>
>>
>> On Wed, Dec 5, 2012 at 6:08 PM, Jason Bausewein (JIRA) <jira@apache.org> wrote:
>>
>> >
>> >     [ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510950#comment-13510950 ]
>> >
>> > Jason Bausewein edited comment on CLOUDSTACK-105 at 12/6/12 12:07 AM:
>> > ----------------------------------------------------------------------
>> >
>> > I am able to reproduce this issue consistently.  I have a single xen host
>> > in a basic zone.  I tracked it down to the following process creating the
>> > stream-unix.****.**** files about every 10 seconds.
>> >
>> > root      8237  8223  0 09:59 ?        00:00:00 ovs-vsctl add-br xapi0
>> >
>> > Dec  5 09:59:15 xenserver1 ovs-vsctl: 00001|vsctl|INFO|Called as ovs-vsctl add-br xapi0
>> > Dec  5 09:59:15 xenserver1 ovs-vsctl: 00002|stream_unix|ERR|/tmp/stream-unix.8237.0: connection to /var/run/openvswitch/db.sock failed: No such file or directory
>> > Dec  5 09:59:15 xenserver1 ovs-vsctl: 00003|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection attempt failed (No such file or directory)
>> > Dec  5 09:59:16 xenserver1 ovs-vsctl: 00004|stream_unix|ERR|/tmp/stream-unix.8237.1: connection to /var/run/openvswitch/db.sock failed: No such file or directory
>> > Dec  5 09:59:16 xenserver1 ovs-vsctl: 00005|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection attempt failed (No such file or directory)
>> > Dec  5 09:59:18 xenserver1 ovs-vsctl: 00006|stream_unix|ERR|/tmp/stream-unix.8237.2: connection to /var/run/openvswitch/db.sock failed: No such file or directory
>> > Dec  5 09:59:18 xenserver1 ovs-vsctl: 00007|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection attempt failed (No such file or directory)
>> >
>> > These messages start immediately on boot.  I attached my messages log file.
>> >
>> > I am not using openvswitch.  Is there something I need to turn off?
>> >
>> >
>> > > /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
>> > > ---------------------------------------------------------------------------------
>> > >
>> > >                 Key: CLOUDSTACK-105
>> > >                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-105
>> > >             Project: CloudStack
>> > >          Issue Type: Bug
>> > >      Security Level: Public (Anyone can view this level - this is the default.)
>> > >          Components: XenServer
>> > >    Affects Versions: pre-4.0.0
>> > >         Environment: Xenserver 6.0.2
>> > > Cloudstack 3.0.2
>> > >            Reporter: Caleb Call
>> > >            Assignee: Devdeep Singh
>> > >             Fix For: 4.1.0
>> > >
>> > >         Attachments: messages
>> > >
>> > >
>> > > We came across an interesting issue in one of our clusters.  We ran out
>> > > of inodes on all of our cluster members (since when does this happen in
>> > > 2012?).  When this happened, it in turn made the / filesystem read-only,
>> > > which in turn made all the hosts go into emergency maintenance mode and,
>> > > as a result, get marked down by Cloudstack.  We found that it was caused
>> > > by hundreds of thousands of stale socket files in /tmp named
>> > > "stream-unix.####.######".  To resolve the issue, we had to delete those
>> > > stale socket files (find /tmp -name "*stream*" -mtime +7 -exec rm -v {} \;),
>> > > then kill and restart xapi, then correct the emergency maintenance mode.
>> > > These hosts had only been up for 45 days before this issue occurred.
>> > > In our scouring of the interwebs, the only other instance we've been able
>> > > to find of this (or similar) happening is in the same setup we are
>> > > currently running: Xenserver 6.0.2 with CS 3.0.2.  Do these stream-unix
>> > > sockets have anything to do with Cloudstack?  I would think if this was a
>> > > Xenserver issue (bug), there would be a lot more on the internet about
>> > > this happening.  For a temporary workaround, we've added a cronjob to
>> > > clean up these files, but we'd really like to address the actual issue
>> > > that's causing these sockets to become stale and not get cleaned up.
>> >
>> > --
>> > This message is automatically generated by JIRA.
>> > If you think it was sent incorrectly, please contact your JIRA administrators
>> > For more information on JIRA, see: http://www.atlassian.com/software/jira
>> >
>>
>
>
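The cron workaround quoted above (find /tmp ... -mtime +7 -exec rm) could
also be sketched in Python with a dry-run default and an inode check.
This is only an illustration under assumptions: the function names are
made up, the pattern is narrowed from "*stream*" to "stream-unix.*", and
the age test (mtime older than N*24h) only roughly mirrors find's -mtime
rounding. It treats the symptom, not the ovs-vsctl retries that create
the files.

```python
import os
import time
from pathlib import Path


def inode_usage_percent(path="/"):
    """Percentage of inodes in use on the filesystem holding `path` (Unix)."""
    st = os.statvfs(path)
    if st.f_files == 0:  # some filesystems do not report inode counts
        return 0.0
    return 100.0 * (st.f_files - st.f_ffree) / st.f_files


def stale_stream_files(directory, max_age_days=7.0, now=None):
    """List stream-unix.* entries in `directory` older than `max_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for entry in Path(directory).glob("stream-unix.*"):
        try:
            # lstat: inspect the entry itself, never follow a symlink.
            if entry.lstat().st_mtime < cutoff:
                stale.append(entry)
        except FileNotFoundError:
            continue  # removed by someone else while we were scanning
    return sorted(stale)


def remove_stale(directory, max_age_days=7.0, dry_run=True):
    """Return the stale entries; delete them only when dry_run is False."""
    targets = stale_stream_files(directory, max_age_days)
    if not dry_run:
        for entry in targets:
            entry.unlink()
    return targets


if __name__ == "__main__":
    print("inode usage on /tmp: %.1f%%" % inode_usage_percent("/tmp"))
    for path in remove_stale("/tmp", dry_run=True):
        print("would remove", path)
```

Running it with dry_run=True first shows what would be deleted before
anything is touched on the host.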
