cloudstack-users mailing list archives

From Ian Young <iyo...@ratespecial.com>
Subject Re: services not running after reboot
Date Tue, 14 Oct 2014 00:48:39 GMT
I didn't find anything like that.  Everything's been running OK over the
weekend, so I will leave it as is.

On Mon, Oct 13, 2014 at 2:18 AM, Daan Hoogland <daan.hoogland@gmail.com>
wrote:

> Good going Ian, sorry you didn't get any assistance on the way. Did you
> find a setting that should have a different default? Like the router
> service offering memory :P or doesn't that make any sense?
>
> On Sat, Oct 11, 2014 at 5:11 AM, Ian Young <iyoung@ratespecial.com> wrote:
>
> > Aha!  I restarted cloudstack-agent, which caused the virtual router to
> > change to a "stopped" status in the management console.  However, the
> > console viewer icon was still visible, so I clicked it.  The router had run
> > out of memory and caused a kernel panic.  I created a new system service
> > offering with 500 MB of memory, changed the router's service offering, and
> > started it.  It booted with no problem.  The default memory size of 128 MB
> > is not enough.  This is the system VM template I was using:
> >
> > http://cloudstack.apt-get.eu/systemvm/4.4/systemvm64template-4.4.0-6-kvm.qcow2.bz2
> >
> > On Fri, Oct 10, 2014 at 7:28 PM, Ian Young <iyoung@ratespecial.com> wrote:
> >
> > > I dropped all the cloud* databases, deleted everything in primary and
> > > secondary storage, and reinstalled the management server, following the
> > > guide I wrote for myself the last time I built a stable CloudStack system.
> > > Then I imported one of my backed up instances as a template and tried to
> > > create a new VM.  Same problem as before.  How is this possible?
> > >
> > > 2014-10-10 19:17:44,075 WARN  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-3:null) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-4-VM -p %template=domP%name=r-4-VM%eth0ip=192.168.102.222%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 .  Output is:
> > > 2014-10-10 19:18:05,078 WARN  [kvm.resource.LibvirtComputingResource] (Script-3:null) Interrupting script.
> > >
> > > On Fri, Oct 10, 2014 at 4:33 PM, Ian Young <iyoung@ratespecial.com> wrote:
> > >
> > >> I've restarted all the services and restarted the servers too.  The SSVM
> > >> and CP start with no trouble.  Every time I try to start or create an
> > >> instance, I see repeated messages like these:
> > >>
> > >> /var/log/cloudstack/agent/cloudstack-agent.out:
> > >> 2014-10-10 16:27:21,841{GMT} WARN  [kvm.resource.LibvirtComputingResource] (Script-8:) Interrupting script.
> > >> 2014-10-10 16:27:21,841{GMT} WARN  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-19-VM -p %template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 .  Output is:
> > >>
> > >> /var/log/cloudstack/agent/security_group.log:
> > >> 2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next time!
> > >>
> > >> On Fri, Oct 10, 2014 at 3:04 PM, Ian Young <iyoung@ratespecial.com>
> > >> wrote:
> > >>
> > >>> I tried to restart the network with the "clean up" option, via the web
> > >>> console.  After several minutes, it failed to restart the network.  The
> > >>> SSVM and CP are still running but the VR no longer exists.  Why would these
> > >>> be able to start but not the virtual router?
> > >>>
> > >>> On Fri, Oct 10, 2014 at 2:48 PM, Ian Young <iyoung@ratespecial.com>
> > >>> wrote:
> > >>>
> > >>>> I restarted the libvirtd service and the management service is now
> > >>>> fully started (there are services listening on ports 8250 and 9090).  The
> > >>>> SSVM health check script now reports no problems.
> > >>>>
> > >>>> However, I tried starting an instance and both the instance and the
> > >>>> virtual router are in a "starting" state but have been so for almost 10
> > >>>> minutes.  In the catalina.out log I see:
> > >>>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 4, postpone power-change report by resetting power-change counters
> > >>>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 13, postpone power-change report by resetting power-change counters
> > >>>>
> > >>>> I'm also seeing this in the agent.log:
> > >>>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource] (Script-6:null) Interrupting script.
> > >>>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-2:null) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-4-VM -p %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 .  Output is:
> > >>>>
> > >>>> And in the security_group.log:
> > >>>> 2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next time!
> > >>>> 2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next time!
> > >>>>
> > >>>> What does this mean?
> > >>>>
> > >>>> On Fri, Oct 10, 2014 at 2:11 PM, Ian Young <iyoung@ratespecial.com>
> > >>>> wrote:
> > >>>>
> > >>>>> This morning I was unable to start new instances.  I discovered that I
> > >>>>> could SSH into the SSVM and the console proxy but not the virtual router.
> > >>>>> Something strange was happening so I thought it might be a good time to
> > >>>>> gracefully stop all the instances and reboot the hypervisor to see if the
> > >>>>> VR would start working again.  I also rebooted the management server (a
> > >>>>> separate machine) to have a clean slate.  Now that they've both been
> > >>>>> rebooted, the following symptoms exist:
> > >>>>>
> > >>>>> * On the management server, there are no services listening on 9090 or
> > >>>>> 8250.
> > >>>>> * When I run the SSVM health check script, it says NFS is not
> > >>>>> currently mounted.
> > >>>>> * The management server log is reporting that Zone 1 is not ready to
> > >>>>> launch SSVM/CP yet, even though both of those are running.
> > >>>>>
> > >>>>> The NFS server is running just fine.  I can mount it in the management
> > >>>>> server with no problems.  I've restarted cloudstack-management and
> > >>>>> cloudstack-agent but the problems persist.  The "not ready to launch
> > >>>>> SSVM/CP yet" messages sound like the management server and the hypervisor
> > >>>>> are not communicating or some information about the system state is out of
> > >>>>> sync.  How can I confirm this?
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >
> >
>
>
>
> --
> Daan
>
