cloudstack-dev mailing list archives

From Wido den Hollander <w...@widodh.nl>
Subject Re: [DISCUSS] getting rid of KVM patchdisk
Date Tue, 05 Mar 2013 18:41:36 GMT


On 03/05/2013 07:28 PM, Edison Su wrote:
>
>
>> -----Original Message-----
>> From: Marcus Sorensen [mailto:shadowsor@gmail.com]
>> Sent: Monday, March 04, 2013 10:26 PM
>> To: cloudstack-dev@incubator.apache.org
>> Subject: Re: [DISCUSS] getting rid of KVM patchdisk
>>
>> I've been thinking about how to tackle this, written a little concept code, and
>> it seems fairly straightforward to include our own little python daemon that
>> speaks JSON via this local character device in the system vm. I'm assuming
>> we'd start it up at the beginning of cloud-early-config.
>>
>> What I'm not certain of is how to get the 'cmdline' bits into the system before
>> cloud-early-config needs them. Do we block in cloud-early-config, waiting on
>> getting the cmdline file before continuing, and push it via StartCommand?
>
> We put a lot of logic into init scripts inside the system vm, which unnecessarily
> complicates system vm programming:
> 1. Init scripts are not portable: if people want to use another Linux distribution as
> the system vm OS, they have to write their own init scripts.
> 2. Init scripts are not easy to hack on: they have their own dialect (how to log
> messages, how to declare dependencies, etc.).
> 3. Init scripts run in a limited environment (some system-wide services are not
> available), which limits what you can do in an init script.
>
> Maybe we need to start working on a new system vm programming model now? Better to just
> put a python daemon inside the system vm and provide a RESTful API through the link-local
> IP address (or the private IP if it's VMware); then the mgt server or hypervisor agent
> code can just send commands to the python daemon over HTTP instead of ssh.
>
> In your case, the python daemon needs to wait on the well-defined serial port
> (e.g. /dev/virtio-ports/org.apache.cloudstack.guest.agent), get the cmdline, then
> program the system vm itself, and reboot.
>

This seems like a very sensible thing to do. I already created Jira
tickets about this last November.

I haven't been able to look at it yet, but having a Python daemon
running which does everything on a read-only (!!!) filesystem would be
awesome.
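
A minimal sketch of the "wait on the serial port, get cmdline" step Edison
describes above. The port path is the one from his example; the message
framing (one newline-terminated JSON object with hypothetical "command" and
"cmdline" keys) is purely an assumption for illustration, not an agreed
protocol.

```python
import io
import json

# Port path taken from the example in the thread; everything about the
# message format below is an assumption made for this sketch.
AGENT_PORT = "/dev/virtio-ports/org.apache.cloudstack.guest.agent"


def read_cmdline(stream):
    """Read one newline-terminated JSON message and return its 'cmdline' dict."""
    line = stream.readline()
    if not line:
        # A real daemon would retry here, e.g. if the host has not connected yet.
        raise EOFError("agent port closed before a message arrived")
    message = json.loads(line.decode("utf-8"))
    if message.get("command") != "set-cmdline":  # hypothetical command name
        raise ValueError("unexpected command: %r" % message.get("command"))
    return message["cmdline"]


# Demo with an in-memory stream standing in for the character device.
fake_port = io.BytesIO(
    b'{"command": "set-cmdline", "cmdline": {"type": "secstorage", "zone": "1"}}\n'
)
print(read_cmdline(fake_port))
```

Inside a system VM the same function could be handed the real device, opened
with something like open(AGENT_PORT, "rb", buffering=0).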

The reason I mention the read-only filesystem is that it would make the
system VMs much more resilient against SAN issues. Assume a read-only FS,
make them stateless, and store everything on a tmpfs or in memory
somewhere.
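
As a rough illustration of that idea, a daemon could verify at startup that
the root filesystem really is read-only and that its state directory is
writable. The sketch below parses text in the /proc/mounts format; the
sample entries and the choice of /run as the tmpfs location are assumptions.

```python
def is_read_only(mounts_text, mount_point="/"):
    """Return True if the last entry for mount_point carries the 'ro' option."""
    read_only = None
    for line in mounts_text.splitlines():
        # Fields: device, mount point, fs type, options, dump, pass
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            read_only = "ro" in fields[3].split(",")
    if read_only is None:
        raise LookupError("no mount entry for %s" % mount_point)
    return read_only


# Made-up sample in /proc/mounts format.
SAMPLE_MOUNTS = """\
/dev/vda1 / ext4 ro,relatime 0 0
tmpfs /run tmpfs rw,nosuid,size=65536k 0 0
"""

print("root read-only:", is_read_only(SAMPLE_MOUNTS))
print("/run read-only:", is_read_only(SAMPLE_MOUNTS, "/run"))
```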

(Would be something cool to discuss at a Colab Conference ;))

Wido

>
>
>>
>> On Mon, Mar 4, 2013 at 5:27 PM, Marcus Sorensen <shadowsor@gmail.com>
>> wrote:
>>> I tested this with Rohit's systemvm from master. It works fine,
>>> provided you install the qemu-guest-agent software and modify the
>>> libvirt xml definition of the system vm to include something like:
>>>
>>>     <channel type='unix'>
>>>       <source mode='bind' path='/var/lib/libvirt/qemu/v-2-VM.agent'/>
>>>       <target type='virtio' name='org.qemu.guest_agent.0'/>
>>>       <alias name='channel0'/>
>>>       <address type='virtio-serial' controller='0' bus='0' port='1'/>
>>>     </channel>
>>>
>>> Then on the host you can connect to the
>>> /var/lib/libvirt/qemu/v-2-VM.agent unix socket and send QMP JSON to do
>>> things like write files. We can't execute the various scripts through
>>> it, but we also don't have to use qemu-ga; we could have our own thing
>>> listening on the unix socket.
>>>
>>>
>>>
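
A hedged sketch of the host side of what Marcus describes: writing a file
into the guest through qemu-ga over the unix socket. It uses the guest
agent's guest-file-open, guest-file-write and guest-file-close commands,
sent as newline-delimited JSON. The helper names and the target path in the
usage note are hypothetical, and error handling is kept minimal.

```python
import base64
import json


def open_command(path, mode="w"):
    return {"execute": "guest-file-open",
            "arguments": {"path": path, "mode": mode}}


def write_command(handle, data):
    # qemu-ga expects the payload base64-encoded in "buf-b64".
    encoded = base64.b64encode(data).decode("ascii")
    return {"execute": "guest-file-write",
            "arguments": {"handle": handle, "buf-b64": encoded}}


def close_command(handle):
    return {"execute": "guest-file-close", "arguments": {"handle": handle}}


def call(sock_file, command):
    """Send one command as a JSON line and return the reply's 'return' value."""
    sock_file.write(json.dumps(command).encode("utf-8") + b"\n")
    sock_file.flush()
    reply = json.loads(sock_file.readline().decode("utf-8"))
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["return"]


def write_guest_file(sock_file, path, data):
    """Open, write and close a file in the guest via the agent channel."""
    handle = call(sock_file, open_command(path))
    call(sock_file, write_command(handle, data))
    call(sock_file, close_command(handle))
```

On the host this could be driven with a socket.AF_UNIX socket connected to
/var/lib/libvirt/qemu/v-2-VM.agent, passing sock.makefile("rwb") as
sock_file and, say, a hypothetical guest path for the cmdline file.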
>>> On Mon, Mar 4, 2013 at 3:24 PM, Marcus Sorensen <shadowsor@gmail.com> wrote:
>>>> I think this just requires an updated system vm (the virtio-serial
>>>> portion). I've played a bit with the old debian 2.6.32-5-686-bigmem
>>>> one and can't get the device nodes to show up, even though the
>>>> /boot/config shows that it has CONFIG_VIRTIO_CONSOLE=y. However, if I
>>>> try this with a CentOS 6.3 VM, on a CentOS 6.3 or Ubuntu 12.04 KVM
>>>> host it works. So I'm not sure what's being used for the ipv6 update,
>>>> but we can probably make one that works. We'll need to install
>>>> qemu-ga and start it within the systemvm as well.
>>>>
>>>> On Mon, Mar 4, 2013 at 12:41 PM, Edison Su <Edison.su@citrix.com> wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Marcus Sorensen [mailto:shadowsor@gmail.com]
>>>>>> Sent: Sunday, March 03, 2013 12:13 PM
>>>>>> To: cloudstack-dev@incubator.apache.org
>>>>>> Subject: [DISCUSS] getting rid of KVM patchdisk
>>>>>>
>>>>>> For those who don't know (this probably doesn't matter, but...),
>>>>>> when KVM brings up a system VM, it creates a 'patchdisk' on primary
>>>>>> storage. This patchdisk is used to pass along 1) the authorized_keys
>>>>>> file and 2) a 'cmdline' file that describes to the systemvm startup
>>>>>> services all of the various properties of the system vm.
>>>>>>
>>>>>> Example cmdline file:
>>>>>>
>>>>>>   template=domP type=secstorage host=172.17.10.10 port=8250
>>>>>>   name=s-1-VM zone=1 pod=1 guid=s-1-VM
>>>>>>   resource=com.cloud.storage.resource.NfsSecondaryStorageResource
>>>>>>   instance=SecStorage sslcopy=true role=templateProcessor mtu=1500
>>>>>>   eth2ip=192.168.100.170 eth2mask=255.255.255.0 gateway=192.168.100.1
>>>>>>   public.network.device=eth2 eth0ip=169.254.1.46 eth0mask=255.255.0.0
>>>>>>   eth1ip=172.17.10.150 eth1mask=255.255.255.0 mgmtcidr=172.17.10.0/24
>>>>>>   localgw=172.17.10.1 private.network.device=eth1
>>>>>>   eth3ip=172.17.10.192 eth3mask=255.255.255.0 storageip=172.17.10.192
>>>>>>   storagenetmask=255.255.255.0 storagegateway=172.17.10.1
>>>>>>   internaldns1=8.8.4.4 dns1=8.8.8.8
>>>>>>
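
For illustration, a cmdline in this space-separated key=value form can be
turned into a dictionary with a few lines of Python. This is only a sketch:
parse_cmdline is a hypothetical helper, not the code the system vm scripts
use, and the sample is a shortened version of the example above.

```python
def parse_cmdline(text):
    """Split a whitespace-separated key=value cmdline into a dict of strings."""
    properties = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError("token without '=': %r" % token)
        properties[key] = value
    return properties


sample = (
    "template=domP type=secstorage host=172.17.10.10 port=8250 "
    "name=s-1-VM mgmtcidr=172.17.10.0/24 dns1=8.8.8.8"
)
props = parse_cmdline(sample)
print(props["type"], props["host"], props["mgmtcidr"])
```

Splitting on whitespace rather than on newlines means the parser does not
care how the file happens to be wrapped.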
>>>>>> This patch disk has been bugging me for a while, as it creates a
>>>>>> volume that isn't really tracked anywhere or known about in
>>>>>> cloudstack's database. Up until recently these would just litter
>>>>>> the KVM primary storages, but there's been some triage done to
>>>>>> attempt to clean them up when the system vms go away. It's not
>>>>>> perfect. It also can be inefficient for certain primary storage
>>>>>> types, for example if you end up creating a bunch of 10MB luns on
>>>>>> a SAN for these.
>>>>>>
>>>>>> So my question goes to those who have been working on the system vm.
>>>>>> My first preference (aside from a full system vm redesign, perhaps
>>>>>> something that is controlled via an API) would be to copy these up
>>>>>> to the system vm via SCP or something. But the cloud services start
>>>>>> so early on that this isn't possible. Next would be to inject them
>>>>>> into the system vm's root disk before starting the server, but if
>>>>>> we're allowing people to make their own system vms, can we count on
>>>>>> the partitions being what we expect? Also I don't think this will
>>>>>> work for RBD, which qemu directly connects to, with the host OS
>>>>>> unaware of any disk.
>>>>>>
>>>>>> Options?
>>>>>
>>>>> Could you take a look at the status of these projects in KVM?
>>>>> http://wiki.qemu.org/Features/QAPI/GuestAgent
>>>>> https://fedoraproject.org/wiki/Features/VirtioSerial
>>>>>
>>>>> Basically, we need a way to talk to the guest VM (sending parameters to
>>>>> the KVM guest) after the VM is booted up. VMware and XenServer each have
>>>>> their own way to send parameters to the guest VM through a PV driver, but
>>>>> there was no such thing for KVM a few years ago.
