cloudstack-dev mailing list archives

From Marc-Aurèle Brothier <ma...@exoscale.ch>
Subject Re: nvidia tesla grid card implementation (kvm)
Date Wed, 24 May 2017 07:38:16 GMT
Hi Sven,

For KVM, we implemented a PCI manager that can pass any PCI device through
to a VM. We use it to provision VMs with Tesla P100 GPU cards, 1 to 4 cards
per VM. My code is based on CS 4.4.2 rather than the upstream version, as we
no longer follow upstream. I also completely got rid of the Xen
implementation, because its design only fits the XenServer way of dealing
with GPU cards. So it's not something that I can push upstream. Here is an
example of the XML definition of a VM which has 4 GPU cards:

<devices>
....
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x83' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x86' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x87' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
...
</devices>

In the XML, each <source> address has to match the exact PCI address of one
of the Nvidia cards on the host.
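
A minimal sketch of that mapping, not the PCI manager code described above:
the Python below builds one <hostdev> element from a host PCI address written
as `lspci -D` prints it, e.g. `0000:02:00.0`. The helper name `hostdev_for`
is hypothetical. The guest-side <alias> and <address type='pci'> elements are
left out, on the assumption that libvirt fills them in itself.

```python
import re
import xml.etree.ElementTree as ET

# Host PCI address in domain:bus:slot.function form, e.g. "0000:02:00.0".
_PCI_ADDR = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def hostdev_for(host_addr):
    """Build a <hostdev> element for one host PCI device (hypothetical helper)."""
    match = _PCI_ADDR.match(host_addr)
    if match is None:
        raise ValueError(f"not a PCI address: {host_addr!r}")
    domain, bus, slot, function = (part.lower() for part in match.groups())

    # Same attributes as in the example definition above.
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    ET.SubElement(hostdev, "driver", name="vfio")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain}",
        bus=f"0x{bus}",
        slot=f"0x{slot}",
        function=f"0x{function}",
    )
    return hostdev


if __name__ == "__main__":
    # The four host addresses used in the example definition above.
    for addr in ["0000:02:00.0", "0000:83:00.0", "0000:86:00.0", "0000:87:00.0"]:
        print(ET.tostring(hostdev_for(addr), encoding="unicode"))
```

The address is validated before any XML is produced, so a mistyped bus number
fails early instead of yielding a definition that points at the wrong device.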

It's a pass-through, meaning that the full card is given to the VM; therefore
it cannot be shared across multiple VMs.
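
Since each card belongs to exactly one VM, a provisioning tool has to refuse
a second assignment of the same host address. A rough sketch of such a check,
assuming the domain XML of every VM is available as a string; the function
names and the VM names are made up for illustration and are not taken from
the PCI manager mentioned above:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def host_addresses(domain_xml):
    """Return (domain, bus, slot, function) for every PCI <hostdev> source."""
    root = ET.fromstring(domain_xml)
    found = []
    for addr in root.findall(".//hostdev[@type='pci']/source/address"):
        # Attribute values are hex strings such as '0x83'.
        found.append(
            tuple(
                int(addr.get(key, "0"), 16)
                for key in ("domain", "bus", "slot", "function")
            )
        )
    return found


def find_shared_devices(domains):
    """Map each host PCI address used by more than one VM to those VM names.

    `domains` maps a VM name to its domain XML text.
    """
    users = defaultdict(set)
    for vm_name, xml_text in domains.items():
        for addr in host_addresses(xml_text):
            users[addr].add(vm_name)
    return {addr: sorted(vms) for addr, vms in users.items() if len(vms) > 1}
```

An empty result means no card is claimed twice; any entry names the VMs that
would collide on one device.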

Hope that helps

Marco

On Mon, May 22, 2017 at 4:22 PM, Vogel, Sven <Sven.Vogel@kupper-computer.com> wrote:

> Hi Nitin,
>
> thanks for your answer. I mean the XML file for the VM, which CloudStack
> always defines for the KVM hypervisor when we start a machine.
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/chap-Guest_virtual_machine_device_configuration.html#sect-device-GPU
>
> https://griddownloads.nvidia.com/flex/GRID_4_2_Support_Matrix.pdf
>
> e.g.
>
> <device>
>   <name>pci_0000_02_00_0</name>
>   <path>/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0</path>
>   <parent>pci_0000_00_03_0</parent>
>   <driver>
>     <name>pci-stub</name>
>   </driver>
>   <capability type='pci'>
>     <domain>0</domain>
>     <bus>2</bus>
>     <slot>0</slot>
>     <function>0</function>
>     <product id='0x11fa'>GK106GL [Quadro K4000]</product>
>     <vendor id='0x10de'>NVIDIA Corporation</vendor>
>         <!-- pay attention to this part -->
>     <iommuGroup number='13'>
>       <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
>       <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
>     </iommuGroup>
>     <pci-express>
>       <link validity='cap' port='0' speed='8' width='16'/>
>       <link validity='sta' speed='2.5' width='16'/>
>     </pci-express>
>   </capability>
> </device>
>
> I think xen should be supported. What do you think?
>
> Sven Vogel
>
> -----Original Message-----
> From: Nitin Kumar Maharana [mailto:nitinkumar.maharana@accelerite.com]
> Sent: Monday, 22 May 2017 16:03
> To: dev@cloudstack.apache.org
> Subject: Re: nvidia tesla grid card implementation (kvm)
>
> Hi Sven,
>
> Currently the K1 and K2 cards are supported only in XenServer.
> For other cards, support has to be added, for XenServer as well as for
> the other hypervisors.
>
> I didn’t understand which XML-defined files you are talking about. Could
> you please elaborate a little?
>
>
> Thanks,
> Nitin
> > On 22-May-2017, at 5:46 PM, Vogel, Sven <Sven.Vogel@kupper-computer.com> wrote:
> >
> > Hi,
> >
> > I saw that the Nvidia K1 and K2 are implemented in CloudStack. Now there
> > are new cards on the market: Tesla M60, Tesla M10 and Tesla M6.
> >
> > Is there anybody who can implement them in the XML-defined files? How
> > can we help?
> >
> > Thanks for help
> >
> > Sven Vogel
> > Kupper Computer GmbH
