cloudstack-dev mailing list archives

From "Vogel, Sven" <Sven.Vo...@kupper-computer.com>
Subject RE: nvidia tesla grid card implementation (kvm)
Date Wed, 24 May 2017 22:53:51 GMT
Hi Marc,

thanks for the reply. I see how the passthrough mode works, but I need to use the card with different virtual machines.


Should I open a Jira ticket for the implementation?

Thanks

Sven




-----Original Message-----
From: Marc-Aurèle Brothier [mailto:marco@exoscale.ch]
Sent: Wednesday, May 24, 2017 09:38
To: dev@cloudstack.apache.org
Subject: Re: nvidia tesla grid card implementation (kvm)

Hi Sven,

We implemented a PCI manager for KVM that can pass through any PCI device. We use it
to provision VMs with Tesla P100 GPU cards, 1 to 4 cards per VM. My code is based on CS 4.4.2
rather than the upstream version, as we no longer follow upstream. I also completely got
rid of the Xen implementation, because its design could only fit the XenServer way of dealing with
GPU cards; it's not something that I can push upstream. Here is an example of the XML definition
of a VM which has 4 GPU cards:

<devices>
....
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x83' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x86' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x87' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
...
</devices>

In the XML you have to map the exact source PCI bus addresses of the NVIDIA cards on the host.

It is a pass-through, meaning that the full card is given to the VM; it therefore cannot be shared
across multiple VMs.
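As an illustration of the mapping above, the per-card <hostdev> entries could be generated mechanically. This is only a sketch (not CloudStack's actual code); the host bus values are the example addresses from the XML above, and the guest slot numbering is an assumption:

```python
# Sketch (illustrative, not CloudStack code): build a libvirt <hostdev>
# element for VFIO passthrough of one host PCI device. Replace the host
# bus/slot values with the real addresses of your cards (e.g. from
# `lspci -nn | grep -i nvidia`).
import xml.etree.ElementTree as ET

def hostdev_xml(index, host_bus, guest_slot, host_slot="0x00"):
    """Return a <hostdev> element mapping one host GPU into the guest."""
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    ET.SubElement(hostdev, "driver", name="vfio")
    source = ET.SubElement(hostdev, "source")
    # Host-side (source) address: where the card sits on the host.
    ET.SubElement(source, "address", domain="0x0000", bus=host_bus,
                  slot=host_slot, function="0x0")
    ET.SubElement(hostdev, "alias", name="hostdev%d" % index)
    # Guest-side address: a free slot on the VM's PCI bus 0.
    ET.SubElement(hostdev, "address", type="pci", domain="0x0000",
                  bus="0x00", slot=guest_slot, function="0x0")
    return hostdev

# The four cards from the example XML above.
host_buses = ["0x02", "0x83", "0x86", "0x87"]
devices = [hostdev_xml(i, bus, "0x%02x" % (5 + i))
           for i, bus in enumerate(host_buses)]
print(ET.tostring(devices[0], encoding="unicode"))
```

The guest addresses (slots 0x05 to 0x08) just have to be free on the VM's PCI bus; libvirt will reject conflicts with other emulated devices.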

Hope that helps

Marco

On Mon, May 22, 2017 at 4:22 PM, Vogel, Sven <Sven.Vogel@kupper-computer.com
> wrote:

> Hi Nitin,
>
> thanks for your answer. I mean the XML file for the VM that CloudStack
> always generates for the KVM hypervisor when we start a machine.
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/chap-Guest_virtual_machine_device_configuration.html#sect-device-GPU
>
> https://griddownloads.nvidia.com/flex/GRID_4_2_Support_Matrix.pdf
>
> e.g.
>
> <device>
>   <name>pci_0000_02_00_0</name>
>   <path>/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0</path>
>   <parent>pci_0000_00_03_0</parent>
>   <driver>
>     <name>pci-stub</name>
>   </driver>
>   <capability type='pci'>
>     <domain>0</domain>
>     <bus>2</bus>
>     <slot>0</slot>
>     <function>0</function>
>     <product id='0x11fa'>GK106GL [Quadro K4000]</product>
>     <vendor id='0x10de'>NVIDIA Corporation</vendor>
>         <!-- pay attention to this part -->
>     <iommuGroup number='13'>
>       <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
>       <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
>     </iommuGroup>
>     <pci-express>
>       <link validity='cap' port='0' speed='8' width='16'/>
>       <link validity='sta' speed='2.5' width='16'/>
>     </pci-express>
>   </capability>
> </device>
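The iommuGroup section flagged in the example matters because every device function listed in the same IOMMU group must be detached from the host and bound to the passthrough driver together. As an illustrative sketch (not part of CloudStack), the group can be parsed out of a `virsh nodedev-dumpxml`-style dump like the one above:

```python
# Sketch: extract the IOMMU group number and member addresses from
# nodedev XML such as the example above. All members of one group must
# be passed through (or at least unbound from host drivers) together.
import xml.etree.ElementTree as ET

NODEDEV_XML = """
<device>
  <name>pci_0000_02_00_0</name>
  <capability type='pci'>
    <iommuGroup number='13'>
      <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
    </iommuGroup>
  </capability>
</device>
"""

def iommu_group_addresses(xml_text):
    """Return (group_number, [PCI addresses in dddd:bb:ss.f form])."""
    root = ET.fromstring(xml_text)
    group = root.find(".//iommuGroup")
    addrs = ["%s:%s:%s.%s" % (a.get("domain")[2:], a.get("bus")[2:],
                              a.get("slot")[2:], a.get("function")[2:])
             for a in group.findall("address")]
    return group.get("number"), addrs

number, addrs = iommu_group_addresses(NODEDEV_XML)
print(number, addrs)
```

Here both functions 0000:02:00.0 and 0000:02:00.1 sit in group 13, so the card's audio/secondary function comes along with the GPU.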
>
> I think Xen should be supported as well. What do you think?
>
> Sven Vogel
>
> -----Original Message-----
> From: Nitin Kumar Maharana [mailto:nitinkumar.maharana@accelerite.com]
> Sent: Monday, May 22, 2017 16:03
> To: dev@cloudstack.apache.org
> Subject: Re: nvidia tesla grid card implementation (kvm)
>
> Hi Sven,
>
> Currently the K1 and K2 cards are supported only on XenServer.
> For other cards we would have to add support even on XenServer, as well
> as on other hypervisors.
>
> I didn’t understand which XML-defined files you are talking about.
> Could you please elaborate a little?
>
>
> Thanks,
> Nitin
> > On 22-May-2017, at 5:46 PM, Vogel, Sven 
> > <Sven.Vogel@kupper-computer.com>
> wrote:
> >
> > Hi,
> >
> > I saw that the NVIDIA K1 and K2 cards are implemented in CloudStack. Now
> > there
> are new cards on the market: Tesla M60, Tesla M10 and Tesla M6.
> >
> > Is there anybody who can implement them in the XML-defined files? How
> > can
> we help?
> >
> > Thanks for help
> >
> > Sven Vogel
> > Kupper Computer GmbH
>
>
>
>