mesos-issues mailing list archives

From "Kevin Klues (JIRA)" <>
Subject [jira] [Updated] (MESOS-5555) Change semantics for granting access to /dev/nvidiactl, etc
Date Tue, 07 Jun 2016 21:57:21 GMT


Kevin Klues updated MESOS-5555:
    Summary: Change semantics for granting access to /dev/nvidiactl, etc  (was: Changed semantics
for granting access to /dev/nvidiactl, etc)

> Change semantics for granting access to /dev/nvidiactl, etc
> -----------------------------------------------------------
>                 Key: MESOS-5555
>                 URL:
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Kevin Klues
>            Assignee: Kevin Klues
>              Labels: gpu, mesosphere
> Currently, access to `/dev/nvidiactl` and `/dev/nvidia-uvm` is granted to a container only
when GPUs are added to it, and revoked when they are removed. On some level, this makes sense
because most jobs don't need access to these devices unless they are also using a GPU. However,
there are cases where access to these devices is appropriate even when not making use of a GPU,
for example running `nvidia-smi` to control the global state of the underlying NVIDIA driver.
> We should add `/dev/nvidiactl` and `/dev/nvidia-uvm` to the default whitelist of devices
included in every container when the `gpu/nvidia` isolator is enabled. This will allow a
container to run standard NVIDIA driver tools (such as `nvidia-smi`) even when no GPUs have
been granted to it: instead of failing with abnormal errors, these tools will report that no
GPUs are installed.
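> As a rough illustration of the proposed semantics (not the isolator's actual code), the
sketch below computes the device paths a container would be allowed to access. The function
name `whitelisted_devices`, its parameters, and the per-GPU `/dev/nvidia<minor>` paths are
assumptions made for this example only.

```python
# Hypothetical sketch of the proposed whitelist semantics; none of these
# names are Mesos APIs.

# Control devices that would be whitelisted for every container once the
# `gpu/nvidia` isolator is enabled.
CONTROL_DEVICES = ("/dev/nvidiactl", "/dev/nvidia-uvm")


def whitelisted_devices(isolator_enabled, gpu_minors):
    """Return the NVIDIA device paths a container may access.

    isolator_enabled: whether the `gpu/nvidia` isolator is turned on.
    gpu_minors: minor numbers of the GPUs granted to the container
                (assumed to map to /dev/nvidia<minor>).
    """
    if not isolator_enabled:
        # Without the isolator, no NVIDIA devices are whitelisted here.
        return []

    # Proposed behavior: control devices are always present, even when
    # the container has been granted zero GPUs.
    devices = list(CONTROL_DEVICES)
    devices += [f"/dev/nvidia{minor}" for minor in sorted(gpu_minors)]
    return devices
```

With this sketch, a container with no GPUs still sees both control devices, so a tool such as
`nvidia-smi` can start up and find that no GPUs are present.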

This message was sent by Atlassian JIRA
