mesos-reviews mailing list archives

From Benjamin Mahler <bmah...@apache.org>
Subject Re: Review Request 48915: Added an example framework for consuming GPUs.
Date Tue, 21 Jun 2016 00:27:27 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48915/#review138716
-----------------------------------------------------------



Hm... we've tried to generalize the command framework (clumsily called the "no executor"
framework) to handle a variety of cases. It looks like we could extend the no_executor_framework
to take a `--capabilities` flag in order to support GPU tasks?

```
./no-executor-framework --command="nvidia-smi && sleep 30" \
  --task_resources="cpus:0.1;gpus:1;mem:32;disk:32" \
  --capabilities=GPU_RESOURCES
```

This way, you could schedule a long-lived GPU command in a testing cluster without having
to add and maintain a GPU-specific example. For instance, if we want to run this GPU framework
in a long-lived manner, then we have to add support for `--num_tasks`. In the past we found
it was a burden to maintain a proliferation of example frameworks, so it would be great if
we could leverage the command (aka "no executor") framework here by adding the `--capabilities`
flag.
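
To make the suggestion a bit more concrete, here is a rough sketch of how a `--capabilities`
value might be turned into capability types. This is not the Mesos implementation: the
`Capability` enum below is a hypothetical stand-in for `FrameworkInfo::Capability::Type`,
the helper names are made up for illustration, and a comma-separated value format is an
assumption.

```cpp
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the capability type in FrameworkInfo. Only
// GPU_RESOURCES is named in this thread; the other entry is illustrative.
enum class Capability { REVOCABLE_RESOURCES, GPU_RESOURCES };

// Maps a single capability name to its enum value; throws on unknown names
// so that a typo in the flag fails loudly instead of being ignored.
Capability parseCapability(const std::string& name)
{
  if (name == "GPU_RESOURCES") return Capability::GPU_RESOURCES;
  if (name == "REVOCABLE_RESOURCES") return Capability::REVOCABLE_RESOURCES;
  throw std::invalid_argument("Unknown capability: " + name);
}

// Parses a flag value such as "GPU_RESOURCES" or
// "GPU_RESOURCES,REVOCABLE_RESOURCES" (comma separation is an assumption).
std::set<Capability> parseCapabilities(const std::string& value)
{
  std::set<Capability> result;
  std::istringstream stream(value);
  std::string name;
  while (std::getline(stream, name, ',')) {
    if (!name.empty()) {
      result.insert(parseCapability(name));
    }
  }
  return result;
}
```

In the framework itself, each parsed value would then presumably be added to the
`FrameworkInfo` capabilities before the scheduler driver is started, so that the master
offers GPU resources to it.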

- Benjamin Mahler


On June 19, 2016, 9:01 p.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48915/
> -----------------------------------------------------------
> 
> (Updated June 19, 2016, 9:01 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-5649
>     https://issues.apache.org/jira/browse/MESOS-5649
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This framework is designed to show how to build a GPU-capable
> framework that can accept offers with GPUs and launch tasks that use
> them. The key thing to remember is that the GPU_RESOURCES capability
> must be set in `FrameworkInfo` in order for a framework to receive
> resource offers from agents that contain GPUs.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am a4931560f1a5b3fbe41ea181477341d3ac459b58 
>   src/examples/gpu_framework.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/48915/diff/
> 
> 
> Testing
> -------
> 
> Run a master and an agent capable of handing out GPUs:
> ```
> $ sudo bin/mesos-master.sh --ip=127.0.0.1 --log_dir=/var/log/mesos --work_dir=/var/lib/mesos
> $ sudo bin/mesos-agent.sh --master=127.0.0.1:5050 --ip=127.0.0.1 --log_dir=/var/log/mesos \
>                           --work_dir=/var/lib/mesos \
>                           --isolation="cgroups/devices,gpu/nvidia"
> ```
> 
> Run a couple of instances of the framework and verify the correct output:
> ```
> $ ./src/gpu-framework --master=127.0.0.1:5050 --num_gpus=0
> $ ./src/gpu-framework --master=127.0.0.1:5050 --num_gpus=1
> $ ./src/gpu-framework --master=127.0.0.1:5050 --num_gpus=4
> $ ./src/gpu-framework --master=127.0.0.1:5050 --no-allow_gpus --num_gpus=1
> ```
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>

