mesos-user mailing list archives

From <>
Subject Share GPU resources via attributes or as custom resources (INTERNAL)
Date Thu, 14 Jan 2016 17:02:08 GMT
I have a machine with 4 GPUs and want to use Mesos + Marathon to schedule jobs on it.
Each job will use at most 1 GPU, and sharing 1 GPU between small jobs would be OK.
I know Mesos does not directly support GPUs, but it seems I could use custom resources or
attributes to do what I want. How exactly should this be done?

If I use --attributes="hasGpu:true", would a job be sent to the machine while another job is
already running there (and using only 1 GPU)? I assume every job requesting a machine with
the hasGpu attribute would be sent to it, as long as it has free CPU and memory. Then, if a
job arrives when all 4 GPUs are already busy, it will fail to start, right? Could Marathon
then be used to resubmit the job after some time, until the machine accepts it?
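
To make the attribute approach concrete, here is a minimal sketch of the kind of Marathon app
definition I have in mind, built as a Python dict and printed as JSON. The app id "/gpu-job"
and the command "./run_job.sh" are hypothetical placeholders, and the backoff values are
illustrative only. It assumes the agent was started with --attributes="hasGpu:true"; the
constraint only selects the machine and does nothing to track how many GPUs are in use.

```python
import json


def gpu_app(app_id, cmd, cpus=1.0, mem=1024):
    """Build a Marathon app definition pinned to agents with hasGpu:true.

    app_id and cmd are hypothetical placeholders for a real job.
    """
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": cpus,
        "mem": mem,
        "instances": 1,
        # Attribute constraint: only place the task on agents whose
        # hasGpu attribute equals "true".
        "constraints": [["hasGpu", "CLUSTER", "true"]],
        # Relaunch delay settings for tasks that fail to start
        # (values chosen for illustration, not recommendations).
        "backoffSeconds": 10,
        "backoffFactor": 2.0,
        "maxLaunchDelaySeconds": 600,
    }


app = gpu_app("/gpu-job", "./run_job.sh")
print(json.dumps(app, indent=2))
```

The resulting JSON would be POSTed to Marathon's /v2/apps endpoint. With this definition alone,
nothing stops a fifth job from landing on the machine, which is exactly my concern.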

If I specify --resources="gpu(*):4", my understanding is that once a job is sent to the
machine, all 4 GPUs will appear busy to Mesos (even if that is not really the case).
If that is right, would this workaround work: specify 4 different resources, gpu:A, gpu:B,
gpu:C and gpu:D, and use a constraint in Marathon like this: "constraints": [["gpu", "LIKE",
"[A-D]"]]?
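
If a scalar resource such as gpu(*):4 is instead decremented by the amount each task
requests, the workaround would be unnecessary. The toy model below sketches only that
bookkeeping, under that assumption; it is not Mesos code, and the ScalarPool class and its
methods are hypothetical names made up for illustration.

```python
class ScalarPool:
    """Toy model of a divisible scalar resource (hypothetical, not Mesos code)."""

    def __init__(self, name, total):
        self.name = name
        self.total = float(total)
        self.used = 0.0

    @property
    def free(self):
        # Amount that could still be offered to new tasks.
        return self.total - self.used

    def allocate(self, amount):
        """Reserve `amount` units; return False if not enough is free."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > self.free:
            return False
        self.used += amount
        return True

    def release(self, amount):
        """Return `amount` units to the pool when a task finishes."""
        if amount <= 0 or amount > self.used:
            raise ValueError("cannot release more than is in use")
        self.used -= amount


pool = ScalarPool("gpu", 4)

# Five jobs each ask for one whole GPU: the fifth is refused
# instead of being launched and failing on the machine.
results = [pool.allocate(1) for _ in range(5)]
print(results)

# A finished job frees its GPU, and two small jobs can then share it.
pool.release(1)
shared = [pool.allocate(0.5), pool.allocate(0.5)]
print(shared, pool.free)
```

Under this model a job asking for gpu:1 leaves 3 GPUs available to later jobs, and small
jobs could share a GPU by requesting a fraction such as 0.5.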

