aurora-reviews mailing list archives

From Maxim Khutornenko <ma...@apache.org>
Subject Re: Review Request 47869: Adding support for GPU resource
Date Thu, 26 May 2016 20:18:24 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47869/
-----------------------------------------------------------

(Updated May 26, 2016, 8:18 p.m.)


Review request for Aurora, Joshua Cohen and Stephan Erb.


Changes
-------

I realized this feature becomes backwards incompatible once any job with a GPU resource is
created. A rolled-back scheduler cannot read the unknown GPU resource TUnion values and
fails with a null reference in IResource.build(). Unfortunately, there is no graceful way to
handle this migration without operator intervention. Even if we found a way to drop unknown
TUnion elements from the resource set, the scheduler would end up in an inconsistent state,
with GPU tasks still running in the cluster.
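
The failure mode can be illustrated with a minimal sketch. Everything in it is hypothetical:
`KNOWN_FIELDS`, `decode_resource`, `build_resource` and `load_all` are stand-ins for a union
reader and IResource.build(), not Aurora code, and the field names are assumptions. It assumes
a reader that skips union fields it does not recognize, which leaves the union with no set value.

```python
# Hypothetical stand-in for a union reader in a scheduler build that
# predates the GPU field. This is an illustration, not Aurora code.
KNOWN_FIELDS = {"numCpus", "ramMb", "diskMb", "namedPort"}


def decode_resource(wire):
    """Return (field, value) for the first known field, or None if all are unknown."""
    for field, value in wire.items():
        if field in KNOWN_FIELDS:
            return (field, value)
    return None  # unknown field was skipped, so the union has no set value


def build_resource(decoded):
    """Stand-in for IResource.build(): assumes the union always has a set field."""
    field, value = decoded  # raises TypeError when decoded is None
    return {"type": field, "value": value}


def load_all(stored):
    """Rebuild every stored resource, as a scheduler would on startup."""
    return [build_resource(decode_resource(wire)) for wire in stored]
```

In this sketch a single stored GPU entry is enough to make `load_all` fail, which mirrors why
a rollback needs operator intervention once GPU jobs exist.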

I have added a flag that keeps this feature disabled by default and fully documented the
rollback details in the release notes:
https://github.com/mkhutornenko/incubator-aurora/blob/gpu_changes/RELEASE-NOTES.md
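
A rough sketch of what gating the feature behind a flag could look like at validation time.
The names `allow_gpu_resource`, `validate_resources` and `ConfigError` are hypothetical and
chosen for illustration only; the actual flag and its semantics are described in the release
notes linked above.

```python
class ConfigError(ValueError):
    """Raised when a job configuration requests a disabled feature."""


def validate_resources(resources, allow_gpu_resource=False):
    """Reject GPU requests unless the (hypothetical) flag is enabled.

    `resources` is a plain dict such as {"cpu": 1.0, "gpu": 2}.
    """
    if resources.get("gpu", 0) > 0 and not allow_gpu_resource:
        raise ConfigError("GPU resource support is disabled")
    return resources
```

With the default value, no GPU job can be created, so a later rollback never meets stored GPU
values.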


Repository: aurora


Description
-------

This patch adds support for the Mesos GPU resource. While we have to migrate to Mesos 0.29.0
for this change to take effect, nothing prevents us from supporting it end-to-end in Aurora
now.

I have also refactored `configSummary.html` to display all resources (including ports)
in the same section. The table is populated dynamically, with optional resources (GPU, PORTS)
hidden if no values are provided for them.
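
The hiding rule can be sketched as follows. This is a hypothetical Python illustration of the
behavior described above, not the actual template or `filters.js` code; `ORDER`, `OPTIONAL` and
`resource_rows` are made-up names.

```python
# Display order for the resource table; GPU and ports are optional.
ORDER = ("cpu", "ram", "disk", "gpu", "ports")
OPTIONAL = {"gpu", "ports"}


def resource_rows(resources):
    """Return (name, value) rows, skipping optional resources with no value."""
    rows = []
    for name in ORDER:
        value = resources.get(name)
        if name in OPTIONAL and not value:
            continue  # missing, zero, or empty optional resource is hidden
        rows.append((name, value))
    return rows
```

For example, a task with only CPU, RAM and disk yields three rows, while a task that also
requests a GPU and a named port yields five.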


Diffs (updated)
-----

  RELEASE-NOTES.md 4cbf92e6556d4d84053292e26f65755d971089c0
  api/src/main/thrift/org/apache/aurora/gen/api.thrift a99889c1f2d9e10825f87ea669532ad78641880f
  docs/features/resource-isolation.md d08b59fe030dd94bad4593c1aea905c2ed66e951
  docs/reference/configuration.md e77ee603f35744b6bc6d350927a636f1fd2cf552
  examples/vagrant/mesos_config/etc_mesos-slave/resources 5bfe779fbb98822d0c58dd92e34765c5586946db
  examples/vagrant/upstart/aurora-scheduler.conf 3d9e706de564df5e24cb34265bebc0db1cad11a0
  src/main/java/org/apache/aurora/scheduler/app/AppModule.java c81bd7938c86ed48202d8d464063b64fdd21114e
  src/main/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 221417f63cc812f0ccd6dbe76ff2734d289e7dfb
  src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java 4ef202c17c322ca9800d84da31d0bc6ee832d275
  src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java 6a4f110ff461876ca14c24947f4813d5f2a0dae5
  src/main/python/apache/aurora/config/thrift.py 81a505550314c9c41f00f7c5f5bd9e093b6199c6
  src/main/python/apache/thermos/config/schema_base.py a6768e67189b0560afef844d6b269bed8ada5f2f
  src/main/python/apache/thermos/config/schema_helpers.py 46394bb8ffcd8b75a23d8d3ad2113f4fa1eacad2
  src/main/resources/scheduler/assets/configSummary.html 36df616babf9a391fa3a6b5b4ff0e49ae412ea2d
  src/main/resources/scheduler/assets/js/filters.js 98f786ef50cb8b2e0a086853c639f2d180270e15
  src/test/java/org/apache/aurora/scheduler/configuration/ConfigurationManagerTest.java ddf8143709563effc87a4f5c14c188d826f6dbe8
  src/test/java/org/apache/aurora/scheduler/resources/ResourceManagerTest.java a5dda25a4fbbafba6baa814d28bba96f51049125
  src/test/java/org/apache/aurora/scheduler/thrift/ThriftIT.java 9cce64193549e80f2493d9e58bd84f6fadad6cfd
  src/test/python/apache/aurora/config/test_thrift.py 8769db34f2bb8cc3a52ef5c3ef95b14ee808a57e
  src/test/python/apache/thermos/config/test_schema.py 0440ee525a58f7c1fd79babf0c8a1c8320cde80a
  src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 219c40fb94561f0a390cac16e643bf4332c51aad
  src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 08553e4f48f137e0455ad07287086311171c06bd
  src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora 8b3a50ba6de992560593987f3db254baa4d29a41
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh abe0ca75c6a2c0ace15fce68ad0e5c9aa98193a4


Diff: https://reviews.apache.org/r/47869/diff/


Testing
-------

unit and e2e tests


File Attachments
----------------

config_summary.png
  https://reviews.apache.org/media/uploaded/files/2016/05/26/ea0c14e2-f968-4223-8ce0-af942346547b__config_summary.png


Thanks,

Maxim Khutornenko

