reef-dev mailing list archives

From Reed Umbrasas <ryum...@microsoft.com.INVALID>
Subject Azure Batch Application Packages
Date Thu, 25 Jan 2018 19:29:11 GMT
Hi,

As we develop the Azure Batch runtime for REEF, I have been looking into the best mechanism
for submitting the shaded JAR to Azure Batch. There are two options:


  1.  Use Azure Batch Application packages (https://docs.microsoft.com/en-us/azure/batch/batch-application-packages),
which is the recommended way to submit application files to Azure Batch nodes.
  2.  Store the JAR in Blob storage and pass its SAS URI to each task as a Resource File (https://docs.microsoft.com/en-us/azure/batch/batch-api-basics#task).

I am listing some pros and cons below. Essentially, it's a trade-off between configuration
simplicity and performance. Please let us know your thoughts.

Application Packages:
Pros:

  1.  Their intent exactly matches our use case.
  2.  Each node downloads a given application package only once during the application's lifetime;
if each node runs hundreds or thousands of evaluators, that translates into time and bandwidth savings.
Cons:

  1.  Somewhat more complex configuration. To create an application package in Azure
Batch, we need to call the Batch management APIs, which require service principal authentication.
(The data plane APIs require only the Batch account key.)
  2.  Batch imposes a limit of 20 applications per account and 40 versions per application,
so we would need to delete the packages we create before the Driver completes.
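
To make the cleanup requirement concrete, here is a minimal sketch of how the Driver could guard against the version limit and always delete the package it created. PackageStore and PackageLifecycle are hypothetical names introduced only for illustration; they stand in for the Batch management calls and are not part of any Azure SDK or of REEF.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Batch management calls; NOT a real Azure SDK interface.
interface PackageStore {
    void create(String applicationId, String version);
    void delete(String applicationId, String version);
}

final class PackageLifecycle {
    // Per-application version limit quoted in this thread.
    static final int MAX_VERSIONS_PER_APPLICATION = 40;

    // Creates a package version, runs the driver body, and deletes the version
    // afterwards even if the driver body throws.
    static void runWithPackage(PackageStore store, String applicationId, String version,
                               int existingVersions, Runnable driverBody) {
        if (existingVersions >= MAX_VERSIONS_PER_APPLICATION) {
            throw new IllegalStateException("Version limit reached for " + applicationId);
        }
        store.create(applicationId, version);
        try {
            driverBody.run();
        } finally {
            store.delete(applicationId, version);
        }
    }

    public static void main(String[] args) {
        // In-memory store that only records the calls it receives.
        List<String> calls = new ArrayList<>();
        PackageStore recording = new PackageStore() {
            @Override public void create(String a, String v) { calls.add("create " + a + "/" + v); }
            @Override public void delete(String a, String v) { calls.add("delete " + a + "/" + v); }
        };
        runWithPackage(recording, "reef", "job-1", 3, () -> calls.add("driver ran"));
        System.out.println(calls);
    }
}
```

Note that a finally block does not run if the Driver process is killed, so leaked versions would still count against the quota and some out-of-band cleanup would probably be needed as well.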

Storage SAS URI:
Pros:

  1.  Simpler configuration - all we need is a storage account name and key.
  2.  No cleanup work necessary.
Cons:

  1.  Batch will download the JAR file every time a task runs, which will negatively impact
performance.
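
As a rough way to size the trade-off, the sketch below compares the total bytes downloaded under each option. It assumes the behavior described above (one download per node for an application package, one download per task for a resource file). The class name JarDownloadEstimate and the numbers in main are hypothetical, not measurements.

```java
final class JarDownloadEstimate {
    // Application package: each node downloads the JAR once.
    static long applicationPackageBytes(long jarBytes, int nodes) {
        return Math.multiplyExact(jarBytes, (long) nodes);
    }

    // Resource file: the JAR is downloaded once per task, on every node.
    static long resourceFileBytes(long jarBytes, int nodes, int tasksPerNode) {
        return Math.multiplyExact(applicationPackageBytes(jarBytes, nodes), (long) tasksPerNode);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 50 MiB shaded JAR, 10 nodes, 500 evaluators per node.
        long jarBytes = 50L * 1024 * 1024;
        int nodes = 10;
        int tasksPerNode = 500;
        System.out.println("application package: " + applicationPackageBytes(jarBytes, nodes) + " bytes");
        System.out.println("resource file:       " + resourceFileBytes(jarBytes, nodes, tasksPerNode) + " bytes");
    }
}
```

Under these assumptions the resource-file option transfers tasksPerNode times as much data, so the gap only matters when each node runs many evaluators.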

Thanks,
Reed
