reef-dev mailing list archives

From Markus Weimer <mar...@weimo.de>
Subject Re: Azure Batch Application Packages
Date Fri, 26 Jan 2018 01:17:26 GMT
We currently put per-evaluator information into the JAR files, so I
suggest you go with the Storage approach for now. We can revisit if / when
that turns out to be a bottleneck.

Markus

On Thu, Jan 25, 2018 at 11:29 AM, Reed Umbrasas <ryumbra@microsoft.com.invalid> wrote:

> Hi,
>
> As we are developing the Azure Batch runtime for REEF, I was looking into
> the best mechanism for submitting the shaded JAR to Azure Batch. There
> are two ways:
>
>
>   1.  Use Azure Batch Application packages
> (https://docs.microsoft.com/en-us/azure/batch/batch-application-packages)
> which is the recommended way to submit application files to Azure Batch nodes.
>   2.  Store the JAR in Blob and give the SAS URI to each task as its
> Resource File
> (https://docs.microsoft.com/en-us/azure/batch/batch-api-basics#task).
>
> I am listing some pros and cons below. Essentially, it's a trade-off
> between configuration simplicity and performance. Please let us know your
> thoughts.
>
> Application Packages:
> Pros:
>
>   1.  Their intent exactly matches our use case.
>   2.  Each node will download a given application only once during
> application runtime; if each node runs hundreds or thousands of evaluators,
> that translates to time and bandwidth savings.
> Cons:
>
>   1.  Somewhat more complex configuration. In order to create an
> application package in Azure Batch, we'll need to call the Batch management
> APIs, which require service principal authentication. (The data plane APIs
> require only the Batch key.)
>   2.  Batch imposes a limit of 20 applications per account and 40 versions
> per application, so we would need to do cleanup work before the Driver
> completes.
>
> Storage SAS URI:
> Pros:
>
>   1.  Simpler configuration - all we need is a storage account name and
> key.
>   2.  No cleanup work necessary.
> Cons:
>
>   1.  Batch will download the JAR file every time a task is run, which will
> negatively impact performance.
>
> Thanks,
> Reed
>
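
To put rough numbers on the trade-off described in the quoted message, here is a minimal sketch of the transfer volume under each approach. It assumes one Batch task per evaluator, and the node count, evaluators per node, and JAR size are hypothetical figures chosen for illustration, not measurements from REEF or Azure Batch. The function names are made up for this example.

```python
def bytes_with_resource_files(tasks: int, jar_bytes: int) -> int:
    """Storage SAS URI approach: the JAR is downloaded once per task."""
    return tasks * jar_bytes


def bytes_with_application_package(nodes: int, jar_bytes: int) -> int:
    """Application package approach: the JAR is downloaded once per node."""
    return nodes * jar_bytes


# Hypothetical cluster shape (assumptions, not figures from the thread).
nodes = 10
evaluators_per_node = 100
jar_bytes = 50 * 1024 * 1024  # a 50 MiB shaded JAR

# Assumption: each evaluator runs as its own Batch task.
tasks = nodes * evaluators_per_node

per_task_total = bytes_with_resource_files(tasks, jar_bytes)
per_node_total = bytes_with_application_package(nodes, jar_bytes)

# With identical JARs, the ratio is simply the number of evaluators per node.
ratio = per_task_total // per_node_total
print(f"resource files: {per_task_total} bytes")
print(f"application package: {per_node_total} bytes")
print(f"ratio: {ratio}x")
```

This model only holds when every task on a node uses the same JAR. As the reply notes, the JARs currently carry per-evaluator information, so a once-per-node download would not directly apply to them.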
