mxnet-dev mailing list archives

From Anirudh Acharya
Subject HPO for MXNet models
Date Wed, 13 Mar 2019 22:43:10 GMT
Hi All,

I posted this earlier on the MXNet Slack channel; based on a suggestion
there, I am reposting it here for a wider audience.

I was searching for ways of performing hyperparameter optimization (HPO)
for models built with MXNet, and I came across Sherpa, an open source
distributed HPO library presented at NeurIPS 2018.

I have been trying it out, and it is easy to use and extensible. It
already supports random search, grid search, and Bayesian optimization for
searching the hyperparameter space.
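
For readers less familiar with these strategies, here is a minimal pure-Python sketch of grid search and random search. It does not use Sherpa's API: the function names, the toy objective, and the search space are all assumptions made for illustration, and a real run would train a gluon model where the toy objective is evaluated.

```python
import itertools
import math
import random


def grid_search(space, objective):
    """Evaluate every combination in a dict of name -> list of candidate values."""
    names = list(space)
    best = None
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(params)
        if best is None or score < best[1]:
            best = (params, score)
    return best  # (params, score), or None if the space is empty


def random_search(sample, objective, num_trials, seed=0):
    """Evaluate num_trials configurations drawn by the sample(rng) function."""
    rng = random.Random(seed)
    best = None
    for _ in range(num_trials):
        params = sample(rng)
        score = objective(params)
        if best is None or score < best[1]:
            best = (params, score)
    return best


def toy_objective(params):
    # Hypothetical stand-in for a validation loss (lower is better).
    # It is smallest near lr = 1e-2 and batch_size = 64.
    return (math.log10(params["lr"]) + 2) ** 2 + (params["batch_size"] - 64) ** 2 / 1e4


def sample_params(rng):
    # Learning rate is sampled log-uniformly, batch size from a fixed set.
    return {"lr": 10 ** rng.uniform(-4, -1), "batch_size": rng.choice([32, 64, 128])}


grid_best = grid_search({"lr": [1e-3, 1e-2, 1e-1], "batch_size": [32, 64, 128]}, toy_objective)
random_best = random_search(sample_params, toy_objective, num_trials=20)
```

Grid search is exhaustive over a fixed set of values, while random search samples the space a fixed number of times; Bayesian optimization additionally uses earlier results to choose the next configuration and is not sketched here.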

I have submitted a PR with an example gluon use case, but I have yet to
try it with large distributed training workloads. The library does support
a distributed mode for running heavy workloads.

It also comes with a neat UI dashboard to monitor the jobs being run.

[image: Screen Shot 2019-03-13 at 8.08.48 AM.png]

I think we should explore this as an option for performing HPO with gluon.

What might integration entail?
1. I have not fully evaluated what changes might be necessary, but I think
the integration can be fairly unobtrusive for both repositories. As
demonstrated above, we can already use Sherpa for performing HPO, but the
experience is a bit clunky. It can be made smoother by adding a few
callback functions that track and log the metrics of the different
experiment runs (à la the Keras callback function).

2. The library is developed and maintained by folks in academia and is
published under the GPL license. I was given to understand that the GPL
license might be a problem for Apache products, but since we would not be
using it within MXNet as a sub-component, I think we might have some
wiggle room there.
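
To make the callback idea in point 1 more concrete, here is a rough sketch of what a metric-tracking callback could look like. Everything in it is hypothetical: `TrialMetricLogger`, `on_epoch_end`, and `fake_training_loop` are names invented for illustration and are not part of MXNet or Sherpa, and the loop is only a stand-in for a real gluon training loop.

```python
class TrialMetricLogger:
    """Hypothetical callback that collects per-epoch metrics for one HPO trial."""

    def __init__(self, trial_id, params):
        self.trial_id = trial_id
        self.params = dict(params)
        self.history = []

    def on_epoch_end(self, epoch, metrics):
        # Store one row per epoch so an HPO library (or dashboard) can read it later.
        self.history.append({"epoch": epoch, **metrics})

    def best(self, key, lower_is_better=True):
        values = [row[key] for row in self.history]
        return min(values) if lower_is_better else max(values)


def fake_training_loop(params, num_epochs, callbacks):
    # Stand-in for a gluon training loop: the "loss" simply decays each epoch
    # at a rate set by the learning rate, then every callback is notified.
    loss = 1.0
    for epoch in range(1, num_epochs + 1):
        loss *= 1.0 - params["lr"]
        for callback in callbacks:
            callback.on_epoch_end(epoch, {"val_loss": loss})


logger = TrialMetricLogger(trial_id=1, params={"lr": 0.1})
fake_training_loop(logger.params, num_epochs=3, callbacks=[logger])
```

In an actual integration, `on_epoch_end` would presumably forward the metrics to the HPO library rather than keep them in a local list, so that the user's training loop needs only one extra call per epoch.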

MXNet needs HPO functionality, and instead of building something from
scratch we could use existing open source projects. I would like to hear
more from the community.

Anirudh Acharya
