systemml-issues mailing list archives

From "Mike Dusenberry (JIRA)" <>
Subject [jira] [Updated] (SYSTEMML-1554) IPA Scalar Transient Read Replacement
Date Fri, 21 Apr 2017 23:12:04 GMT


Mike Dusenberry updated SYSTEMML-1554:

I'm also attaching a larger example of an actual convolutional net with the statistics w/
and w/o this proposed IPA scalar replacement, as well as w/ and w/o forced {{REMOTE_SPARK}}
constrained parfor loops.

> IPA Scalar Transient Read Replacement
> -------------------------------------
>                 Key: SYSTEMML-1554
>                 URL:
>             Project: SystemML
>          Issue Type: Improvement
>            Reporter: Mike Dusenberry
>         Attachments: parfor_oom_convnet_plan.txt, parfor_oom_plan.txt,
> Currently, during IPA we collect all variables (scalars & matrices) eligible for
> propagation across blocks (i.e., not updated within a block), and then propagate only the
> matrix sizes across the blocks.  It seems plausible that we could also replace all eligible
> scalar transient reads with literals based on the variables that have already been collected.
> The benefit is that many ops would be able to determine their respective output sizes during
> regular compilation, instead of having to wait until dynamic recompilation, thus reducing
> the pressure on dynamic recompilation.
> Are there drawbacks to this approach?  The use case is that I was seeing a large number
> of memory warnings while training a convolutional net due to the sizes being unknown during
> regular compilation, yet the engine only having CP versions of the ops.  Additionally, I was
> running into actual heap space OOM errors for situations that should not run out of memory,
> and thus I started exploring.
> I've attached an example script and the explain plan (hops & runtime) w/ and w/o
> the IPA scalar replacement.
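The proposal above can be sketched as a two-step pass: collect scalars that are never updated across blocks, then rewrite transient reads of those scalars as literals so that op output sizes become known during regular compilation. The following is an illustrative Python mock-up of that idea only; the block/op dictionaries and function names are invented for illustration and are not SystemML's actual IPA code.

```python
# Hypothetical mini-IR: a program is a list of blocks; each block has
# scalar assignments and ops whose output sizes reference scalar names.
# This mirrors the proposed pass conceptually, not SystemML internals.

def collect_propagable_scalars(blocks):
    """Scalars assigned exactly once and never updated again are safe
    to propagate across blocks as literals."""
    values, updated = {}, set()
    for block in blocks:
        for name, val in block["assigns"].items():
            if name in values:
                updated.add(name)  # reassigned -> not propagable
            else:
                values[name] = val
    return {n: v for n, v in values.items() if n not in updated}

def replace_scalar_reads(blocks, literals):
    """Rewrite transient scalar reads in op size expressions as literals."""
    for block in blocks:
        block["ops"] = [
            {**op, "dims": tuple(literals.get(d, d) for d in op["dims"])}
            for op in block["ops"]
        ]
    return blocks

# Example: 'rows'/'cols' are set once in the first block; the rand() op
# in the second block reads them, so after replacement its output size
# is known at regular compile time instead of only at dynamic recompile.
program = [
    {"assigns": {"rows": 1000, "cols": 500}, "ops": []},
    {"assigns": {}, "ops": [{"op": "rand", "dims": ("rows", "cols")}]},
]
literals = collect_propagable_scalars(program)
program = replace_scalar_reads(program, literals)
print(program[1]["ops"][0]["dims"])  # -> (1000, 500)
```

In this sketch a scalar reassigned in any later block drops out of the literal set, which matches the eligibility condition quoted above (variables not updated in a block).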

This message was sent by Atlassian JIRA
