systemml-issues mailing list archives

From "Glenn Weidner (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SYSTEMML-1727) Wrong mvvar instruction compilation for persistent writes
Date Sat, 09 Sep 2017 01:53:00 GMT

     [ https://issues.apache.org/jira/browse/SYSTEMML-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Glenn Weidner updated SYSTEMML-1727:
------------------------------------
    Fix Version/s:     (was: SystemML 1.0)
                   SystemML 0.15

> Wrong mvvar instruction compilation for persistent writes
> ---------------------------------------------------------
>
>                 Key: SYSTEMML-1727
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1727
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>             Fix For: SystemML 0.15
>
>
> Currently, we compile persistent writes in binary format that read from transient reads to mvvar instructions, which are supposed to be metadata operations on HDFS. However, this comes with two fundamental problems:
> * If the scratch space and the persistent write location use different file URI schemes, we cannot use a rename at all and must read and write the matrix explicitly. For large data, this ultimately leads to OOMs.
> * For scripts where intermediates are fed into such persistent writes but subsequently used by other operations, this can lead to missing inputs because the intermediate no longer exists under the given temporary filename.
> An example where scripts fail for the second reason is given below:
> {code}
> PROGRAM
> --MAIN PROGRAM
> ----GENERIC (lines 1-1) [recompile=false]
> ------(8) dg(rand) [1000000,1000,1000,1000,1000000000] [0,0,7629 -> 7629MB], CP
> ------(9) TWrite X (8) [1000000,1000,1000,1000,1000000000] [7629,0,0 -> 7629MB], CP
> ----GENERIC (lines 5-5) [recompile=false]
> ----GENERIC (lines 9-9) [recompile=false]
> ------(17) TRead X [1000000,1000,1000,1000,1000000000] [0,0,7629 -> 7629MB], CP
> ------(20) PWrite X (17) [1000000,1000,1000,1000,1000000000] [7629,0,0 -> 7629MB], CP
> ----GENERIC (lines 13-13) [recompile=false]
> ------(24) TRead X [1000000,1000,1000,1000,1000000000] [0,0,7629 -> 7629MB], CP
> ------(26) b(+) (24) [1000000,1000,1000,1000,-1] [7629,0,7629 -> 15259MB], CP
> ------(27) ua(+RC) (26) [0,0,-1,-1,-1] [7629,0,0 -> 7629MB], CP
> ------(28) u(print) (27) [-1,-1,-1,-1,-1] [0,0,0 -> 0MB]
> {code}
> This task aims to fix both related issues by reworking the generation of rmvar instructions in favor of explicit write instructions.
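
The two problems described above can be illustrated with a small toy model. This is only a sketch using local files and Python's standard library, not SystemML code: the function names (`same_scheme`, `persistent_write_by_rename`, `persistent_write_by_copy`, `intermediate_survives`) and the file names are hypothetical and chosen for illustration. It assumes that a rename-style write moves the temporary file away, while an explicit write leaves it in place.

```python
import os
import shutil
import tempfile
from urllib.parse import urlparse


def same_scheme(src: str, dst: str) -> bool:
    """Problem 1: a rename is only an option if both URIs share a scheme."""
    return urlparse(src).scheme == urlparse(dst).scheme


def persistent_write_by_rename(src_path: str, dst_path: str) -> None:
    """Models a rename-style write: the temporary file is moved away."""
    os.rename(src_path, dst_path)


def persistent_write_by_copy(src_path: str, dst_path: str) -> None:
    """Models an explicit write: the temporary file stays in place."""
    shutil.copyfile(src_path, dst_path)


def intermediate_survives(write_fn) -> bool:
    """Problem 2: is the intermediate still readable after the write?"""
    with tempfile.TemporaryDirectory() as d:
        tmp = os.path.join(d, "scratch_X")  # hypothetical scratch-space file
        out = os.path.join(d, "out_X")      # hypothetical persistent target
        with open(tmp, "w") as f:
            f.write("matrix data")
        write_fn(tmp, out)
        # A later operation on X would need the file under its old name.
        return os.path.exists(tmp)


if __name__ == "__main__":
    print(same_scheme("hdfs://nn/scratch/X", "file:///out/X"))
    print(intermediate_survives(persistent_write_by_rename))
    print(intermediate_survives(persistent_write_by_copy))
```

In this model, the rename variant leaves a subsequent reader of the intermediate without its input, which mirrors the failure in the plan above where `TRead X` follows `PWrite X`. The copy variant keeps the intermediate available at the cost of actually writing the data.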



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
