systemml-issues mailing list archives

From "Matthias Boehm (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SYSTEMML-1444) UDFs w/ single output in expressions
Date Fri, 28 Jul 2017 01:57:00 GMT

    [ https://issues.apache.org/jira/browse/SYSTEMML-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104312#comment-16104312 ]

Matthias Boehm commented on SYSTEMML-1444:
------------------------------------------

Great - after thinking about the design a little more, I'd recommend going with approach
(2), which would handle functions with a single output like any other hop, while multi-output
functions would keep using the same mechanism as today.

In detail, this would entail the following steps (which can be created as subtasks and addressed
via PRs individually):

a) Hop/Lop extensions: Extend the existing {{FunctionOp}} to be used in two modes (single
and multi output). Only in multi-output mode is the list of outputs used (always DAG outputs),
while in single-output mode the {{FunctionOp}} can be used as input to any other HOP and hence
be used in expressions. Besides changing the construction of hops, this also requires some
minor extensions to the lop construction and instruction generation (e.g., using the compiler-provided
name of temporary outputs when generating single-output instructions). At this point, all
{{FunctionOps}} would still be created as multi-output functions at language level.
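
As a rough illustration of the two modes in (a), here is a minimal sketch. The classes
{{SketchHop}} and {{SketchFunctionOp}}, their fields, and the method names are hypothetical
stand-ins invented for this example; they are not SystemML's actual {{Hop}} or {{FunctionOp}}
API, and the real change would also touch lop construction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for a hop node (NOT SystemML's Hop class).
class SketchHop {
    final String name;
    final List<SketchHop> inputs = new ArrayList<>();

    SketchHop(String name, SketchHop... in) {
        this.name = name;
        inputs.addAll(Arrays.asList(in));
    }
}

// Hypothetical function-call hop that can be built in one of two modes.
class SketchFunctionOp extends SketchHop {
    enum Mode { SINGLE_OUTPUT, MULTI_OUTPUT }

    final Mode mode;
    final List<String> outputNames; // only used in multi-output mode

    private SketchFunctionOp(String fname, Mode mode, List<String> outputs, SketchHop... in) {
        super(fname, in);
        this.mode = mode;
        this.outputNames = outputs;
    }

    // Multi-output mode: outputs are bound to named variables, as today.
    static SketchFunctionOp multi(String fname, List<String> outputs, SketchHop... in) {
        return new SketchFunctionOp(fname, Mode.MULTI_OUTPUT, new ArrayList<>(outputs), in);
    }

    // Single-output mode: no output list; the hop itself carries the result.
    static SketchFunctionOp single(String fname, SketchHop... in) {
        return new SketchFunctionOp(fname, Mode.SINGLE_OUTPUT, new ArrayList<String>(), in);
    }

    // Only a single-output call may serve as input to another hop.
    boolean usableInExpression() {
        return mode == Mode.SINGLE_OUTPUT;
    }

    // Output variables of the generated instruction: the bound names in
    // multi-output mode, or the compiler-provided temporary name otherwise.
    List<String> instructionOutputs(String compilerTempName) {
        return mode == Mode.MULTI_OUTPUT
            ? outputNames
            : Collections.singletonList(compilerTempName);
    }
}
```

With this shape, an expression such as foo(X) + X would become a binary hop whose first input
is the single-output function hop, whereas a multi-output call stays a DAG output.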

b) Language changes / tests: Building on the HOP and LOP changes, we can then construct
differently configured HOPs for single-output functions at language level (see DMLTranslator).
In order to use single-output functions, we likely also need some changes to validation. This
step should also introduce a couple of tests for functions in expressions.

c) Size propagation and IPA: Having functions in expressions poses a challenge to size propagation
because there is no natural recompilation point after the function call anymore. We should
address this as follows: First, flag dimension-preserving {{FunctionOps}} during {{InterProceduralAnalysis}}
and accordingly modify {{FunctionOp.refreshSizeInformation}} and {{FunctionOp.inferOutputCharacteristics}}
to allow size propagation over {{FunctionOps}} during dynamic recompilation. Second, introduce
a rewrite to split DAGs after {{FunctionOps}} that return matrices/frames and are not dimension-preserving
(see {{RewriteSplitDagDataDependentOperators}} for an example). 
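
To make the two parts of (c) concrete, here is a minimal sketch of the intended decision
logic. {{SizedHop}}, {{SizedFunctionOp}}, the two boolean flags, and {{requiresDagSplit}}
are hypothetical names for this example only; the sketch also assumes a single input and
uses -1 for unknown dimensions. It does not reproduce the actual signatures of
{{FunctionOp.refreshSizeInformation}} or of the existing rewrite.

```java
// Hypothetical, simplified stand-in for a hop with size information
// (NOT SystemML's Hop class); -1 denotes an unknown dimension.
class SizedHop {
    long rows;
    long cols;

    SizedHop(long rows, long cols) {
        this.rows = rows;
        this.cols = cols;
    }

    boolean dimsKnown() {
        return rows >= 0 && cols >= 0;
    }
}

// Hypothetical function-call hop carrying the flags discussed in (c).
class SizedFunctionOp extends SizedHop {
    final SizedHop input;
    final boolean dimsPreserving;       // assumed to be set by inter-procedural analysis
    final boolean returnsMatrixOrFrame; // false for scalar-returning functions

    SizedFunctionOp(SizedHop input, boolean dimsPreserving, boolean returnsMatrixOrFrame) {
        super(-1, -1); // output size unknown until refreshed
        this.input = input;
        this.dimsPreserving = dimsPreserving;
        this.returnsMatrixOrFrame = returnsMatrixOrFrame;
    }

    // First part: propagate sizes across dimension-preserving calls so that
    // consumers in the same DAG see known dimensions during recompilation.
    void refreshSizeInformation() {
        if (dimsPreserving) {
            rows = input.rows;
            cols = input.cols;
        } else {
            rows = -1;
            cols = -1;
        }
    }

    // Second part: calls that return matrices/frames and are not
    // dimension-preserving are candidates for a DAG split after the call,
    // which restores a recompilation point.
    boolean requiresDagSplit() {
        return returnsMatrixOrFrame && !dimsPreserving;
    }
}
```

Under these assumptions, a dimension-preserving call on a 100 x 10 input reports a 100 x 10
output after refresh and needs no split, while a non-preserving matrix function keeps unknown
dimensions and is marked for a split.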

Finally, let's separate the discussion on "structs" (or "tuples") as it's not really related.
We would most likely implement structs as syntactic sugar at parser level. In contrast,
the discussion above was referring to multi outputs in HOP and LOP DAGs, which is much more
involved.

> UDFs w/ single output in expressions
> ------------------------------------
>
>                 Key: SYSTEMML-1444
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1444
>             Project: SystemML
>          Issue Type: Sub-task
>          Components: APIs, Compiler, Runtime
>            Reporter: Matthias Boehm
>            Assignee: Janardhan
>             Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
