systemml-issues mailing list archives

From "Matthias Boehm (JIRA)" <>
Subject [jira] [Commented] (SYSTEMML-1662) Extended HOP DAG validator
Date Thu, 08 Jun 2017 06:29:18 GMT


Matthias Boehm commented on SYSTEMML-1662:

Great, [~dhutchis] - let me clarify these one by one.

* DataOp: A DataOp can be either a persistent read/write or a transient read/write - writes
will always have at least one input, but all types can have parameters (e.g., for CSV, literals
for the delimiter, header, etc.).
* DataGenOp: A DataGenOp can be rand (or matrix constructor), sequence, or sample - these
operators have different parameters and use a map from parameter type to hop position; it would
be good to check at least the consistency between the number of map entries and the number of inputs.
* ReorgOp: In general, these operators have one input (e.g., transpose, diag, rev), but there
are certain operators - specifically sort (i.e., order) and reshape - which take additional
parameters such as the order-by column and the target dimensions.
* TernaryOp: Yes, generally, these operators have three operands, with the exception of ctable,
which additionally takes target dimensions (for padding and pruning).
* QuaternaryOp: Similarly, QuaternaryOps can have three or four inputs.
* SpoofFusedOp: Yes, in general, fused operators can have one or more inputs. However, specific
types of fused operators have specific constraints - for example, the OuterProduct template
type will always have at least three inputs (a sparse driver and two matrices for the
outer-product-like matrix multiply).

So maybe it's a good idea to create a method that returns whether the number of inputs is correct,
instead of a method that returns the expected number of inputs - this way, we can easily check for
consistency within each hop in an operation-specific manner.
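
A minimal sketch of such a per-hop check is shown below. All class and method names
({{SketchHop}}, {{hasValidInputCount}}, {{InputCountCheck}}, etc.) are hypothetical stand-ins for
illustration and are not the actual SystemML classes. The rule that sort and reshape need more
than one input, and that literals have none, are assumptions; the exact counts would have to be
taken from the real operators.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-ins for hops; NOT the real SystemML classes.
abstract class SketchHop {
    private final List<SketchHop> inputs = new ArrayList<>();

    SketchHop addInput(SketchHop in) {
        inputs.add(in);
        return this;
    }

    List<SketchHop> getInputs() {
        return Collections.unmodifiableList(inputs);
    }

    // True if the current number of inputs is valid for this specific operator.
    abstract boolean hasValidInputCount();
}

class SketchLiteral extends SketchHop {
    @Override
    boolean hasValidInputCount() {
        return getInputs().isEmpty(); // assumption: literals are leaves
    }
}

class SketchReorgOp extends SketchHop {
    enum Kind { TRANSPOSE, DIAG, REV, SORT, RESHAPE }

    private final Kind kind;

    SketchReorgOp(Kind kind) {
        this.kind = kind;
    }

    @Override
    boolean hasValidInputCount() {
        int n = getInputs().size();
        switch (kind) {
            case SORT:
            case RESHAPE:
                return n > 1;  // assumption: data input plus additional parameter inputs
            default:
                return n == 1; // transpose, diag, rev: exactly one input
        }
    }
}

class SketchQuaternaryOp extends SketchHop {
    @Override
    boolean hasValidInputCount() {
        int n = getInputs().size();
        return n == 3 || n == 4; // three or four inputs
    }
}

class SketchDataGenOp extends SketchHop {
    // Maps a parameter name to the position of its hop in the input list.
    private final Map<String, Integer> paramIndex = new HashMap<>();

    SketchDataGenOp addParam(String name, SketchHop in) {
        paramIndex.put(name, getInputs().size());
        addInput(in);
        return this;
    }

    @Override
    boolean hasValidInputCount() {
        // Consistency between the number of map entries and the number of inputs.
        return paramIndex.size() == getInputs().size();
    }
}

final class InputCountCheck {
    // Walks the DAG once and collects every hop whose input count is invalid.
    static List<SketchHop> findInvalid(SketchHop root) {
        List<SketchHop> invalid = new ArrayList<>();
        Set<SketchHop> visited =
            Collections.newSetFromMap(new IdentityHashMap<SketchHop, Boolean>());
        collect(root, visited, invalid);
        return invalid;
    }

    private static void collect(SketchHop hop, Set<SketchHop> visited, List<SketchHop> invalid) {
        if (!visited.add(hop)) return; // shared sub-DAGs are checked only once
        if (!hop.hasValidInputCount()) invalid.add(hop);
        for (SketchHop in : hop.getInputs()) collect(in, visited, invalid);
    }
}
```

With this shape, the validator itself stays generic: it only calls {{hasValidInputCount}} on every
hop it visits, and each operator encodes its own exceptions (e.g., the three-or-four-input rule
for quaternary operators) locally.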

> Extended HOP DAG validator
> --------------------------
>                 Key: SYSTEMML-1662
>                 URL:
>             Project: SystemML
>          Issue Type: Sub-task
>          Components: Compiler
>            Reporter: Matthias Boehm
>              Labels: beginner
>             Fix For: SystemML 1.0
> This task aims to extend the existing HOP DAG validator (see {{org.apache.sysml.hops.rewrite.HopDagValidator}},
which can be enabled via {{org.apache.sysml.hops.rewrite.ProgramRewriter.CHECK}}) in various
ways in order to provide better developer tooling for checking the correctness of new and
existing rewrites.
> So far, this validator checks only for:
> * Correct parent node linking
> * Correct child node linking
> * Non-empty children (for all hops other than {{DataOp}} and {{LiteralOp}})
> Possible extensions include (but are not limited to):
> * Correct HOP output data types
> * Correct HOP output value types
> * Correct number of expected child nodes
> * Correct output size information with respect to input sizes
> * Correct visit status 
> These extensions would be very useful for multiple reasons. First, they would detect
rewrite issues early on in the development process. This is important because rewrite issues
usually lead to strange and non-obvious behavior of real application scripts. Second, the
HOP DAG validator provides a systematic way of debugging optimizer issues. The intended future
workflow is as follows:
> * 1. Disable rewrites via optimization level 1 to determine if rewrites are the issue.
> * 2. Use the extended {{HopDagValidator}} to find the source of corruption.
> * 3. If (2) does not find the issue, resort to low-level debugging, and extend the {{HopDagValidator}}
to capture the root cause of the issue.

This message was sent by Atlassian JIRA
