Hi there,
I learned about SystemML and its optimizer from the recent SPOOF paper. The gist I
absorbed is that SystemML translates linear algebra expressions given by
its DML to relational algebra, then applies standard relational algebra
optimizations, and then re-recognizes the result in linear algebra kernels,
with an attempt to fuse them.
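To make my understanding concrete, here is a toy sketch (entirely my own invention, not SystemML code) of the kind of algebraic rewrite over an expression tree I have in mind, e.g. eliminating a double transpose t(t(X)) -> X:

```python
# Toy illustration of an algebraic rewrite over an expression tree,
# in the spirit of HOP rewrites. All names here are hypothetical.

class Op:
    def __init__(self, name, *children):
        self.name = name
        self.children = list(children)

    def __repr__(self):
        if not self.children:
            return self.name
        return f"{self.name}({', '.join(map(repr, self.children))})"

def rewrite_double_transpose(node):
    """Rewrite t(t(X)) -> X, recursing bottom-up."""
    node.children = [rewrite_double_transpose(c) for c in node.children]
    if node.name == "t" and node.children[0].name == "t":
        return node.children[0].children[0]
    return node

expr = Op("t", Op("t", Op("X")))
print(rewrite_double_transpose(expr))  # prints: X
```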
I think I found the SystemML rewrite rules here.
A few questions:
1. It appears that SystemML rewrites HOP expressions destructively,
i.e., by throwing away the old expression. In this case, how does SystemML
determine the order of rewrites to apply? Where does cost-based
optimization come into play?
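For context on what I mean by order-dependence, here is my mental model of destructive, sequenced rewriting, as a toy Python sketch (invented for illustration, not SystemML's actual code): each rule replaces the plan in place, so the position of a rule in the list can change the final plan.

```python
# Hypothetical sketch of destructive, ordered rewriting: rules run in a
# fixed sequence, each discarding the previous plan. Not SystemML code.

def apply_rules(plan, rules):
    """Run each rule once, in order, feeding the rewritten plan forward."""
    for rule in rules:
        plan = rule(plan)  # previous plan is discarded (destructive)
    return plan

def drop_noop(plan):
    """Toy rule: remove no-op operators."""
    return [op for op in plan if op != "noop"]

def fuse_mul_add(plan):
    """Toy rule: fuse adjacent mul+add into a single fma operator."""
    out, i = [], 0
    while i < len(plan):
        if plan[i:i + 2] == ["mul", "add"]:
            out.append("fma")
            i += 2
        else:
            out.append(plan[i])
            i += 1
    return out

plan = ["mul", "noop", "add"]
print(apply_rules(plan, [drop_noop, fuse_mul_add]))  # ['fma']
print(apply_rules(plan, [fuse_mul_add, drop_noop]))  # ['mul', 'add']
```

The two orderings produce different plans, which is exactly why I'm curious how SystemML decides the rule order.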
2. Is there a way to "debug/visualize" the optimization process? That
is, when I start with a DML program, can I view (a) the DML program parsed
into HOPs; (b) what rules fire and where in the plan, as well as the plan
after each rule fires; and (c) the lowering and fusing of operators to LOPs?
I know this is a lot to ask for; I'm curious how far SystemML has gone
in this direction.
3. Is there any relationship between the SystemML optimizer and Apache
Calcite? If not, I'd love to understand
the design decisions that differentiate the two.
Thanks,
Dylan Hutchison