mahout-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: elementwise operator improvements experiments
Date Mon, 17 Nov 2014 10:16:43 GMT
Isn't it true that sparse iteration should always be used for m := f iff

1) the matrix argument is sparse

AND

2) f(0) == 0

?

Why the need for syntactic notation at all?  This property is much easier
to test than commutativity.


On Sun, Nov 16, 2014 at 7:42 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> another thing is that optimizer isn't capable of figuring out all
> elementwise fusions in an elementwise expression, e.g. it is not seing
> commutativity rewrites such as
>
> A * B * A
>
> should optimally be computed as sqr(A) * B (it will do it as two pairwise
> operators (A*B)*A).
>
> Bummer.
>
> To do it truly right, it needs to fuse entire elementwise expressions first
> and then optmize them separately.
>
> Ok that's probably too much for now. I am quite ok with writing something
> like -0.5 * (a * a ) for now.
>
> On Sat, Nov 15, 2014 at 10:14 AM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> wrote:
>
> > PS
> >
> > actually applying an exponent funciton in place will require addtional
> > underscore it looks. It doesn't want to treat function name as function
> > type in this context for some reason (although it does not require
> partial
> > syntax when used in arguments inside parenthesis):
> >
> > m := exp _
> >
> > Scala is quirky this way i guess
> >
> >
> > On Sat, Nov 15, 2014 at 10:02 AM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> > wrote:
> >
> >> So i did quick experimentation with elementwise operator improvements:
> >>
> >> (1) stuff like 1 + exp (M):
> >>
> >> (1a): this requires generalization in optimizer for elementwise unary
> >> operators. I've added things like notion if operators require non-zero
> >> iteration only or not.
> >>
> >> (1b): added fusion of elemntwise operators, i.e.
> >>    ew(1+, ew(exp, A)) is rewritten as ew (1+exp, A) for performance
> >> reasons. It still uses an application of a fold over functional monoid,
> but
> >> i think it should be fairly ok performance/DSL trade-off here. to get it
> >> even better, we may add functional assignment syntax to distributed
> >> operands similar to in-memory types as descrbed further down.
> >>
> >> (1c): notion that self elementwise things such as expr1 * expr1 (which
> is
> >> surprisingly often ocurrence, e..g in Torgerson MDS) are rewritten as
> ew(A,
> >> square) etc.
> >>
> >> So that much works. (Note that this also obsoletes dedicated
> >> scalar/matrix elementwise operators that there currently are). Good.
> >>
> >>
> >> The problem here is that (of course!) semantics of the scala language
> has
> >> problem importing something like exp(Double):Double alongside with
> >> exp(DRM):DRM apparently because it doesn't adhere to overloading rules
> >> (different results) so in practice even though it is allowed, one import
> >> overshadows the other.
> >>
> >> Which means, for the sake of DSL we can't have exp(matrix), we have to
> >> name it something else. Unless you see a better solution.
> >>
> >> So ... elementwise naming options:
> >>
> >> Matrix: mexp(m), msqrt(m). msignum(m)
> >> Vector: vexp(v), vsqrt(v)...
> >> DRM: dexp(drm), dsqrt(drm) .... ?
> >>
> >> Let me know what you think.
> >>
> >> (2) Another problem is that actually doing something like 1+exp(m) on
> >> Matrix or Vector types is pretty impractical since, unlike in R (that
> can
> >> count number of bound variables to an object) the semantics requires
> >> creating a clone of m for something like exp(m) to guarantee no side
> >> effects on m itself.
> >>
> >> That is, expression 1 + exp(m) for Matrix or vector types causes 2
> >> clone-copies of original argument.
> >>
> >> actually that's why i use in-place syntax for in-memory types quite
> >> often, something like
> >>
> >>    1+=: (x *= x)  instead of more naturally looking 1+ x * x.
> >>
> >>
> >> But unlike with simple elementwise operators (+=), there's no in-place
> >> modification syntax for a function. We could put an additional
> parameter,
> >> something like
> >>
> >> mexp(m, inPlace=true) but i don't like it too much.
> >>
> >> What i like much more is functional assignment (we already have
> >> assignment to a function (row, col, x) => Double but we can add
> elementwise
> >> function assignment) so that it really looks like
> >>
> >> m := exp
> >>
> >> That is pretty cool.
> >> Except there's a problem of optimality of assignment.
> >>
> >> There are functions here (e.g. abs, sqrt) that don't require full
> >> iteration but rather non-zero iteration only. by default notation
> >>
> >> m := func
> >>
> >> implies dense iteration.
> >>
> >> So what i suggest here is add a new syntax to do sparse iteration
> >> functional assignments:
> >>
> >> m ::= abs
> >>
> >> I actually like it (a lot) because it short and because it allows for
> >> more complex formulas in the same traversal, e.g. proverbial R's
> exp(m)+1
> >> in-place will look
> >>
> >> m := (1 + exp(_))
> >>
> >> So not terrible.
> >>
> >> What it lacks though is automatic determination of composite function
> >> need to apply to all vs. non-zeros only for in-memory types (for
> >> distributed types optimizer tracks this automatically).
> >>
> >> i.e.
> >>
> >> m := abs
> >>
> >> is not optimal (because abs doesn't affect 0s) and
> >>
> >> m ::= (abs(_) + 1)
> >>
> >> is probably also not what one wants (when we have composition of dense
> >> and sparse affecting functions, result is dense affecting function).
> >>
> >> So here thoughts are also welcome. (I am inclining to go with ::= and :=
> >> operators because functional assignment of complex expressions is the
> only
> >> well performing strategy i see here for in-memory types).
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message