systemml-dev mailing list archives

From Janardhan <>
Subject Re: SYSTEMML-447
Date Fri, 18 May 2018 11:47:35 GMT
Thanks for explaining this. I will proceed accordingly.

- Janardhan

On Fri, May 11, 2018 at 10:14 AM, Matthias Boehm <> wrote:

> This particular JIRA is only partially related. Niketan and Nakul
> worked out the details - the only reason I show up as the reporter is
> that, if I remember correctly, we split a larger scoped JIRA for
> low-level optimizations (GPU, codegen, compression) into individual
> JIRAs and created the detailed tasks.
>
> Overall, I believe that sparse GPU operations would be very valuable,
> especially in the context of NLP, graphs, and structured data with
> categorical features (which often become very sparse after dummy
> coding) because in these ultra-sparse scenarios dense operations cause
> unnecessary overheads of orders of magnitude (proportional to the
> sparsity). However, creating efficient sparse GPU kernels is
> challenging due to irregularities (e.g., sparsity skew). Compared to
> CPU operations, there might still be benefit depending on the data
> location of inputs/outputs, as well as higher memory bandwidth.
>
> Even in the face of extending the codegen framework for GPUs (which is
> still on the roadmap for this year), we would still need dense/sparse
> kernels for the individual operations because we want to apply codegen
> only if we can benefit from fusion. Right now we call existing
> libraries such as cuBLAS and cuDNN and have dense kernels for a subset
> of operations such as unary and binary, and unary aggregates.
>
> Regarding ramping up on the GPU backend, maybe it's a good idea to
> start with missing dense operations. I'm thinking of statistical
> functions (e.g., covariance, moment), parameterized builtin functions
> (e.g., grouped aggregates), missing unary and binary operations (e.g.,
> bitwise), missing reorg operations (e.g., reshape, sort - there should
> be a library for the latter), missing unary, binary, and ternary
> aggregates, missing nary operations (e.g., nary cbind/rbind), etc.
> Adding these remaining operations would also help a lot. However, if
> you're more interested in contributing to the development of sparse
> kernels, maybe you could start with one or two dense operations, get
> comfortable, and then move on to sparse operations. Apart from the
> kernels, seamless support for sparse operations would also require
> some integration work on how we pass data, maintain nnz, preallocate
> sparse outputs, etc.
>
> Regards,
> Matthias
>
> On Thu, May 10, 2018 at 8:47 PM, Janardhan <> wrote:
> > Hi Matthias,
> >
> > Is this related to the long-term plan for GPU codegen?
> >
> > Thank you,
> > Janardhan
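
To make the overhead argument in the quoted message concrete, here is a minimal CPU-side sketch in plain Python. It is not SystemML code and not a GPU kernel. It multiplies a dummy-coded (one nonzero per row) matrix by a vector twice, once over the dense layout and once over a CSR layout, and counts the multiply-adds in each case. The function names, the matrix shape, and the CSR field names (values, col_idx, row_ptr) are assumptions chosen for illustration.

```python
def dense_matvec(rows, x):
    """Dense row-major matrix-vector product; also counts multiply-adds."""
    ops = 0
    y = []
    for row in rows:
        acc = 0.0
        for a, b in zip(row, x):
            acc += a * b
            ops += 1
        y.append(acc)
    return y, ops


def to_csr(rows):
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in rows:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # running nnz after each row
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """CSR matrix-vector product; touches only the stored nonzeros."""
    ops = 0
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
            ops += 1
        y.append(acc)
    return y, ops


# Hypothetical dummy-coded feature matrix: exactly one nonzero per row.
n_rows, n_cols = 200, 500
rows = [[1.0 if j == (7 * i) % n_cols else 0.0 for j in range(n_cols)]
        for i in range(n_rows)]
x = [float(j) for j in range(n_cols)]

y_dense, dense_ops = dense_matvec(rows, x)
values, col_idx, row_ptr = to_csr(rows)
y_sparse, sparse_ops = csr_matvec(values, col_idx, row_ptr, x)

print(dense_ops, sparse_ops, dense_ops // sparse_ops)
```

In this toy case the dense loop does n_rows * n_cols multiply-adds while the CSR loop does one per nonzero, so the gap grows as the fraction of nonzeros shrinks. A real GPU kernel would additionally have to deal with the irregularities mentioned above, such as rows with very different nonzero counts.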

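The integration points named in the quoted message (maintaining nnz and preallocating sparse outputs) can be illustrated with a common two-pass pattern: first count the nonzeros of each output row, then allocate the output arrays at their exact size and fill them. The sketch below applies this to the elementwise sum of two CSR matrices in plain Python. It is an assumption-laden illustration, not the way SystemML implements it; all names are hypothetical, and column indices within each row are assumed to be sorted.

```python
def merge_row(cols_a, vals_a, cols_b, vals_b):
    """Yield (col, value) pairs for the sum of two sparse rows with sorted columns."""
    i = j = 0
    while i < len(cols_a) or j < len(cols_b):
        if j == len(cols_b) or (i < len(cols_a) and cols_a[i] < cols_b[j]):
            yield cols_a[i], vals_a[i]
            i += 1
        elif i == len(cols_a) or cols_b[j] < cols_a[i]:
            yield cols_b[j], vals_b[j]
            j += 1
        else:  # same column in both rows
            yield cols_a[i], vals_a[i] + vals_b[j]
            i += 1
            j += 1


def csr_add(a, b):
    """Add two CSR matrices given as (values, col_idx, row_ptr) tuples."""
    vals_a, cols_a, ptr_a = a
    vals_b, cols_b, ptr_b = b
    n_rows = len(ptr_a) - 1

    # Pass 1: count structural nonzeros per output row (union of column sets)
    # and build row_ptr as a running sum.
    row_ptr = [0] * (n_rows + 1)
    for r in range(n_rows):
        ca = set(cols_a[ptr_a[r]:ptr_a[r + 1]])
        cb = set(cols_b[ptr_b[r]:ptr_b[r + 1]])
        row_ptr[r + 1] = row_ptr[r] + len(ca | cb)

    # Preallocate the output at its exact size. Entries that cancel to zero
    # are still stored, so nnz here is a structural count.
    nnz = row_ptr[-1]
    values = [0.0] * nnz
    col_idx = [0] * nnz

    # Pass 2: fill each row into its preallocated slice.
    for r in range(n_rows):
        k = row_ptr[r]
        sa, ea = ptr_a[r], ptr_a[r + 1]
        sb, eb = ptr_b[r], ptr_b[r + 1]
        for c, v in merge_row(cols_a[sa:ea], vals_a[sa:ea],
                              cols_b[sb:eb], vals_b[sb:eb]):
            col_idx[k] = c
            values[k] = v
            k += 1
    return values, col_idx, row_ptr


# A = [[1, 0, 2], [0, 0, 0], [0, 3, 0]]
csr_a = ([1.0, 2.0, 3.0], [0, 2, 1], [0, 2, 2, 3])
# B = [[0, 4, 5], [6, 0, 0], [0, 7, 0]]
csr_b = ([4.0, 5.0, 6.0, 7.0], [1, 2, 0, 1], [0, 2, 3, 4])

values_c, col_idx_c, row_ptr_c = csr_add(csr_a, csr_b)
print(values_c, col_idx_c, row_ptr_c)
```

Because each output row writes only into its own slice of the preallocated arrays, the fill pass has no cross-row dependencies, which is the property that makes this pattern a natural fit for parallel backends.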