[ https://issues.apache.org/jira/browse/SYSTEMML2169?page=com.atlassian.jira.plugin.system.issuetabpanels:alltabpanel
]
Matthias Boehm updated SYSTEMML2169:

Description:
The introduction of nary cbind and rbind in SYSTEMML1986 added support for operations like
{{E = cbind(A,B,C,D)}}, which concatenates the matrices A, B, C, and D columnwise without the
intermediates required by traditional binary cbind operations ({{cbind(cbind(cbind(A,B),C),D)}}).
SystemML also provides rewrites to automatically collapse chains of cbind or rbind operations
into their nary counterparts.
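As a rough illustration of this equivalence, the following sketch uses plain Python lists of rows in place of matrices. The helper names {{cbind2}} and {{cbind_nary}} are hypothetical and are not SystemML APIs; the point is only that the nested binary form materializes an intermediate per inner call, whereas the nary form builds each output row in a single pass.

```python
from itertools import chain

def cbind2(a, b):
    """Binary cbind: concatenate two row-major matrices columnwise."""
    if len(a) != len(b):
        raise ValueError("row counts must match")
    return [ra + rb for ra, rb in zip(a, b)]

def cbind_nary(*mats):
    """Nary cbind: concatenate any number of matrices in one pass."""
    if len({len(m) for m in mats}) > 1:
        raise ValueError("row counts must match")
    # zip(*mats) yields, per row index, one row from each input matrix
    return [list(chain.from_iterable(rows)) for rows in zip(*mats)]

A = [[1], [2]]
B = [[3, 4], [5, 6]]
C = [[7], [8]]
D = [[9], [10]]

# Nested binary form: the two inner calls each create an intermediate matrix
nested = cbind2(cbind2(cbind2(A, B), C), D)
# Nary form: same result without intermediates
E = cbind_nary(A, B, C, D)
```

Both forms produce the same matrix, which is what allows a chain of binary calls to be collapsed into one nary call.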
However, for distributed Spark operations, the binary cbind is still much better optimized
than the nary operation, which only provides a general-case operation based on repartition
joins.
This task aims to address this by extending {{BuiltinNarySPInstruction}} at runtime level
(i.e., within {{processInstruction}}). Given the unlimited number of inputs, this runtime
approach seems more appropriate than dedicated physical operators at compiler level. In detail,
we need to evaluate whether a subset of inputs fits into the broadcast budget and, if so, provide
an alternative code path for nary cbind/rbind operations with broadcast joins.
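One possible way to make the budget decision is sketched below. This is not the SystemML implementation; it assumes (none of this is stated in the issue) that per-input memory estimates in bytes are available, that the largest input always stays distributed, and that a smallest-first greedy choice is acceptable. The function name {{select_broadcast_inputs}} and the budget value are hypothetical.

```python
def select_broadcast_inputs(sizes, budget):
    """Split input indices into (broadcast, distributed) under a byte budget.

    Assumptions: sizes are estimated in-memory bytes, the largest input
    stays distributed, and inputs are picked greedily smallest-first.
    """
    if not sizes:
        return [], []
    # Keep the largest input distributed so there is always an RDD side
    anchor = max(range(len(sizes)), key=lambda i: sizes[i])
    candidates = sorted((i for i in range(len(sizes)) if i != anchor),
                        key=lambda i: sizes[i])
    broadcast, used = [], 0
    for i in candidates:
        if used + sizes[i] > budget:
            break  # all remaining candidates are at least this large
        broadcast.append(i)
        used += sizes[i]
    # Original positions are preserved so the column/row order stays intact
    distributed = [i for i in range(len(sizes)) if i not in broadcast]
    return sorted(broadcast), distributed

# Hypothetical size estimates for the inputs A, B, C, D of cbind(A,B,C,D)
sizes = [8_000_000_000, 40_000_000, 2_000_000_000, 10_000_000]
bc, dist = select_broadcast_inputs(sizes, budget=100_000_000)
```

In this example B and D would be broadcast, while A and C would still go through the general repartition-join path. If no input fits into the budget, the broadcast list is empty and the existing code path applies unchanged.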
Note that distributed codegen operations have the similar characteristic of unlimited inputs
and already leverage broadcast variables when possible. Hence, we can probably use an approach
similar to the one in {{SpoofSPInstruction}}.
was:
The introduction of nary cbind and rbind in SYSTEMML1986 added support for operations like
{{E = cbind(A,B,C,D)}}, which concatenates the matrices A, B, C, and D columnwise without the
intermediates required by traditional binary cbind operations ({{cbind(cbind(cbind(A,B),C),D)}}).
SystemML also provides rewrites to automatically collapse chains of cbind or rbind operations
into their nary counterparts.
However, for distributed Spark operations, the binary cbind is still much better optimized
than the nary operation, which only provides a general-case operation based on repartition
joins.
This task aims to address this by extending {{BuiltinNarySPInstruction}} at runtime level
(i.e., within {{processInstruction}}). Given the unlimited number of inputs, this runtime
approach seems more appropriate than dedicated physical operations at compiler level. In detail,
we need to evaluate whether a subset of inputs fits into the broadcast budget and, if so, provide
an alternative code path for nary cbind/rbind operations with broadcast joins.
Note that distributed codegen operations have the similar characteristic of unlimited inputs
and already leverage broadcast variables when possible. Hence, we can probably use an approach
similar to the one in {{SpoofSPInstruction}}.
> Spark nary cbind/rbind with broadcasts
> 
>
> Key: SYSTEMML2169
> URL: https://issues.apache.org/jira/browse/SYSTEMML2169
> Project: SystemML
> Issue Type: Task
> Reporter: Matthias Boehm
> Priority: Major
> Labels: beginner

