Hello Folks,
I have recently been trying to implement a tree reduction algorithm in
Spark but could not find suitable parallel operations. Assume I have a
general tree like the following:
I have to do the following:
1) Do some computation at each leaf node to get an array of doubles. (This
can be precomputed.)
2) For each non-leaf node, starting with the root node, compute the
element-wise sum of these arrays over all child nodes. So to get the array
for node B, I need to get the array for E, which is the sum of G + H.
////////////////////// Start Snippet
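// values is a var so that a non-leaf node can store its computed sum
case class Node(name: String, children: Array[Node], var values: Array[Double]) {
  // a node with no children is a leaf
  def isLeafNode: Boolean = children.isEmpty
}
// read in the tree here
def getSumOfChildren(node: Node): Array[Double] = {
  if (node.isLeafNode) {
    return node.values
  }
  // assumes a non-leaf node starts with a zero-filled array of the same
  // length as the leaf arrays; zip truncates to the shorter array otherwise
  for (child <- node.children) {
    // can use an accumulator here
    node.values = node.values.zip(getSumOfChildren(child)).map { case (a, b) => a + b }
  }
  node.values
}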
////////////////////////// End Snippet
Any pointers on how this can be done in parallel, so that it uses all
cores, will be greatly appreciated.
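To make the question concrete, here is a minimal sketch of the kind of
parallel recursion I mean. It is only an illustration under assumptions: it
uses plain scala.concurrent Futures on a single JVM rather than any Spark
operation, and the names TreeReduceSketch, TreeNode, addArrays, sumSubtree
and example, as well as the leaf values for G and H, are hypothetical.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TreeReduceSketch {
  // Hypothetical immutable variant of Node; non-leaf nodes carry no values.
  final case class TreeNode(name: String,
                            children: Vector[TreeNode],
                            values: Vector[Double])

  // Element-wise sum of two arrays (assumed to have equal length).
  def addArrays(a: Vector[Double], b: Vector[Double]): Vector[Double] =
    a.zip(b).map { case (x, y) => x + y }

  // Leaves return their precomputed array; every other node combines the
  // results of its children. The combine step of each subtree is scheduled
  // on the global thread pool, so independent subtrees can run on
  // different cores.
  def sumSubtree(node: TreeNode): Future[Vector[Double]] =
    if (node.children.isEmpty) Future.successful(node.values)
    else
      Future
        .traverse(node.children)(child => sumSubtree(child))
        .map(childSums => childSums.reduce(addArrays))

  // The B -> E -> (G, H) part of the tree, with made-up leaf values.
  def example(): Vector[Double] = {
    val g = TreeNode("G", Vector.empty, Vector(1.0, 2.0))
    val h = TreeNode("H", Vector.empty, Vector(3.0, 4.0))
    val e = TreeNode("E", Vector(g, h), Vector.empty)
    val b = TreeNode("B", Vector(e), Vector.empty)
    Await.result(sumSubtree(b), 10.seconds) // Vector(4.0, 6.0)
  }
}
```

Calling TreeReduceSketch.example() blocks only once, at the root. This
covers a single machine only; what I am missing is the equivalent built
from Spark operations.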
Thanks,
Boromir.
