spark-reviews mailing list archives

From hvanhovell <>
Subject [GitHub] spark pull request: [SPARK-13136][SQL] Create a dedicated Broadcas...
Date Thu, 04 Feb 2016 21:47:30 GMT
GitHub user hvanhovell opened a pull request:

    [SPARK-13136][SQL] Create a dedicated Broadcast exchange operator [WIP]

    Quite a few Spark SQL join operators broadcast one side of the join to all nodes. There
are a few problems with this:
    - This conflates broadcasting (a data exchange) with joining. Data exchanges should be
managed by a different operator.
    - All these nodes implement their own (duplicate) broadcasting logic.
    - Reusing frequently used indices is quite hard.
    This PR defines a ```Broadcast``` as a unique kind of ```Distribution```. To match this
distribution we implement a ```Broadcast``` operator and have ```EnsureRequirements``` plan
this operator.
    - [ ] Fix code generation.
    - [ ] Add other broadcasting operators.
    cc @rxin @davies 
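
The planning idea described above can be sketched as a small, self-contained toy model. This is not Spark's actual code: every name below (`BroadcastDistribution`, `BroadcastExchange`, `BroadcastJoin`, `EnsureRequirementsSketch`, `Scan`) is a hypothetical stand-in chosen for illustration. It only shows the general shape, assuming an operator declares the distribution it requires from each child and a planning rule inserts an exchange wherever a child does not already provide it.

```scala
// Toy model only. None of these types are Spark's real classes.

// A distribution describes how an operator's output is laid out across nodes.
sealed trait Distribution
case object UnspecifiedDistribution extends Distribution
case object BroadcastDistribution extends Distribution // full copy on every node

sealed trait Plan {
  def children: Seq[Plan]
  // What this operator needs from each child; by default, nothing special.
  def requiredChildDistribution: Seq[Distribution] =
    children.map(_ => UnspecifiedDistribution)
  // What this operator's output provides.
  def outputDistribution: Distribution = UnspecifiedDistribution
  def withChildren(newChildren: Seq[Plan]): Plan
}

case class Scan(table: String) extends Plan {
  def children: Seq[Plan] = Nil
  def withChildren(newChildren: Seq[Plan]): Plan = this
}

// The dedicated exchange operator: the only place that "broadcasts".
case class BroadcastExchange(child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  override def outputDistribution: Distribution = BroadcastDistribution
  def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
}

// The join no longer broadcasts anything itself; it only states a requirement.
case class BroadcastJoin(streamed: Plan, build: Plan) extends Plan {
  def children: Seq[Plan] = Seq(streamed, build)
  override def requiredChildDistribution: Seq[Distribution] =
    Seq(UnspecifiedDistribution, BroadcastDistribution)
  def withChildren(newChildren: Seq[Plan]): Plan =
    copy(streamed = newChildren(0), build = newChildren(1))
}

// Planning rule: walk the tree bottom-up and insert an exchange wherever a
// child does not already satisfy a required broadcast distribution.
object EnsureRequirementsSketch {
  def apply(plan: Plan): Plan = {
    val plannedChildren = plan.children.map(apply)
    val fixedChildren = plannedChildren.zip(plan.requiredChildDistribution).map {
      case (child, BroadcastDistribution)
          if child.outputDistribution != BroadcastDistribution =>
        BroadcastExchange(child)
      case (child, _) => child
    }
    plan.withChildren(fixedChildren)
  }
}
```

In this sketch, `EnsureRequirementsSketch(BroadcastJoin(Scan("fact"), Scan("dim")))` wraps only the build side in a `BroadcastExchange`, and applying the rule a second time leaves the plan unchanged because the exchange already provides the required distribution.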

You can merge this pull request into a Git repository by running:

    $ git pull SPARK-13136

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11083

commit aa7120e0cd8b40a9d0b3edf7c33f18a530d597bc
Author: Herman van Hovell <>
Date:   2016-02-04T19:13:42Z

    Initial Broadcast design

commit c2b7533f1fb30e9d93856adf4cef4107945670cc
Author: Herman van Hovell <>
Date:   2016-02-04T21:34:34Z

    Fix Exchange and initial code gen attempt.


