spark-issues mailing list archives

From "Mridul Muralidharan (JIRA)" <>
Subject [jira] [Commented] (SPARK-3019) Pluggable block transfer (data plane communication) interface
Date Sun, 17 Aug 2014 08:13:19 GMT


Mridul Muralidharan commented on SPARK-3019:

I have yet to go through the proposal in detail, so I will defer comments on it for later; but
to get some clarity on the discussion around Sandy's point:

- Until we read from all mappers, the shuffle can't actually start.
Even if a single mapper's output is small enough to fit into memory (which it need not be),
num_mappers * avg_size_of_map_output_per_reducer can be larger than available memory by orders
of magnitude. (This is fairly common for us, for example.)
This was the reason we actually worked on the 2G fix, by the way: individual blocks in a mapper,
and also the data per reducer for a mapper, were larger than 2G :-)
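The point above is simple arithmetic; a back-of-the-envelope sketch (the numbers are hypothetical, not from this thread) shows how quickly the aggregate exceeds an executor's memory if all map outputs for a reducer had to be resident at once:

```java
// Hypothetical sketch: if every map output for a reducer had to be
// memory-resident before the reduce starts, the footprint is
// num_mappers * avg_map_output_per_reducer.
public class ShuffleFootprint {
    public static void main(String[] args) {
        long numMappers = 10_000;
        long avgMapOutputPerReducerBytes = 50L * 1024 * 1024;  // ~50 MB each
        long totalBytes = numMappers * avgMapOutputPerReducerBytes;
        long executorHeapBytes = 8L * 1024 * 1024 * 1024;      // say, an 8 GB heap
        System.out.printf("total=%d GB, %.0fx the executor heap%n",
            totalBytes >> 30, (double) totalBytes / executorHeapBytes);
    }
}
```

With these illustrative numbers, a single reducer's incoming data is roughly 60x the heap, so it cannot all be held in memory regardless of individual block sizes.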

- While reading data off the network, we cannot assess whether the read data will fit into
memory (since there are other parallel read requests pending for this and other cores
in the same executor).
So spooling intermediate data to disk becomes necessary on both the mapper side (which it
already does) and the reducer side (which we don't do currently: we assume a block can fit
into reducer memory as part of doing a remote fetch). This becomes more relevant when we want
to target bigger blocks of data and tackle skew in the data (for shuffle).
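Reducer-side spooling of the kind described above could look roughly like the following sketch (all names are invented for illustration, not Spark's actual API): fetched bytes stay in memory while they fit under a budget, and spill to a temp file once the budget is exceeded.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical reducer-side spooler (names are illustrative, not Spark's
// API): keep fetched blocks in memory under a budget, spill the rest.
public class FetchSpooler {
    private final long memoryBudgetBytes;
    private long usedBytes = 0;

    public FetchSpooler(long memoryBudgetBytes) {
        this.memoryBudgetBytes = memoryBudgetBytes;
    }

    /** Returns the in-memory block if it fits, otherwise a path to a spill file. */
    public Object store(byte[] fetchedBlock) throws IOException {
        if (usedBytes + fetchedBlock.length <= memoryBudgetBytes) {
            usedBytes += fetchedBlock.length;
            return fetchedBlock;                      // keep in memory
        }
        Path spill = Files.createTempFile("shuffle-spill-", ".bin");
        Files.write(spill, fetchedBlock);             // spool to disk
        return spill;
    }
}
```

The key design point is that the decision is made per fetched block against a shared budget, so large or skewed blocks degrade to disk I/O instead of causing OOM.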

> Pluggable block transfer (data plane communication) interface
> -------------------------------------------------------------
>                 Key: SPARK-3019
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>            Reporter: Reynold Xin
>            Assignee: Reynold Xin
>         Attachments: PluggableBlockTransferServiceProposalforSpark - draft 1.pdf
> The attached design doc proposes a standard interface for block transferring, which will
make future engineering of this functionality easier, allowing the Spark community to provide
alternative implementations.
> Block transferring is a critical function in Spark. All of the following depend on it:
> * shuffle
> * torrent broadcast
> * block replication in BlockManager
> * remote block reads for tasks scheduled without locality

This message was sent by Atlassian JIRA

