spark-dev mailing list archives

From Reynold Xin
Subject Re: Network Communication - Akka or more?
Date Wed, 17 Sep 2014 07:06:16 GMT
I'm not familiar with InfiniBand, but I can chime in on the Spark part.

There are two kinds of communications in Spark: control plane and data
plane.  Task scheduling / dispatching is control, whereas fetching a block
(e.g. shuffle) is data.

On Tue, Sep 16, 2014 at 4:22 PM, Trident wrote:

> Thank you for reading this mail.
> I'm trying to change the underlying network connection system of Spark to
> support InfiniBand.
> 1. I doubt whether ConnectionManager and netty is under construction. It
> seems that they are not usually used.

They are used for data plane communication. Broadcast, shuffle, all use
them.

> 2. How much connection payload is carried by akka?

Akka is mainly responsible for control, i.e. dispatching tasks, reporting a
block being put into memory to the driver etc.
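
As a rough sketch of that split (plain Python, not Spark code; the message
names and the lookup table are made up for illustration and do not correspond
to any Spark API):

```python
from enum import Enum


class Plane(Enum):
    CONTROL = "control"  # small coordination messages (Akka, per this thread)
    DATA = "data"        # bulk block transfers (ConnectionManager / netty, per this thread)


# Hypothetical message kinds, named after the examples given in this thread.
MESSAGE_PLANE = {
    "dispatch_task": Plane.CONTROL,
    "report_block_in_memory": Plane.CONTROL,
    "fetch_shuffle_block": Plane.DATA,
    "fetch_broadcast_block": Plane.DATA,
}


def plane_for(message_kind: str) -> Plane:
    """Return the plane that would carry the given (hypothetical) message kind."""
    return MESSAGE_PLANE[message_kind]


def kinds_on(plane: Plane) -> list:
    """List the (hypothetical) message kinds assigned to one plane, sorted by name."""
    return sorted(kind for kind, p in MESSAGE_PLANE.items() if p is plane)
```

Calling kinds_on(Plane.DATA) lists the block-fetch kinds, which is the traffic
that goes through ConnectionManager / netty rather than Akka.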

> 3. When running ./bin/run-example SparkPi   I noticed that the jar file
> has been sent from server to client. It is scary because the jar is big. Is
> it common?

How are you going to distribute the jar file if you don't send it? The
workers need the bytecode for the classes you are going to execute.
