spark-user mailing list archives

From gtanguy <>
Subject Re: How does Spark handle RDD via HDFS ?
Date Thu, 10 Apr 2014 10:56:08 GMT
Yes, that helps me understand better how Spark works. But that was also what I
was afraid of: I think the network communication will take too much time for my

I will continue to look for a trick in order to not have network

I saw on the Hadoop website that: "To minimize global bandwidth consumption
and read latency, HDFS tries to satisfy a read request from a replica that
is closest to the reader. If there exists a replica on the same rack as the
reader node, then that replica is preferred to satisfy the read request"
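
To make the quoted policy concrete, here is a minimal sketch in Python of
"pick the replica closest to the reader". It is not HDFS code: the
"/rack/host" location strings and the function names distance and
closest_replica are hypothetical, and the distance rule (0 for the same host,
2 for the same rack, 4 otherwise) is only my assumption about how such a
rack/host tree could be scored.

```python
from typing import Sequence


def distance(a: str, b: str) -> int:
    """Count tree hops between two hypothetical "/rack/host" locations.

    Same host -> 0, same rack -> 2, different racks -> 4.
    """
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    # Length of the shared prefix (the common ancestor in the tree).
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    # Hops up from a to the common ancestor, then down to b.
    return (len(pa) - common) + (len(pb) - common)


def closest_replica(reader: str, replicas: Sequence[str]) -> str:
    """Return the replica location with the smallest distance to the reader."""
    return min(replicas, key=lambda r: distance(reader, r))


if __name__ == "__main__":
    replicas = ["/rack2/node5", "/rack1/node3", "/rack2/node6"]
    # The reader sits on rack1, so the rack1 replica is preferred.
    print(closest_replica("/rack1/node1", replicas))
```

With a reader on rack1, the sketch prefers the replica on rack1 over the two
on rack2, which is the behaviour the quote describes.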

Maybe if I manage to combine part of Spark with some of this, it
could work.

Thank you very much for your answer.

