spark-dev mailing list archives

From Gary Malouf <malouf.g...@gmail.com>
Subject Re: CoHadoop Papers
Date Tue, 26 Aug 2014 12:20:45 GMT
It appears that support for this kind of control over block placement is
being released in the next version of HDFS:
https://issues.apache.org/jira/browse/HDFS-2576


On Tue, Aug 26, 2014 at 7:43 AM, Gary Malouf <malouf.gary@gmail.com> wrote:

> One of my colleagues has been asking me why Spark/HDFS makes no attempt
> to co-locate related data blocks.  He pointed to this
> paper: http://www.vldb.org/pvldb/vol4/p575-eltabakh.pdf from 2011 on the
> CoHadoop research and the performance improvements it yielded for
> Map/Reduce jobs.
>
> Would leveraging these ideas for writing data from Spark make sense/be
> worthwhile?
