hadoop-common-user mailing list archives

From Prashant Kommireddi <prash1...@gmail.com>
Subject Re: Passing data files via the distributed cache
Date Fri, 25 Nov 2011 12:14:28 GMT
I believe you want to ship the data to each node in your cluster before the
MR job begins, so that the mappers can read the files from their local
machine. The Hadoop tutorial on YDN (Yahoo! Developer Network) has some good
info on this.
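
A minimal sketch of the mapper-side half, assuming the cached file is a
tab-separated key/value text file. The class name SideDataLoader, the file
format, and the HDFS path in the comments are hypothetical. The Hadoop calls
appear only in comments because they need the Hadoop jars on the classpath;
they use the org.apache.hadoop.filecache.DistributedCache class of the 0.20
line, so check them against the release you are running.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Driver side (Hadoop 0.20-era API, hypothetical HDFS path):
//   DistributedCache.addCacheFile(new URI("/user/andy/lookup.txt"), conf);
//
// Mapper side, in configure(JobConf) or setup(Context):
//   Path[] cached = DistributedCache.getLocalCacheFiles(conf);
//   Map<String, String> lookup = SideDataLoader.load(cached[0].toString());
final class SideDataLoader {

    // Reads a node-local copy of the side data into memory once per task,
    // so that map() can do in-memory lookups instead of re-reading HDFS.
    static Map<String, String> load(String localPath) throws IOException {
        Map<String, String> lookup = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(
                Paths.get(localPath), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab <= 0) {
                    continue; // skip blank lines and lines without a key
                }
                lookup.put(line.substring(0, tab), line.substring(tab + 1));
            }
        }
        return lookup;
    }
}
```

Loading the file once per task rather than once per record is the point of
the exercise: the framework copies the file to each node, and each task then
pays only for a local read.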


-Prashant Kommireddi

On Fri, Nov 25, 2011 at 1:05 AM, Andy Doddington <andy@doddington.net> wrote:

> I have a series of mappers that I would like to be passed data using the
> distributed cache mechanism. At the moment, I am using HDFS to pass the
> data, but this seems wasteful to me, since they are all reading the same
> data. Is there a piece of example code that shows how data files can be
> placed in the cache and accessed by mappers?
> Thanks,
>        Andy Doddington
