hadoop-common-user mailing list archives

From Alberto Cordioli <cordioli.albe...@gmail.com>
Subject DistributedCache - why not read directly from HDFS?
Date Sat, 23 Mar 2013 14:53:22 GMT
Hi all,

I was not able to find an answer to the following question. If it
has already been answered, please point me to the right thread.

What are the actual differences between reading a file from HDFS
inside a mapper and using DistributedCache?

I saw that with DistributedCache you can give an HDFS path and the
task nodes will get the data on their local file system. But what
advantages does that have over a simple HDFS read with
FileSystem.open(), which returns an FSDataInputStream?
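
To make the comparison concrete, here is a rough Java sketch of how I
would count the fetches from HDFS in each case. It is only a model, not
Hadoop code: the class and method names are hypothetical, and it
assumes that DistributedCache copies a file to each task node once per
job, while a direct open() in every mapper reads from HDFS once per
task.

```java
public class SideFileFetchModel {

    // Direct read: every map task opens the side file on HDFS itself,
    // so the number of HDFS reads grows with the number of tasks.
    static long fetchesWithDirectRead(long mapTasks) {
        return mapTasks;
    }

    // DistributedCache (assumed behaviour): the file is copied once to
    // each node that runs tasks of the job; tasks then read the local copy.
    static long fetchesWithDistributedCache(long mapTasks, long nodes) {
        // A job cannot occupy more nodes than it has tasks.
        return Math.min(mapTasks, nodes);
    }

    public static void main(String[] args) {
        long mapTasks = 1000; // hypothetical job size
        long nodes = 20;      // hypothetical cluster size
        System.out.println("direct HDFS reads:   " + fetchesWithDirectRead(mapTasks));
        System.out.println("cache localizations: " + fetchesWithDistributedCache(mapTasks, nodes));
    }
}
```

If that assumption holds, the gap should only matter when a job runs
many more map tasks than there are nodes. The model also ignores that a
direct read may be served by a replica on the same node.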

Thank you very much,

Alberto Cordioli
