hadoop-common-user mailing list archives

From Radhe Radhe <radhe.krishna.ra...@live.com>
Subject Streaming data access in HDFS: Design Feature
Date Wed, 05 Mar 2014 08:08:52 GMT
Hello All,

Can anyone please explain what is meant by "Streaming data access" in HDFS?

Data is usually copied to HDFS, where it is split into blocks that are spread across the DataNodes.
Say, for example, I have an input file of 10240 MB (10 GB) and a block size of 64 MB.
Then there will be 160 blocks.
These blocks will be distributed across the DataNodes.
Now the Mappers will read data from these DataNodes with data locality in mind (i.e.
blocks local to a DataNode will be read by the map tasks running on that DataNode).
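
To double-check the block count in my example, here is a minimal sketch in Python. The function name num_blocks is just something I made up for illustration, not part of any Hadoop API, and it only assumes that a file is cut into fixed-size blocks with a possibly smaller last block.

```python
def num_blocks(file_size_mb: int, block_size_mb: int) -> int:
    """Return how many fixed-size blocks are needed to hold a file.

    Hypothetical helper for illustration only; the last block may be
    partially filled, so the division is rounded up.
    """
    if block_size_mb <= 0:
        raise ValueError("block size must be positive")
    if file_size_mb < 0:
        raise ValueError("file size must not be negative")
    # Integer ceiling division: -(-a // b) rounds up without floats.
    return -(-file_size_mb // block_size_mb)


# The example from this message: a 10240 MB file with 64 MB blocks.
print(num_blocks(10240, 64))  # 160
```

A file that is not an exact multiple of the block size would simply need one extra, partially filled block.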

Can you please point out where "Streaming data access in HDFS" comes into the picture?
