hadoop-mapreduce-user mailing list archives

From SP <sajid...@gmail.com>
Subject BLOCK and Split size question
Date Fri, 20 Feb 2015 23:59:50 GMT
Hello everyone,

I have a couple of questions. Can anyone please point me in the right
direction?

What exactly happens when I copy a 1 TB file to a Hadoop cluster using the
copyFromLocal command?

1) What will the split size be? Will it be the same as the block size?

2) What is a block, and what is a split?


If we have a 100 MB file and a block size of 64 MB, it will be divided into
2 blocks of 64 MB and 36 MB. The second block still has 28 MB of space
left. What will happen to that free space? Will the cluster have unequal
block sizes, or will that space be occupied by another file?
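
To make the numbers in this example concrete, here is a small Python sketch of the arithmetic. It only models sizes and does not talk to HDFS; the helper name `block_sizes` is hypothetical. As far as I understand, the block size is an upper bound, so the last block of a file only takes up as much disk space as the data it holds.

```python
MB = 1024 * 1024

def block_sizes(file_size, block_size):
    """Return the sizes (in bytes) of the blocks a file would be cut into.

    Hypothetical helper for illustration only; it models the arithmetic,
    not the actual HDFS write path.
    """
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block holds only the leftover bytes.
        sizes.append(remainder)
    return sizes

sizes = block_sizes(100 * MB, 64 * MB)
print([s // MB for s in sizes])          # [64, 36]
print((64 * MB - sizes[-1]) // MB)       # 28 (MB not used by the last block)

# The 1 TB file from the first question, assuming 1 TB = 1024**4 bytes
# and the same 64 MB block size.
print(len(block_sizes(1024**4, 64 * MB)))  # 16384
```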


3) Let's say a 64 MB block is on node A and replicated on 2 other nodes
(B, C), and the input split size for the MapReduce program is 64 MB. Will
this split have a location only for node A, or will it have locations for
all three nodes A, B, and C?
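
A rough Python model of this scenario, under the assumption that a file-based split takes its locations from the hosts of the block containing the split's start offset (which, as far as I know, is what `FileInputFormat` does). The function name `hosts_for_split` and the tuple layout are hypothetical and only for illustration.

```python
MB = 1024 * 1024

def hosts_for_split(split_start, blocks):
    """Return the hosts of the block that contains split_start.

    blocks is a list of (offset, length, hosts) tuples; this layout is an
    assumption made for this sketch, not a Hadoop data structure.
    """
    for offset, length, hosts in blocks:
        if offset <= split_start < offset + length:
            return hosts
    raise ValueError("split start is past the end of the file")

# One 64 MB block stored on node A and replicated to nodes B and C.
blocks = [(0, 64 * MB, ["A", "B", "C"])]
print(hosts_for_split(0, blocks))  # ['A', 'B', 'C']
```

Under that assumption, the split would carry all three replica hosts, so the scheduler could place the map task on any of them.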


4) How is it handled if the input split size is greater or smaller than
the block size?
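
For reference, a hedged Python sketch of how the split size could relate to the block size. It assumes the rule used by `FileInputFormat` in the new MapReduce API, `max(minSize, min(maxSize, blockSize))`, with the minimum and maximum coming from `mapreduce.input.fileinputformat.split.minsize` and `mapreduce.input.fileinputformat.split.maxsize`, and a slop factor of 1.1 for the last split. The Python function names are hypothetical, and the sketch ignores block locations entirely.

```python
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # assumed: the last split may be up to 10% larger

def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Model of the split-size rule: max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))

def file_splits(file_size, split_size):
    """Return (offset, length) pairs for the splits of a single file."""
    splits = []
    remaining = file_size
    while remaining / split_size > SPLIT_SLOP:
        splits.append((file_size - remaining, split_size))
        remaining -= split_size
    if remaining:
        # Whatever is left becomes the final, possibly shorter, split.
        splits.append((file_size - remaining, remaining))
    return splits

block = 64 * MB
size = 100 * MB

# Defaults: split size equals the block size, so 2 splits.
print(len(file_splits(size, compute_split_size(block))))                      # 2
# Minimum raised to 128 MB: one split spanning both blocks.
print(len(file_splits(size, compute_split_size(block, min_size=128 * MB))))   # 1
# Maximum lowered to 32 MB: several splits per block.
print(len(file_splits(size, compute_split_size(block, max_size=32 * MB))))    # 4
```

In this model a split larger than a block simply covers data from more than one block, and a split smaller than a block means several map tasks read from the same block.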


Can anyone please help?

Thanks,

SP
