hadoop-hdfs-user mailing list archives

From <marko.di...@nissatech.com>
Subject Smaller block size for more intense jobs
Date Tue, 12 May 2015 19:57:36 GMT
Hello,

I'm wondering whether I should set the block size to something smaller than 64MB,
given that my mappers need to do intensive computations.
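
To make this concrete, here is roughly what I had in mind for loading the data
with a smaller per-file block size (just a sketch; the path is made up, and I'm
assuming the Hadoop 1 client-side property name dfs.block.size):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteWithSmallBlocks {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Block size is a per-file, client-side setting in HDFS;
        // ask for 16 MB here instead of the 64 MB default.
        conf.setLong("dfs.block.size", 16L * 1024 * 1024);
        FileSystem fs = FileSystem.get(conf);
        Path dst = new Path("/user/marko/input/data.txt"); // made-up path
        FSDataOutputStream out = fs.create(dst);
        out.writeBytes("payload would be streamed here\n");
        out.close();
    }
}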

I know that it is generally better to have larger files, since lots of small
files add replication overhead and make the NameNode a weak point. I don't have
that much data, but the operations that need to be performed on it are intensive.

It looks like a smaller block size would be better (at least until there is
more data), so that multiple mappers get instantiated and can share the
computation; see the second sketch below.
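
Alternatively, instead of changing the block size on disk, would it be enough
to cap the split size per job, so the framework creates more map tasks over the
same blocks? Something like this sketch with the new (mapreduce) API, assuming
setMaxInputSplitSize / mapred.max.split.size is honored in 1.x:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ManyMappersJob {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "cpu-heavy-job");
        job.setJarByClass(ManyMappersJob.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Cap each input split at 8 MB: a single 64 MB block then
        // yields 8 splits, i.e. 8 map tasks sharing the computation.
        FileInputFormat.setMaxInputSplitSize(job, 8L * 1024 * 1024);
        // Identity mapper/reducer by default; real job classes go here.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}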

I'm currently asking about Hadoop 1, not YARN, but a heads-up about the same
problem under YARN would be appreciated too.

Thanks,
Marko

