hadoop-common-user mailing list archives

From Jun Young Kim <juneng...@gmail.com>
Subject what's the differences between file.blocksize and dfs.blocksize in a job.xml?
Date Wed, 09 Mar 2011 11:57:56 GMT
Hi,

I am wondering about the difference between file.blocksize and dfs.blocksize.

In hdfs-site.xml, I set:
<property>
<name>dfs.block.size</name>
<value>536870912</value>
<final>true</final>
</property>

In job.xml, I found:

file.blocksize    67108864
dfs.blocksize     536870912


DFS browser page:

Name            Type  Size      Replication  Block Size  Modification Time  Permission  Owner  Group
20110309160005  dir   -         -            -           2011-03-09 16:51   rwxr-xr-x   test   supergroup
all0307.ep      file  21.53 GB  2            64 MB       2011-03-09 15:58   rw-r--r--   test   supergroup
all0307.svc     file  21.53 GB  2            64 MB       2011-03-09 15:13   rw-r--r--   test   supergroup

Links from the listing:
20110309160005 <http://thadps06.scast.nhnsystem.com:50075/browseDirectory.jsp?dir=%2Fuser%2Firteam%2F20110309160005&namenodeInfoPort=50070&delegation=null>
all0307.ep <http://thadps06.scast.nhnsystem.com:50075/browseDirectory.jsp?dir=%2Fuser%2Firteam%2Fall0307.ep&namenodeInfoPort=50070&delegation=null>
all0307.svc <http://thadps06.scast.nhnsystem.com:50075/browseDirectory.jsp?dir=%2Fuser%2Firteam%2Fall0307.svc&namenodeInfoPort=50070&delegation=null>



The total input size of the job is about 44 GB (all0307.ep + all0307.svc).
In the map phase, the number of splits is 690, which means each map
task took a single 64 MB block as its split.

I expected the split count to be about 88, because the block size is
set to 512 MB and the total input size is about 44 GB.
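
For reference, the two split counts above can be reproduced with a rough calculation. This is only a sketch: it assumes one split per block-sized chunk of each file, takes both file sizes as the rounded 21.53 GB shown in the listing, and treats 1 GB as 1024 MB. The helper name estimated_splits is made up for illustration and is not part of Hadoop; the real split computation may differ in detail.

```python
import math

MB = 1024 * 1024
GB = 1024 * MB


def estimated_splits(file_sizes, split_size):
    """Estimate the split count, assuming one split per split_size chunk of each file.

    Files are counted separately because a split never spans two files.
    """
    return sum(math.ceil(size / split_size) for size in file_sizes)


# Rounded sizes as displayed in the DFS browser listing (assumption: exact sizes are unknown).
file_sizes = [int(21.53 * GB), int(21.53 * GB)]

print(estimated_splits(file_sizes, 64 * MB))   # 690, the observed count
print(estimated_splits(file_sizes, 512 * MB))  # 88, the expected count
```

The observed 690 matches a 64 MB chunk size, which is the block size the listing reports for both files, rather than the 512 MB configured in hdfs-site.xml.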

How could I get the result I want?

Thanks.

-- 
Junyoung Kim (juneng603@gmail.com)

