hadoop-user mailing list archives

From WangRamon <ramon_w...@hotmail.com>
Subject RE: Is there an additional overhead when storing data in HDFS?
Date Wed, 21 Nov 2012 07:21:17 GMT
Thanks. Besides the checksum data, is there anything else? Data in the name node?
 Date: Tue, 20 Nov 2012 23:14:06 -0800
Subject: Re: Is there an additional overhead when storing data in HDFS?
From: suresh@hortonworks.com
To: user@hadoop.apache.org

HDFS uses 4GB for the file, plus checksum data.
By default, for every 512 bytes of data, 4 bytes of checksum are stored. In this case that is
an additional 32MB of data.
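
As a rough sanity check on those numbers (a minimal sketch, assuming the default io.bytes.per.checksum of 512 and the replication factor of 2 from Ramon's example; the class and variable names are illustrative, not from the thread):

    // Back-of-the-envelope estimate of the HDFS on-disk footprint
    public class HdfsOverheadEstimate {
        public static void main(String[] args) {
            long fileSize = 2L * 1024 * 1024 * 1024;  // 2 GB logical file
            int replication = 2;                      // dfs.replication
            int bytesPerChecksum = 512;               // io.bytes.per.checksum (default)
            int checksumSize = 4;                     // 4-byte CRC per 512-byte chunk

            long replicatedData = fileSize * replication;                            // 4 GB of block data
            long checksumBytes = (replicatedData / bytesPerChecksum) * checksumSize; // 32 MB of checksums

            System.out.printf("block data: %d bytes, checksums: %d bytes (~%d MB)%n",
                    replicatedData, checksumBytes, checksumBytes >> 20);
        }
    }

This reproduces the figures above: about 4GB of block data plus roughly 32MB of checksum files across the two replicas.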

On Tue, Nov 20, 2012 at 11:00 PM, WangRamon <ramon_wang@hotmail.com> wrote:




Hi All
 
I'm wondering if there is an additional overhead when storing some data in HDFS? For example,
I have a 2GB file and the replication factor of HDFS is 2. When the file is uploaded to HDFS, should
HDFS use 4GB to store it, or more than 4GB? If it takes more than 4GB of space, why?

 
Thanks
Ramon


-- 
http://hortonworks.com/download/