hadoop-hdfs-user mailing list archives

From Suresh Srinivas <sur...@hortonworks.com>
Subject Re: Is there an additional overhead when storing data in HDFS?
Date Wed, 21 Nov 2012 07:14:06 GMT
HDFS uses 4GB for the file data (2GB x replication factor 2), plus the
checksum data.

By default, 4 bytes of checksum are stored for every 512 bytes of data. In
this case that is an additional 32MB (4GB / 512 x 4 bytes).
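
The arithmetic can be sketched roughly as below. This is only an illustration: the function name hdfs_raw_usage and its parameters are made up for this example, the defaults are the 512-byte / 4-byte values mentioned above, and it ignores any small per-block metadata header and local filesystem overhead.

```python
def hdfs_raw_usage(file_bytes, replication=2,
                   bytes_per_checksum=512, checksum_bytes=4):
    """Estimate raw bytes used for data and for checksums (hypothetical helper)."""
    # Number of checksummed chunks, rounding up for a partial last chunk.
    chunks = -(-file_bytes // bytes_per_checksum)
    # Every replica stores a full copy of the data and of its checksums.
    data = file_bytes * replication
    checksums = chunks * checksum_bytes * replication
    return data, checksums


GiB = 1024 ** 3
MiB = 1024 ** 2

data, checksums = hdfs_raw_usage(2 * GiB, replication=2)
print(f"data: {data / GiB:.0f} GB, checksums: {checksums / MiB:.0f} MB")
```

With the defaults, the checksum overhead works out to 4/512, a little under 1% of the stored data.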

On Tue, Nov 20, 2012 at 11:00 PM, WangRamon <ramon_wang@hotmail.com> wrote:

> Hi All
> I'm wondering if there is an additional overhead when storing some data
> into HDFS? For example, I have a 2GB file and the replication factor of
> HDFS is 2. When the file is uploaded to HDFS, should HDFS use 4GB to store
> it or more than 4GB to store it? If it takes more than 4GB space, why?
> Thanks
> Ramon

