hadoop-common-user mailing list archives

From "Dhruba Borthakur" <dhr...@gmail.com>
Subject Re: dfs.block.size vs avg block size
Date Sun, 18 May 2008 07:30:25 GMT
There isn't a way to change the block size of an existing file. The
block size of a file can be specified only at the time of file
creation and cannot be changed later.

There isn't any wasted space in your system. If the block size is
128MB but you create an HDFS file of, say, 10MB, then that file will
contain one block and that block will occupy only 10MB on HDFS
storage. No space gets wasted.
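
To make the arithmetic concrete, here is a minimal sketch in plain Python.
It does not call any Hadoop API. The function names (block_layout,
average_block_size) are made up for illustration, and it assumes a file is
cut into full-size blocks followed by one shorter final block, counting a
single replica and ignoring metadata overhead.

```python
MB = 1024 * 1024


def block_layout(file_size, block_size):
    """Return the sizes (in bytes) of the blocks a file would be split into.

    Hypothetical helper: assumes full blocks followed by one partial block
    that holds only the remaining bytes.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block occupies only the bytes actually written to it.
        sizes.append(remainder)
    return sizes


def average_block_size(file_sizes, block_size):
    """Average block size over a set of files, similar in spirit to an fsck summary."""
    blocks = [b for size in file_sizes for b in block_layout(size, block_size)]
    return sum(blocks) / len(blocks) if blocks else 0.0


# A 10MB file with a 128MB block size: one block holding 10MB.
small = block_layout(10 * MB, 128 * MB)

# A 150MB file with a 64MB block size: two full blocks plus a 22MB block.
large = block_layout(150 * MB, 64 * MB)

# Bytes stored always equal bytes written, so the average block size can sit
# well below the configured block size without any space being lost.
stored = sum(small) + sum(large)
```

Under these assumptions, an average block size far below the configured
value just means that many files (or final blocks) are smaller than one
full block.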

hope this helps,

On Fri, May 16, 2008 at 4:42 PM, Otis Gospodnetic
<otis_gospodnetic@yahoo.com> wrote:
> Hello,
> I checked the ML archives and the Wiki, as well as the HDFS user guide, but could not
> find information about how to change block size of an existing HDFS.
> After running fsck I can see that my avg. block size is 12706144 B (cca 12MB), and that's
> a lot smaller than what I have configured: dfs.block.size=67108864 B
> Is the difference between the configured block size and the actual (avg) block size
> effectively wasted space?
> If so, is there a way to change the DFS block size and have Hadoop shrink all the existing
> I am OK with not running any jobs on the cluster for a day or two if I can do something
> to free up the wasted disk space.
> Thanks,
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
