hadoop-user mailing list archives

From Rahul Bhattacharjee <rahul.rec....@gmail.com>
Subject Re: Why big block size for HDFS.
Date Mon, 01 Apr 2013 03:03:36 GMT
Thanks a lot, John and Azurya.

I guessed it was about HDD optimization. In that case it might be good to
defragment the underlying disks during general maintenance downtime.


On Mon, Apr 1, 2013 at 12:28 AM, John Lilley <john.lilley@redpoint.net> wrote:

>
> *From:* Rahul Bhattacharjee [mailto:rahul.rec.dgp@gmail.com]
> *Subject:* Why big block size for HDFS.
>
> >In many places it is written that we store big blocks in HDFS to avoid a
> huge number of disk seeks, so that once we seek to the location, only the
> data transfer rate is predominant and there are no more seeks. I am not
> sure I have understood this correctly.
> >My question is: no matter what block size we decide, the data finally
> gets written to the computer's HDD, which is formatted with a block size
> in KBs. Also, while writing to the FS (not HDFS), it is not guaranteed
> that the blocks we write are contiguous, so there would be disk seeks
> anyway. The assumption of HDFS would only hold if the underlying FS
> guarantees to write the data in contiguous blocks.
> >Can someone explain a bit?
> >Thanks,
> >Rahul
>
> While there are no guarantees that disk storage will be contiguous, the OS
> will attempt to keep large files contiguous (and may even defragment them
> over time), and if all files are written using large blocks, this is more
> likely to be the case.  If storage is contiguous, you can write a complete
> track without seeking.  Track sizes vary, but a 1TB disk might have about
> 500KB/track.  Stepping to an adjacent track is also much cheaper than an
> average seek and, as you might expect, has been optimized in hardware to
> assist sequential I/O.  However, if you switch storage units, you will
> probably incur at least one full seek at the start of the block (since the
> head was probably somewhere else at the time).  The result is that, on
> average, writing sequential files is very fast (>100MB/sec on a typical
> SATA disk).  But I think the block-size overhead has more to do with
> finding where to read the next block from, assuming the data has been
> distributed evenly.
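The seek-amortization argument above can be put in back-of-the-envelope numbers. This is only a sketch; the 10 ms seek time and 100 MB/s sequential rate are illustrative assumptions, not figures from the thread, and `effective_rate` is a name made up here.

```python
# Model: reading one block costs one full seek plus a sequential transfer.
# Assumed (not measured) drive characteristics:
SEEK_S = 0.010       # one full seek, in seconds
RATE_MB_S = 100.0    # sustained sequential transfer, in MB/s

def effective_rate(block_mb):
    """Effective MB/s when each block of `block_mb` MB costs one seek."""
    transfer_s = block_mb / RATE_MB_S
    return block_mb / (SEEK_S + transfer_s)

for mb in (0.064, 4, 64, 128):
    print(f"{mb:>8} MB block -> {effective_rate(mb):6.1f} MB/s effective")
```

With these assumptions, 64 KB blocks deliver only about 6 MB/s because every block pays a seek, while 128 MB blocks come within about 1% of the raw sequential rate.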
>
> So consider connection overhead when the data is distributed.  I am no
> expert on the Hadoop internals, but I suspect that somewhere, a TCP
> connection is opened to transfer data.  Whether connection overhead is
> reduced by maintaining pools of open connections, I don’t know.  But let’s
> assume that there is *some* overhead for switching data transfer from
> machine “A”, which owns block “1000”, to machine “B”, which owns block
> “1001”.  The larger the block size, the less significant this overhead is
> relative to the sequential transfer rate.
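The same amortization logic applies to the hypothetical connection-switch cost. The 5 ms switch cost below is entirely made up for illustration (the email itself says the real overhead is unknown), as is the `overhead_fraction` helper.

```python
# Assumed fixed per-block cost of switching to a new datanode (TCP setup,
# handoff, etc.).  5 ms is a placeholder, not a measurement.
SWITCH_S = 0.005
RATE_MB_S = 100.0    # sustained sequential transfer, in MB/s

def overhead_fraction(block_mb):
    """Fraction of total per-block time spent on the switch itself."""
    transfer_s = block_mb / RATE_MB_S
    return SWITCH_S / (SWITCH_S + transfer_s)

for mb in (1, 16, 128):
    print(f"{mb:>4} MB block -> {overhead_fraction(mb):.2%} overhead")
```

Under these assumptions the switch cost is a third of the total time at 1 MB blocks but well under 1% at 128 MB, which is the point the paragraph is making.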
>
> In addition, MapR/YARN has an easier time of scheduling if there are fewer
> blocks.
> --john
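The scheduling point is easy to quantify: the number of blocks (and hence, roughly, the number of map tasks to place) for a file is just the file size divided by the block size, rounded up. The file and block sizes below are illustrative, and `num_blocks` is a name chosen here.

```python
import math

def num_blocks(file_bytes, block_bytes):
    """Number of HDFS blocks needed to hold a file of `file_bytes` bytes."""
    return math.ceil(file_bytes / block_bytes)

TB = 1 << 40
MB = 1 << 20
for block in (4 * MB, 64 * MB, 128 * MB):
    print(f"{block // MB:>4} MB blocks -> {num_blocks(TB, block):>7} "
          f"blocks for a 1 TB file")
```

A 1 TB file is 262,144 blocks at 4 MB but only 8,192 at 128 MB, so the scheduler and the NameNode (which tracks block metadata in memory) both have far less to manage.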
