hadoop-hdfs-dev mailing list archives

From haosdent <haosd...@gmail.com>
Subject Re: hsync is too slower than hflush
Date Mon, 26 Aug 2013 02:44:07 GMT
In fact, I only write 4 KB per hsync. The datanode writes both a checksum file and a data file
when I hsync data to it. Each of those writes takes nearly 25 ms, so one hsync call takes
nearly 50 ms. hflush, by contrast, is very fast: writing the checksum and writing the data take
about 1 ms each. If an hsync costs 50 ms, what is the point of using it? Or is my test method wrong?
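
A rough local analogue of this test, for anyone who wants to see the same gap without a cluster. This is only a sketch under stated assumptions: it does not use the HDFS client at all. It writes 4 KB chunks to a temporary local file through `java.nio`, once leaving the data in the OS page cache (loosely comparable to hflush) and once calling `FileChannel.force(true)` after every chunk (loosely comparable to the fsync that hsync triggers on the datanode). The class name `SyncTiming`, the method name and the chunk count are made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyncTiming {

    /**
     * Writes `count` chunks of `chunkBytes` to a temporary file and returns
     * the average time per chunk in nanoseconds. When `forceToDisk` is true,
     * every chunk is followed by force(true), which blocks until the storage
     * device reports the data and metadata as written.
     */
    static long averageWriteNanos(boolean forceToDisk, int chunkBytes, int count)
            throws IOException {
        Path file = Files.createTempFile("sync-timing", ".bin");
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ByteBuffer chunk = ByteBuffer.allocate(chunkBytes);
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                chunk.clear(); // rewind so the full chunk is written again
                while (chunk.hasRemaining()) {
                    channel.write(chunk);
                }
                if (forceToDisk) {
                    channel.force(true); // local counterpart of fsync
                }
            }
            return (System.nanoTime() - start) / count;
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        long buffered = averageWriteNanos(false, 4096, 100);
        long forced = averageWriteNanos(true, 4096, 100);
        System.out.printf("buffered write: %.3f ms per 4 KB chunk%n", buffered / 1e6);
        System.out.printf("forced write:   %.3f ms per 4 KB chunk%n", forced / 1e6);
    }
}
```

The absolute numbers depend entirely on the disk, the filesystem and any write cache, so they will not match the datanode figures above; the sketch only shows where the per-call cost comes from.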

-- 
Best Regards,
Haosong Huang
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Monday, August 26, 2013 at 7:07 AM, Andrew Wang wrote:

> 50ms is believable. hsync makes each DN call fsync and wait for acks, so
> you'd expect at least a disk seek time (~10ms) with some extra time
> depending on how much unsync'd data is being written.
> 
> So, just as some back of the envelope math, assuming a disk that can write
> at 100MB/s:
> 
> 50ms - 10ms seek = 40ms writing time
> 100 MB/s * 40ms = 4MB
> 
> If you're hsync'ing every 4MB, 50ms would be exactly what I'd expect.
> 
> Best,
> Andrew
> 
> 
> On Sat, Aug 24, 2013 at 10:11 PM, haosdent <haosdent@gmail.com> wrote:
> 
> > Hi, all. Hadoop has supported hsync, which calls the system's fsync, since
> > 2.0.2. I have tested the performance of hsync() and hflush() again and
> > again, and I found that every hsync() call takes nearly 50 ms
> > while an hflush() call takes just 2 ms. In this slide deck
> > (http://www.slideshare.net/enissoz/hbase-and-hdfs-understanding-filesystem-usage,
> > page 18), the author mentions that hsync() is 2x slower than hflush(). So,
> > is anything wrong? Thank you very much; I look forward to your help.
> > 
> > --
> > Best Regards,
> > Haosong Huang
> > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > 
> 
> 
> 
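
Andrew's back-of-the-envelope estimate quoted above can be written out as a small calculation. The sketch below uses only the numbers from his message (10 ms seek, 100 MB/s sequential write, 50 ms observed) and the same simple model, latency = seek time + data size / throughput. The class and method names are hypothetical, and 1 MB is taken as 10^6 bytes to match the round figures in the thread.

```java
public class HsyncEstimate {

    /** Megabytes that fit in the time left after one seek. */
    static double megabytesWritable(double totalMs, double seekMs, double throughputMBps) {
        double writingMs = totalMs - seekMs;        // 50 ms - 10 ms = 40 ms
        return throughputMBps * writingMs / 1000.0; // 100 MB/s * 0.040 s
    }

    /** Latency this model predicts for syncing `dataMB` megabytes. */
    static double expectedLatencyMs(double seekMs, double dataMB, double throughputMBps) {
        return seekMs + dataMB / throughputMBps * 1000.0;
    }

    public static void main(String[] args) {
        // Andrew's case: how much data explains 50 ms per hsync?
        System.out.println(megabytesWritable(50, 10, 100) + " MB");

        // The 4 KB case from the reply at the top of this message.
        double fourKbInMB = 4096 / 1_000_000.0;
        System.out.println(expectedLatencyMs(10, fourKbInMB, 100) + " ms");
    }
}
```

Under this model a 4 KB write adds only a few hundredths of a millisecond of transfer time, so the model by itself predicts a latency close to one seek. That is consistent with the point made in the reply: at 4 KB per hsync the cost is dominated by the fixed per-sync overhead (paid once for the checksum file and once for the data file), not by the amount of data written.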


