hadoop-common-user mailing list archives

From Raghu Angadi <rang...@yahoo-inc.com>
Subject Re: Datanode block scans
Date Thu, 13 Nov 2008 17:32:16 GMT
Brian Bockelman wrote:
> Hey all,
> I noticed that the maximum throttle for the datanode block scanner is 
> hardcoded at 8MB/s.
> I think this is insufficient; on a fully loaded Sun Thumper, a full scan 
> at 8MB/s would take something like 70 days.
> Is it possible to make this throttle a bit smarter?  At the very least, 
> would anyone object to a patch which exposed this throttle as a config 
> option?  Alternately, a smarter idea would be to throttle the block 
> scanner at (8MB/s) * (# of volumes), under the assumption that there is 
> at least 1 disk per volume.

Making the max configurable seems useful. Either of the above options is 
fine, though the first one might be simpler for configuration.

8MB/s is calculated for around 4TB of data on a node. Given about 80k 
seconds in a day, a full scan takes around 6-7 days. 8-10 MB/s is not 
too bad a load on a 2-4 disk machine.
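
The arithmetic above can be checked with a rough sketch like the one below. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and an exact 86,400-second day, so the 4TB figure comes out slightly under the 6-7 day estimate. The 48TB capacity is a hypothetical value, chosen only because it reproduces the roughly 70-day figure quoted for the Thumper; the original message does not state the machine's capacity.

```python
# Rough scan-time arithmetic for a throttled block scanner.
# Units are decimal (an assumption): 1 MB = 10**6 bytes, 1 TB = 10**12 bytes.
MB = 10**6
TB = 10**12
SECONDS_PER_DAY = 86_400


def scan_days(data_bytes, throttle_bytes_per_sec):
    """Days needed to read data_bytes once at a fixed throttle."""
    return data_bytes / throttle_bytes_per_sec / SECONDS_PER_DAY


# 4 TB on a node at the hardcoded 8 MB/s throttle.
print(round(scan_days(4 * TB, 8 * MB), 1))   # 5.8

# Hypothetical 48 TB node (capacity assumed, not taken from the thread).
print(round(scan_days(48 * TB, 8 * MB), 1))  # 69.4
```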

> Hm... on second thought, however trivial the resulting disk I/O would 
> be, on the Thumper example, the maximum throttle would be 3Gbps: that's 
> a nontrivial load on the bus.
> How do other "big sites" handle this?  We're currently at 110TB raw, are 
> considering converting ~240TB over from another file system, and are 
> planning to grow to 800TB during 2009.  A quick calculation shows that 
> to do a weekly scan at that size, we're talking ~10Gbps of sustained reads.

Do you have 110 TB on a single datanode, and are you moving to 800TB 
nodes? Note that this rate applies to the amount of data on a single 
datanode, not to the whole cluster.
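
To illustrate the difference between the cluster-wide figure and the per-datanode rate, here is a minimal sketch. The 800TB total and the weekly scan period come from the quoted message; the node count of 100 is purely hypothetical, and the data is assumed to be spread evenly across nodes.

```python
# Cluster-wide versus per-datanode read rate for a periodic full scan.
# Units are decimal (an assumption): 1 TB = 10**12 bytes.
TB = 10**12
SECONDS_PER_DAY = 86_400


def sustained_read_gbps(data_bytes, period_days):
    """Average read rate in Gbit/s needed to scan data_bytes once per period."""
    return data_bytes * 8 / (period_days * SECONDS_PER_DAY) / 1e9


def per_node_mb_per_sec(data_bytes, period_days, num_nodes):
    """Per-datanode rate in MB/s, assuming data is spread evenly over nodes."""
    return data_bytes / num_nodes / (period_days * SECONDS_PER_DAY) / 1e6


HYPOTHETICAL_NODES = 100  # not from the thread; for illustration only

# Aggregate rate across the whole cluster for a weekly scan of 800 TB.
print(round(sustained_read_gbps(800 * TB, 7), 1))                      # 10.6

# The same scan seen from one datanode holding 1/100th of the data.
print(round(per_node_mb_per_sec(800 * TB, 7, HYPOTHETICAL_NODES), 1))  # 13.2
```

Under these assumptions the aggregate of about 10 Gbps works out to a per-node rate in the same range as the existing 8MB/s throttle.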


> I still worry that the rate is too low; if we have a suspicious node, or 
> users report a problematic file, waiting a week for a full scan is too 
> long.  I've asked a student to implement a tool which can trigger a full 
> block scan of a path (the idea would be to be able to run "hadoop fsck 
> /path/to/file -deep").  What would be the best approach for him to take 
> to initiate a high-rate "full volume" or "full datanode" scan?
