hadoop-common-dev mailing list archives

From Raghu Angadi <rang...@yahoo-inc.com>
Subject Re: minor change in dataNode handling of multiple directories.
Date Thu, 30 Nov 2006 18:51:47 GMT

Yes, we will retain the existing behavior.


Konstantin Shvachko wrote:
> Good point.
> I think we should document it (Javadoc?), making it a feature rather than
> a side effect.
> Bryan A. P. Pendleton wrote:
>> I would prefer this proposal not be implemented. The current way things work
>> makes it possible to configure, centrally, a list of all directories that
>> _could_ be used for storage. Since there's no easy way to do per-node
>> configurations (nor would it be desirable, IMO, in this case), the
>> directories config ends up being the list of all possibly usable
>> directories. Many of my cluster nodes are configured using "rocksclusters":
>> they will have a uniform set of mounts created, one for each physical drive,
>> at boot/re-install. If I specify in my config the list of all directories up
>> to the maximum number of drives a machine will ever have, then I get easy
>> drop-in use, regardless of variations in nodes in the cluster. I have been
>> relying on the current behavior to keep me sane.
>> OTOH, I wouldn't oppose making this the default behavior, with a
>> configuration param that would set things back to the old behavior.
>> On 11/29/06, Raghu Angadi <rangadi@yahoo-inc.com> wrote:
>>> As part of the "Version upgrade" related changes, we are thinking of strictly
>>> requiring that the datanode be able to lock _all_ the configured directories
>>> instead of any one of them.
>>> Currently, if multiple data directories are specified for a datanode, it
>>> tries to lock a file in each of the directories. If it fails to lock
>>> some of the directories, it will use the ones it could lock.
>>> Looks like this flexibility was included mainly for convenience in
>>> config file.
>>> This might not affect anyone; let us know your opinions.
>>> Note that all directories have the same storage id. So each individual
>>> directory is not complete by itself but a part of one storage.
>>> Raghu.
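
For readers following the thread, the two behaviors under discussion can be sketched roughly as below. This is only an illustration in Python, not the actual datanode code: the lock file name, the function names, and the require_all flag are all hypothetical, and it assumes a POSIX system where fcntl.flock is available. With require_all=False it mimics the existing "use whatever could be locked" behavior that the thread decided to retain; with require_all=True it mimics the stricter proposal.

```python
import fcntl
import os
import tempfile

# Hypothetical lock file name, chosen only for this sketch.
LOCK_FILE_NAME = "in_use.lock"


def try_lock_dir(directory):
    """Return an open, locked file descriptor for the directory, or None."""
    path = os.path.join(directory, LOCK_FILE_NAME)
    try:
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    except OSError:
        return None  # directory is missing or not writable on this node
    try:
        # Non-blocking exclusive lock; fails if another holder exists.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def lock_data_dirs(configured_dirs, require_all=False):
    """Lock the configured directories and return {directory: fd}.

    require_all=False: use every directory that could be locked
                       (the lenient behavior described in the thread).
    require_all=True:  fail unless every directory could be locked
                       (the stricter proposal).
    """
    locked = {}
    for d in configured_dirs:
        fd = try_lock_dir(d)
        if fd is not None:
            locked[d] = fd
    failed = [d for d in configured_dirs if d not in locked]
    if not locked or (require_all and failed):
        for fd in locked.values():
            os.close(fd)  # closing the descriptor releases the lock
        raise OSError(f"could not lock: {failed}")
    return locked


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    disk1 = os.path.join(base, "disk1")
    os.mkdir(disk1)
    disk2 = os.path.join(base, "disk2")  # configured but absent on this node
    usable = lock_data_dirs([disk1, disk2])
    print("usable directories:", list(usable))
```

In the lenient mode, a centrally managed config can list every directory any node might have, and each node simply uses the subset that exists, which is the use case Bryan describes.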
