hadoop-hdfs-dev mailing list archives

From "Vladimir Rozov" <v.ro...@comcast.net>
Subject Re: MiniDFSCluster
Date Fri, 07 Sep 2012 16:20:02 GMT
Agree. It is a regression in 2.x, though even in 1.x a method such as 
corruptBlockOnDataNode() uses the "test.build.data" property instead of the 
instance's data_dir to locate the data directory, and may corrupt blocks on the 
wrong mini cluster if the "test.build.data" property is changed after the cluster is 

> This is a regression. For 1.0 you can change the dir by setting the relevant 
> data directory property, "test.build.dir".
In 2.0 it is also possible to use the "hdfs.minidfs.basedir" configuration 
option to change the base directory. That still does not prevent two clusters 
that use the same base directory from being started at the same time, which 
effectively invalidates one of them. The best option is to change the data 
directory (for example "data", "data1", "data2") for every instance that shares 
the same MiniDFSCluster base directory, or at minimum to fail the second 
instance if the first instance is up and running with the same base directory. 
I'll file a new JIRA for voting once I fix HDFS-3892.
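
To make the two options concrete, here is a minimal sketch. MiniClusterDirs is a hypothetical helper, not part of Hadoop, and it only tracks directories inside a single JVM, which is an assumption about how the tests run. It either hands out a uniquely named base directory or fails fast when a directory is already claimed:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper (not part of Hadoop): tracks which mini cluster base
 * directories are claimed by running clusters inside this JVM.
 */
final class MiniClusterDirs {
    // Base directories currently claimed by a running cluster in this JVM.
    private static final Set<Path> IN_USE = ConcurrentHashMap.newKeySet();

    private MiniClusterDirs() {}

    /** Claims baseDir, or throws if another running cluster already holds it. */
    static Path claim(Path baseDir) {
        Path normalized = baseDir.toAbsolutePath().normalize();
        if (!IN_USE.add(normalized)) {
            throw new IllegalStateException(
                "Base directory already in use: " + normalized);
        }
        return normalized;
    }

    /** Releases baseDir once its cluster has shut down. */
    static void release(Path baseDir) {
        IN_USE.remove(baseDir.toAbsolutePath().normalize());
    }

    /** Creates and claims a fresh, uniquely named directory under parent. */
    static Path claimUnique(Path parent) {
        try {
            Files.createDirectories(parent);
            return claim(Files.createTempDirectory(parent, "minidfs-"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A test would presumably call claimUnique() before building each cluster, pass the returned path as the "hdfs.minidfs.basedir" option, and call release() after shutting the cluster down.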


-----Original Message----- 
From: Steve Loughran
Sent: Friday, September 07, 2012 6:15 AM
To: hdfs-dev@hadoop.apache.org
Subject: Re: MiniDFSCluster

On 5 September 2012 18:42, Vladimir Rozov <v.rozov@comcast.net> wrote:

> There are a few methods on the MiniDFSCluster class that are declared as static
> (getBlockFile, getStorageDirPath), though as long as MiniDFSCluster is not
> a singleton they should be instance methods, not class methods.

These aren't in 1.x; they are new in 2.x, which means that this behaviour
is a regression.

> In my tests I see that starting a second instance of MiniDFSCluster
> invalidates the first instance if I don't change the cluster base directory
> (the existing data directory is fully deleted), but at the same time the static
> declaration of getBlockFile and getStorageDirPath does not allow the base
> directory to be changed without affecting functionality.

This is a regression. For 1.0 you can change the dir by setting the relevant
data directory property, "test.build.dir".

I don't see any reason why the static stuff is really needed. It's used in
various tests, but that could be changed, especially as the static methods
aren't in 1.x.
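
A rough sketch of the difference, using a simplified stand-in class rather than the real MiniDFSCluster: the "data" + index layout and the default path are illustrative assumptions, and only the "test.build.data" property name comes from this thread.

```java
import java.io.File;

/** Simplified stand-in for a mini cluster; not the real MiniDFSCluster. */
final class SketchCluster {
    // Base directory fixed at construction time for this cluster only.
    private final File baseDir;

    SketchCluster(File baseDir) {
        this.baseDir = baseDir;
    }

    /**
     * Static style: re-reads the global property on every call, so the
     * result follows whatever the property was last set to.
     */
    static File staticStorageDir(int dnIndex) {
        // The default value here is illustrative.
        String base = System.getProperty("test.build.data", "build/test/data");
        return new File(base, "data" + dnIndex);
    }

    /** Instance style: always resolves against this cluster's own base directory. */
    File storageDir(int dnIndex) {
        return new File(baseDir, "data" + dnIndex);
    }
}
```

With two clusters created under different base directories, the static lookup points at whichever directory the property names at call time, while each instance lookup stays with its own cluster.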

Why not file a JIRA, and perhaps a patch?
