hadoop-hdfs-user mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: HDFS Federation
Date Thu, 26 May 2016 06:14:05 GMT
You're correct that multi-phase commit protocols seek to address this class of problem.  However,
there is currently no such multi-phase commit protocol implemented within HDFS, and I'd be
reluctant to introduce one at the scale of HDFS.  It would definitely be a very different operational
model for HDFS compared to the current state.
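
For readers unfamiliar with the protocol being discussed, here is a minimal in-memory sketch of a
generic two-phase commit. It is not HDFS code, and HDFS implements nothing like it; the
`Participant` class and `two_phase_commit` function are hypothetical names invented for this
illustration. A real protocol would also need durable logging, timeouts, and recovery from
coordinator failure, which is part of what makes it operationally heavy.

```python
class Participant:
    """Hypothetical stand-in for one party in the transaction (e.g. one server)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether local validation succeeds
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes (and promise to commit if asked) or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2, success path: make the change visible.
        self.state = "committed"

    def abort(self):
        # Phase 2, failure path: discard any prepared work.
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all were aborted."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: unanimous yes, so tell everyone to commit.
        for p in participants:
            p.commit()
        return True
    # Phase 2: at least one no vote, so tell everyone to abort.
    for p in participants:
        p.abort()
    return False
```

With two participants that both vote yes, the function returns `True` and both end in the
`committed` state; if either votes no, both end in `aborted`.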

The situation with rename across NameNodes is somewhat analogous to renames across different
volumes/different mount points in typical local file systems.  It's typical for the OS to
guarantee atomicity of rename, as long as the rename is performed within the same file system.
If multiple file systems on different volumes are mounted, and the rename crosses different
file systems, then typically the rename either degrades to a non-atomic copy-delete or the
call simply fails fast.
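
The two local file system behaviors described above can be sketched in Python on a POSIX system.
This is only an illustration of the analogy, not anything HDFS does. The function names
`rename_or_fail` and `rename_or_copy` are hypothetical. `os.rename` raises `OSError` with
`errno.EXDEV` when the source and destination are on different file systems.

```python
import errno
import os
import shutil


def rename_or_fail(src, dst):
    """Fail-fast variant: atomic within one file system, error across file systems."""
    # On POSIX, a cross-file-system rename raises OSError with errno.EXDEV.
    os.rename(src, dst)


def rename_or_copy(src, dst):
    """Degrading variant: try an atomic rename, else fall back to copy-delete."""
    try:
        os.rename(src, dst)
        return "renamed"
    except OSError as e:
        if e.errno != errno.EXDEV:
            raise  # unrelated failure (permissions, missing file, ...)
        # Non-atomic fallback: other processes can observe both files, or a
        # partially written destination, while this runs.
        shutil.copy2(src, dst)
        os.remove(src)
        return "copied"
```

The standard library's `shutil.move` takes a similar copy-then-delete approach when a plain
rename is not possible.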

--Chris Nauroth

From: Kun Ren <ren.hdfs@gmail.com>
Date: Wednesday, May 25, 2016 at 1:58 PM
To: Chris Nauroth <cnauroth@hortonworks.com>
Cc: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: HDFS Federation

Yes, it is rename. Thanks for the explanation; it makes sense.

The rename example you gave (across NameNodes) is like a distributed transaction. In the
database world, a standard protocol such as two-phase commit is used to support distributed
transactions, so would it be possible to support an "atomic distributed rename" using something
like a two-phase commit protocol? Or, even if it is possible, would the implementation be very
difficult or the performance bad?

On Wed, May 25, 2016 at 4:26 PM, Chris Nauroth <cnauroth@hortonworks.com> wrote:
You might be thinking of renames.  For example, when using ViewFs with Federation, the client-side
mount configuration might have path /data1 backed by NameNode1 and path /data2 backed by NameNode2.
Renaming /data1/file1 to /data2/file1 would not be supported, because it would need to be
copied across NameNodes, and then HDFS would not be able to satisfy its promise that rename
is atomic.  There are more details about this in the ViewFs guide.
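
The mount-table example above can be modeled with a small Python sketch. This is a toy
illustration of the idea, not ViewFs's actual implementation or API: `MOUNT_TABLE`, `resolve`,
`check_rename`, and `CrossMountRenameError` are all hypothetical names, and the paths and
NameNode labels are the ones from the example.

```python
class CrossMountRenameError(Exception):
    """Hypothetical error for a rename whose source and target map to different NameNodes."""


# Toy client-side mount table: mount point -> backing NameNode (labels from the example).
MOUNT_TABLE = {"/data1": "NameNode1", "/data2": "NameNode2"}


def resolve(path, mounts=MOUNT_TABLE):
    """Return the NameNode backing `path`, using the longest matching mount point."""
    matches = [m for m in mounts if path == m or path.startswith(m + "/")]
    if not matches:
        raise KeyError(f"no mount point for {path}")
    return mounts[max(matches, key=len)]


def check_rename(src, dst, mounts=MOUNT_TABLE):
    """Return the single NameNode that can perform the rename, or raise if it spans two."""
    src_nn = resolve(src, mounts)
    dst_nn = resolve(dst, mounts)
    if src_nn != dst_nn:
        # A cross-NameNode rename would need a copy, so it could not be atomic.
        raise CrossMountRenameError(f"{src} is on {src_nn} but {dst} is on {dst_nn}")
    return src_nn
```

In this model, `check_rename("/data1/file1", "/data1/file2")` returns `"NameNode1"`, while
`check_rename("/data1/file1", "/data2/file1")` raises `CrossMountRenameError`.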


--Chris Nauroth

From: Kun Ren <ren.hdfs@gmail.com>
Date: Wednesday, May 25, 2016 at 11:03 AM
To: Chris Nauroth <cnauroth@hortonworks.com>
Cc: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: HDFS Federation

Thanks a lot, Chris.

I remember that Federation doesn't support some cross-NameNode operations. If so, which
operations doesn't Federation support, and why? Thanks again.

On Wed, May 25, 2016 at 12:17 PM, Chris Nauroth <cnauroth@hortonworks.com> wrote:
Hello Kun,

Yes, this command works with federation.  The command would copy the file from NameNode 1/block
pool 1 to NameNode 2/block pool 2.

--Chris Nauroth

From: Kun Ren <ren.hdfs@gmail.com>
Date: Wednesday, May 25, 2016 at 8:57 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: HDFS Federation

Hi Genius,

Does HDFS Federation support cross-NameNode operations?

For example:

./bin/hdfs dfs -cp input1/a.xml input2/b.xml

Suppose that input1 belongs to NameNode 1 and input2 belongs to NameNode 2. Does Federation
support this operation? And if not, why?

