hadoop-hdfs-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: In-Memory Reference FS implementations
Date Thu, 06 Mar 2014 18:07:12 GMT
On 6 March 2014 16:37, Jay Vyas <jayunit100@gmail.com> wrote:

> As part of HADOOP-9361, I'm envisioning this.
> 1) We create in-memory FS implementations of different reference
> FileSystems, each of which specifies appropriate tests and passes those
> tests, i.e.
>    InMemStrictlyConsistentFS (i.e. hdfs)

HDFS provides the filesystem semantics that applications expect; indeed, it
is actually stricter than NFS in terms of its consistency model.

MiniDFSCluster implements this today, and provides the RPC needed for
forked apps to access it.

For example, here's a test that uses YARN to bring up a forked process
bonded to an HDFS mini cluster; that process then starts HBase instances
talking to HDFS.


>    InMemEventuallyConsistentFS (blob stores)
>    InMemMinmalFS (a very minimal guarantee FS, for maybe
> The beauty of this is - it gives us simple, easily testable reference
> implementations that we can base our complex real world file system unit
> tests off of.
I can see the merits of the Blobstore one, so as to demonstrate its

Thinking about it, we are mostly there already, because there's a mock impl
of the org.apache.hadoop.fs.s3native.NativeFileSystemStore interface used
behind the s3n:// class


We could enhance this to give it weaker guarantees (AWS US-east: no
guarantees; US-west: create consistency), and allow a period of time before
new actions become visible, where the actions are create, delete and
overwrite.
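
To make the delayed-visibility idea concrete, here is a rough sketch of what
such a store could look like. It is not the existing mock: the class name
DelayedVisibilityStore, its methods, the createConsistent flag and the
logical clock are all hypothetical, chosen so that a test can advance time
without sleeping. Creates, overwrites and deletes are queued and only applied
once the delay has elapsed; with createConsistent set, a brand-new key is
visible immediately while overwrites and deletes still lag.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory blob store whose mutations become visible after a delay. */
final class DelayedVisibilityStore {
    /** A queued mutation; data == null means "delete". */
    private static final class Pending {
        final String key;
        final byte[] data;
        final long visibleAt;
        Pending(String key, byte[] data, long visibleAt) {
            this.key = key;
            this.data = data;
            this.visibleAt = visibleAt;
        }
    }

    private final Map<String, byte[]> visible = new HashMap<>();
    private final Deque<Pending> pending = new ArrayDeque<>();
    private final long delayMillis;
    private final boolean createConsistent;
    private long now = 0; // logical clock, advanced explicitly by the caller

    DelayedVisibilityStore(long delayMillis, boolean createConsistent) {
        this.delayMillis = delayMillis;
        this.createConsistent = createConsistent;
    }

    /** Moves the logical clock forward. */
    void advance(long millis) {
        now += millis;
    }

    /** Create or overwrite; delayed unless this is a new key and createConsistent is set. */
    void store(String key, byte[] data) {
        applyDue();
        boolean isNewKey = !visible.containsKey(key) && !hasPending(key);
        if (createConsistent && isNewKey) {
            visible.put(key, data.clone());
        } else {
            pending.add(new Pending(key, data.clone(), now + delayMillis));
        }
    }

    /** Deletes are always delayed in this sketch. */
    void delete(String key) {
        pending.add(new Pending(key, null, now + delayMillis));
    }

    /** Returns the currently visible data for the key, or null if none is visible. */
    byte[] retrieve(String key) {
        applyDue();
        return visible.get(key);
    }

    private boolean hasPending(String key) {
        for (Pending p : pending) {
            if (p.key.equals(key)) {
                return true;
            }
        }
        return false;
    }

    /** Applies queued mutations whose delay has elapsed, oldest first. */
    private void applyDue() {
        while (!pending.isEmpty() && pending.peek().visibleAt <= now) {
            Pending p = pending.poll();
            if (p.data == null) {
                visible.remove(p.key);
            } else {
                visible.put(p.key, p.data);
            }
        }
    }

    public static void main(String[] args) {
        DelayedVisibilityStore store = new DelayedVisibilityStore(100, false);
        store.store("a", "v1".getBytes(StandardCharsets.UTF_8));
        System.out.println("visible before delay: " + (store.retrieve("a") != null));
        store.advance(100);
        System.out.println("visible after delay: " + (store.retrieve("a") != null));
    }
}
```

A real version would sit behind the store interface and probably use wall
clock time; the explicit clock here is only to keep tests deterministic.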

We could also allow its methods to take time and maybe fail, so emulating
the storeFile() operation, amongst others. Failure simulation would be
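
As a rough illustration of slow and failing operations, something like the
following could work. Everything here is an assumption made for the sketch:
the class FaultInjectingStore, its constructor parameters and the simplified
storeFile()/retrieve() signatures are hypothetical and do not mirror the real
NativeFileSystemStore interface. A seeded Random keeps failures reproducible
from one test run to the next.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical in-memory store that adds latency and random failures to each call. */
final class FaultInjectingStore {
    private final Map<String, byte[]> blobs = new HashMap<>();
    private final Random random;
    private final double failureProbability;
    private final long latencyMillis;

    FaultInjectingStore(long seed, double failureProbability, long latencyMillis) {
        this.random = new Random(seed); // seeded so that failures are reproducible
        this.failureProbability = failureProbability;
        this.latencyMillis = latencyMillis;
    }

    /** Stores a blob after simulating the remote call; nothing is stored on failure. */
    void storeFile(String key, byte[] data) throws IOException {
        simulateRemoteCall("storeFile " + key);
        blobs.put(key, data.clone());
    }

    /** Returns the blob, or null if the key is absent. */
    byte[] retrieve(String key) throws IOException {
        simulateRemoteCall("retrieve " + key);
        return blobs.get(key);
    }

    /** Sleeps for the configured latency, then fails with the configured probability. */
    private void simulateRemoteCall(String operation) throws IOException {
        if (latencyMillis > 0) {
            try {
                Thread.sleep(latencyMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted during " + operation, e);
            }
        }
        if (random.nextDouble() < failureProbability) {
            throw new IOException("simulated failure in " + operation);
        }
    }

    public static void main(String[] args) {
        FaultInjectingStore store = new FaultInjectingStore(42L, 0.5, 10L);
        for (int i = 0; i < 5; i++) {
            try {
                store.storeFile("key-" + i, new byte[] {(byte) i});
                System.out.println("stored key-" + i);
            } catch (IOException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```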

> 2) Then, downstream vendors can just "pick" which of these file systems
> they are most close to, and modify their particular file system to declare
> semantics using the matching FS as a template.
they get to implement an FS that works like HDFS. If the semantics << HDFS,
well, that's not a filesystem, irrespective of what methods it implements.

The blobstore marker interface is intended to cover that, to warn that
"this is not a real filesystem" -a marker applications can use to assert
that it isn't a "FileSystem" by the standard definition of one -and that
all guarantees are lost.
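
A minimal sketch of how an application might use such a marker, assuming it
is an empty interface that blobstore-backed implementations opt into. The
names BlobstoreMarker, Store, HdfsLikeStore and S3LikeStore are made up for
illustration and are not the actual Hadoop types.

```java
/** Hypothetical demo of checking for a blobstore marker before relying on FS guarantees. */
final class MarkerDemo {
    /** True when the store does not declare itself a blobstore. */
    static boolean isRealFileSystem(Store store) {
        return !(store instanceof BlobstoreMarker);
    }

    /** Fails fast when the caller needs filesystem guarantees the store does not offer. */
    static void requireRealFileSystem(Store store) {
        if (!isRealFileSystem(store)) {
            throw new IllegalStateException(
                store.scheme() + " is a blobstore, not a real filesystem");
        }
    }

    public static void main(String[] args) {
        System.out.println("hdfs-like is a real FS: " + isRealFileSystem(new HdfsLikeStore()));
        System.out.println("s3-like is a real FS: " + isRealFileSystem(new S3LikeStore()));
    }
}

/** Hypothetical marker: carries no methods, only the warning. */
interface BlobstoreMarker {
}

/** Hypothetical stand-in for a filesystem client. */
interface Store {
    String scheme();
}

final class HdfsLikeStore implements Store {
    public String scheme() {
        return "hdfs-like";
    }
}

final class S3LikeStore implements Store, BlobstoreMarker {
    public String scheme() {
        return "s3-like";
    }
}
```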
