hadoop-hdfs-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: In-Memory Reference FS implementations
Date Thu, 06 Mar 2014 21:01:47 GMT
Let's get the HADOOP-9361 stuff in (it lives alongside
FileSystemContractBaseTest) and you can work off that.

On 6 March 2014 18:57, Jay Vyas <jayunit100@gmail.com> wrote:

> Thanks Colin: that's a good example of why we want to unify the HCFS test
> profile. So how can HCFS implementations use the current hadoop-common tests?
> In my mind there are three ways.
> - one solution is to manually cobble together and copy tests, running
> them one by one and seeing which ones apply to their FS. This is what I
> think we do now (extending base contract, main operations tests, overriding
> some methods, ...).

Yes it is. Start there.

> - another solution is that all hadoop filesystems should conform to one
> exact contract.  Is that a pipe dream? Or is it possible?

No, as the native FS and the Hadoop FS
- throw different exceptions
- raise exceptions on seek past end of file at different times (HDFS: on
  seek, file:// on read)
- have different illegal filenames (HDFS 2.3+: ".snapshot"; NTFS: "COM1" to
  "COM9", unless you use the \\.\ unicode prefix)
- have different limits on dir size, depth, filename length
- have different case sensitivity

None of these are explicitly in the FileSystem and FileContext APIs, nor
can they be.
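
To make the seek difference above concrete, here is a minimal sketch. The two in-memory stream classes and the SeekableSketch interface are hypothetical stand-ins written for illustration, not Hadoop classes; they only model "fail on seek" (as described for HDFS) versus "fail on the following read" (as described for file://).

```java
import java.io.EOFException;
import java.io.IOException;

public class SeekSemanticsSketch {

    /** Hypothetical minimal seekable stream; NOT the Hadoop Seekable interface. */
    interface SeekableSketch {
        void seek(long pos) throws IOException;
        int read() throws IOException;
    }

    /** Fails eagerly: seek() past the end throws (the behaviour described for HDFS). */
    static final class EagerStream implements SeekableSketch {
        private final byte[] data;
        private long pos;

        EagerStream(byte[] data) { this.data = data; }

        public void seek(long newPos) throws IOException {
            if (newPos > data.length) {
                throw new EOFException("seek past end of file: " + newPos);
            }
            pos = newPos;
        }

        public int read() {
            return pos < data.length ? data[(int) pos++] & 0xff : -1;
        }
    }

    /** Fails lazily: seek() always succeeds, the next read() throws (described for file://). */
    static final class LazyStream implements SeekableSketch {
        private final byte[] data;
        private long pos;

        LazyStream(byte[] data) { this.data = data; }

        public void seek(long newPos) { pos = newPos; }

        public int read() throws IOException {
            if (pos > data.length) {
                throw new EOFException("read past end of file: " + pos);
            }
            return pos < data.length ? data[(int) pos++] & 0xff : -1;
        }
    }

    /** Returns "seek", "read" or "none" depending on which call raised the exception. */
    static String whereDoesItFail(SeekableSketch in, long pos) {
        try {
            in.seek(pos);
        } catch (IOException e) {
            return "seek";
        }
        try {
            in.read();
        } catch (IOException e) {
            return "read";
        }
        return "none";
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        System.out.println(whereDoesItFail(new EagerStream(data), 10)); // prints "seek"
        System.out.println(whereDoesItFail(new LazyStream(data), 10));  // prints "read"
    }
}
```

A single test that expects the exception from seek() passes against the first stream and fails against the second, which is why one exact contract does not fit both.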

> - a third solution is that we could use a declarative API where file
> system implementations declare which tests or groups of tests they don't
> want to run. That is basically HADOOP-9361.

It does more:

1. It lets filesystems declare strict vs lax exceptions. Strict: detailed
exceptions, like EOFException. Lax: IOException.
2. By declaring behaviours in an XML file in each filesystem's -test.jar,
downstream tests in, say, Bigtop, can read in the same details.
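
As a rough illustration of point 1, a test helper could consult the declared options before deciding which exception type to insist on. This is only a sketch under assumptions: the class, the option key "supports-strict-exceptions" and the in-memory map are hypothetical, and the real patch reads its declarations from the XML file rather than from a map built in code.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class ContractOptionsSketch {

    /** Hypothetical option key, chosen for this example only. */
    static final String STRICT_EXCEPTIONS = "supports-strict-exceptions";

    private final Map<String, Boolean> options;

    ContractOptionsSketch(Map<String, Boolean> options) {
        this.options = new HashMap<>(options);
    }

    /** Undeclared options default to false, i.e. the lax behaviour. */
    boolean isSupported(String key) {
        return options.getOrDefault(key, Boolean.FALSE);
    }

    /**
     * Strict filesystems must raise EOFException for an end-of-file failure;
     * lax filesystems may raise any IOException.
     */
    boolean acceptsAsEof(IOException e) {
        if (isSupported(STRICT_EXCEPTIONS)) {
            return e instanceof EOFException;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Boolean> declared = new HashMap<>();
        declared.put(STRICT_EXCEPTIONS, true);
        ContractOptionsSketch strictFs = new ContractOptionsSketch(declared);
        ContractOptionsSketch laxFs = new ContractOptionsSketch(new HashMap<>());

        IOException generic = new IOException("read failed");
        System.out.println(strictFs.acceptsAsEof(new EOFException())); // prints "true"
        System.out.println(strictFs.acceptsAsEof(generic));            // prints "false"
        System.out.println(laxFs.acceptsAsEof(generic));               // prints "true"
    }
}
```

The same test body then runs against every filesystem, and only the declaration decides how picky the assertion is.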

> - The third approach could be complemented by barebones, simple in-memory
> curated reference implementations that exemplify distilled filesystems with
> certain salient properties (e.g. non-atomic mkdirs)
> > On Mar 6, 2014, at 1:47 PM, Colin McCabe <cmccabe@alumni.cmu.edu> wrote:
> >
> > NetFlix's Apache-licensed S3mper system provides consistency for an
> > S3-backed store.
> > http://techblog.netflix.com/2014/01/s3mper-consistency-in-cloud.html
> >
> > It would be nice to see this or something like it integrated with
> > Hadoop.  I fear that a lot of applications are not ready for eventual
> > consistency, and may never be, leading to the feeling that Hadoop on
> > S3 is buggy.
> >
> > Colin
> >
> >> On Thu, Mar 6, 2014 at 10:42 AM, Jay Vyas <jayunit100@gmail.com> wrote:
> >> Do you consider the native S3 FS a real "reference implementation" for
> >> blob stores? Or just something that, by mere chance, we are able to
> >> use as a ref. impl.?
