hadoop-hdfs-dev mailing list archives

From Jay Vyas <jayunit...@gmail.com>
Subject Re: FileSystem and FileContext Janitor, at your service !
Date Thu, 06 Mar 2014 16:17:53 GMT
steve you mentioned:

>> but to test YARN it has to be visible across processes.

What do you mean by "test YARN"?  I think for unit testing the FileSystem
APIs, we don't care about YARN, do we?





On Thu, Mar 6, 2014 at 6:02 AM, Steve Loughran <stevel@hortonworks.com> wrote:

> On 5 March 2014 19:07, Jay Vyas <jayunit100@gmail.com> wrote:
>
> > Hi HCFS Community :)
> >
> > This is Jay...  Some of you know me. I hack on a broad range of file
> > system and Hadoop ecosystem interoperability stuff.  I just wanted to
> > introduce myself and let you folks know I'm going to be working to help
> > clean up the existing unit testing frameworks for the FileSystem and
> > FileContext APIs.  I've listed some bullets below.
> >
> > - Bytecode-inspection-based code coverage for file system APIs with a
> > tool such as Cobertura.
> >
> > - HADOOP-9361 points out that there are many different types of file
> > systems.
> >
> >
> It adds a lot more structure to the tests with an XML declaration of each
> FS (in the -test) JAR.
>
> It's pretty much complete except for some discrepancies between file:// and
> hdfs that I need to fix in file:
> -handling of mkdirs if the destination exists and is a file (currently:
> returns 0)
> -seek() on a closed stream. Currently appears to work,  at least on OS/X.
>
>
> > - Creating mock file systems which can be used to validate API tests,
> > and which emulate different FS semantics (atomic directory creation,
> > eventual consistency, strict consistency, POSIX compliance, append
> > support, etc.)
> >
>
> That's an interesting thought, adding some inconsistency semantics on top
> of an existing FS to emulate blobstore
> behaviour. How would you do this? An in-memory RAM FS could do some of this,
> but to test YARN it has to be visible across processes.
> We'd really need an in-RAM simulation of semantics that also offered an RPC
> API of some form.
>
>
>
> >
> > Is anyone interested in the above issues, or does anyone have opinions
> > on how / where I should get started?
> >
> > Our end goal is to have a more transparent and portable set of test APIs
> > for Hadoop file system implementors across the board, so that we can
> > all test our individual implementations confidently.
> >
> > So, anywhere I can lend a hand, let me know.  I think this effort will
> > require all of us in the file system community to join forces, and it
> > will benefit us all immensely in the long run as well.
> >
> >
> I should do another '9361 patch, once I get those final quirks in file://
> sorted out so that it is consistent with HDFS.
> 1. HDFS is, and continues to be, the definition of the semantics of all
> filesystem interfaces.
> 2. It'd be good if we understood more about which accidental features of
> the FS code depends on. e.g. does anything rely on mkdirs() being atomic? On
> 0x00 being a valid char in a filename? How do programs fail when blocksize
> is too small (try setting it to 1 and see how Pig reacts)? How much code
> depends on close() being near-instantaneous and never failing? Blobstores
> do their write in close(), and can break both these requirements, which is
> something a mock FS could add atop file:
>
>
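
To make the mock-FS idea above concrete, here is a minimal sketch in plain Java of a store that emulates two blobstore behaviours raised in the thread: the write only happens in close(), which may fail, and a newly written object is not visible immediately. Everything in it is hypothetical and for illustration only: the class name MockBlobStore, its methods, and the logical clock are not part of Hadoop, and it does not implement the Hadoop FileSystem API. It is also single-process and not thread-safe, so it does not address the cross-process visibility or RPC API mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store mimicking two blobstore quirks (not a Hadoop class). */
class MockBlobStore {
    /** A stored object plus the logical time at which readers may first see it. */
    private static final class Entry {
        final byte[] data;
        final long visibleAt;
        Entry(byte[] data, long visibleAt) {
            this.data = data;
            this.visibleAt = visibleAt;
        }
    }

    private final Map<String, Entry> objects = new HashMap<>();
    private final long visibilityLag;      // ticks before a new object becomes visible
    private long clock = 0;                // logical clock, advanced only by tick()
    private boolean closeShouldFail = false;

    MockBlobStore(long visibilityLag) {
        this.visibilityLag = visibilityLag;
    }

    /** Advance the logical clock by one step (stands in for elapsed time). */
    void tick() {
        clock++;
    }

    /** Make the next close() throw, as a failed upload would. */
    void failNextClose() {
        closeShouldFail = true;
    }

    /** Bytes are buffered locally; nothing is stored until the stream is closed. */
    OutputStream create(final String path) {
        return new ByteArrayOutputStream() {
            private boolean closed = false;

            @Override
            public void close() throws IOException {
                if (closed) {
                    return;
                }
                closed = true;
                if (closeShouldFail) {
                    closeShouldFail = false;
                    throw new IOException("simulated upload failure for " + path);
                }
                // The "upload": publish the buffer, visible only after the lag.
                objects.put(path, new Entry(toByteArray(), clock + visibilityLag));
            }
        };
    }

    /** True only once the object has been uploaded and the lag has passed. */
    boolean exists(String path) {
        Entry e = objects.get(path);
        return e != null && clock >= e.visibleAt;
    }

    /** Return a copy of the object's bytes, or fail if it is not visible yet. */
    byte[] read(String path) throws IOException {
        if (!exists(path)) {
            throw new FileNotFoundException(path);
        }
        return objects.get(path).data.clone();
    }
}
```

A test written against such a store could create a file, check that exists() stays false until tick() has been called visibilityLag times after close(), and call failNextClose() to see how client code copes with a close() that throws. A logical clock is used instead of wall-clock sleeps so that such tests stay deterministic.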



-- 
Jay Vyas
http://jayunit100.blogspot.com
