hadoop-hdfs-issues mailing list archives

From "James Clampffer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API
Date Wed, 23 Sep 2015 23:21:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905488#comment-14905488 ]

James Clampffer commented on HDFS-8766:

Thanks for the input!

Re: It looks like that the code needs a lot of clean ups. The code needs to follow the Google
C++ styling guide and be formatted using clang-format.
Agreed. I didn't know we were using automated tools to check the code style; I'll look into
that. What's the biggest motivation for choosing Google's C++ standards on a new project
like this? The main thing that bothers me is the ban on exceptions, particularly in
constructors: I'd rather let RAII handle cleanup and avoid lots of checks at the call site.
I'm not using much RAII at the moment because I'd like to wait until we settle on an object
ownership convention and keep it consistent throughout the library.
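
To make the trade-off concrete, here is a minimal sketch of the two styles. The class names
(ThrowingConn, CheckedConn) and the "empty host" failure are hypothetical and exist only for
illustration; nothing here is taken from the patch.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical class: the constructor throws, so an object that exists is
// always valid and the call site needs no explicit check.
class ThrowingConn {
 public:
  explicit ThrowingConn(const std::string& host) : host_(host) {
    if (host.empty()) throw std::invalid_argument("empty host");
  }
  const std::string& host() const { return host_; }

 private:
  std::string host_;
};

// Hypothetical class in the exception-free style: construction goes through
// a factory that returns nullptr on failure, so every caller must check.
class CheckedConn {
 public:
  static std::unique_ptr<CheckedConn> Create(const std::string& host) {
    if (host.empty()) return nullptr;
    return std::unique_ptr<CheckedConn>(new CheckedConn(host));
  }
  const std::string& host() const { return host_; }

 private:
  explicit CheckedConn(const std::string& host) : host_(host) {}
  std::string host_;
};
```

With the factory style, every call site has to test the returned pointer before using it;
with the throwing constructor the failure path lives in a single catch block.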

Re: I think it's important to have several abstractions. There should be C++ classes that
correspond FileSystem / FileHandle as Bob mentioned.
We should probably spend some time talking about what we want out of the different abstractions
before too much more work is done. My rough ideas for implementing FileSystem / FileHandle,
in order of personal preference, are:
1) Turn hdfs_internal into FileSystem and hdfsFile_internal into FileHandle; this is the approach
I started taking here. It separates the compatibility layer from changes in the underlying
classes and provides simple examples of how to use the lower level async APIs. A nice
bonus here is that it gives a good opportunity to hide some of the type traits and templates
from people who just want a simple object-oriented interface.
2) Take the stuff from compatibility and add it directly to the InputStream/FileSystem objects.
This means less code to maintain, but it pushes a lot of code that isn't too useful for C++11
programmers into the main classes.
3) Implement the FileHandle/FileSystem directly on top of the block_reader, rpc_engine, and
IoService classes. I'd like to avoid this because it seems like a whole lot of code would
be duplicated.
It seems like a perfect time to begin working on HDFS-9115 as a group so we are all on the
same page.
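
As a rough illustration of option 1, the sketch below puts a thin C-linkage layer over two
C++ classes. Only the struct names hdfs_internal and hdfsFile_internal come from the
discussion above. The bodies of FileSystem and FileHandle, the in-memory "file" contents, and
the demo* function names are hypothetical stand-ins, not the hdfs.h API or the code in the
patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the C++ file handle; it reads from an in-memory string.
class FileHandle {
 public:
  explicit FileHandle(std::string data) : data_(std::move(data)), pos_(0) {}

  // Copies up to len bytes into buf, advances the position, returns bytes copied.
  std::size_t Read(char* buf, std::size_t len) {
    const std::size_t n = std::min(len, data_.size() - pos_);
    std::memcpy(buf, data_.data() + pos_, n);
    pos_ += n;
    return n;
  }

 private:
  std::string data_;
  std::size_t pos_;
};

// Hypothetical stand-in for the C++ file system; a real one would talk to the cluster.
class FileSystem {
 public:
  std::unique_ptr<FileHandle> Open(const std::string& path) {
    return std::unique_ptr<FileHandle>(new FileHandle("contents of " + path));
  }
};

// Compatibility layer: opaque structs that own the C++ objects.
struct hdfs_internal { FileSystem fs; };
struct hdfsFile_internal { std::unique_ptr<FileHandle> handle; };
typedef struct hdfs_internal* hdfsFS;
typedef struct hdfsFile_internal* hdfsFile;

extern "C" {
// All demo* names are hypothetical and simplified for this sketch.
hdfsFS demoConnect() { return new hdfs_internal(); }

hdfsFile demoOpenFile(hdfsFS fs, const char* path) {
  if (fs == nullptr || path == nullptr) return nullptr;
  hdfsFile file = new hdfsFile_internal();
  file->handle = fs->fs.Open(path);
  return file;
}

// Returns the number of bytes read, or -1 on invalid arguments.
int demoRead(hdfsFile file, void* buf, int len) {
  if (file == nullptr || buf == nullptr || len < 0) return -1;
  return static_cast<int>(
      file->handle->Read(static_cast<char*>(buf), static_cast<std::size_t>(len)));
}

void demoCloseFile(hdfsFile file) { delete file; }
void demoDisconnect(hdfsFS fs) { delete fs; }
}
```

The C caller only ever sees the two opaque pointers, so the C++ classes can change without
touching the C-facing signatures.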

Re: It should be a generic template to abstract the operations of wrapping async calls to
synchronous calls.
I strongly agree; before I do much more work on this I'm going to write a little macro or
templated class to wrap up async calls.
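
One possible shape for such a wrapper, sketched with std::promise and std::future. The Status
struct, the AsyncAdd operation, and the CallSync name are all hypothetical; the callback
signature (Status, Result) is an assumption about what the async APIs look like, not something
taken from the patch.

```cpp
#include <functional>
#include <future>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Hypothetical status type; the library's own status class may look different.
struct Status {
  bool ok;
  std::string message;
};

// Hypothetical async operation: does its work on another thread and reports
// the result through a callback.
void AsyncAdd(int a, int b, std::function<void(Status, int)> done) {
  std::thread([a, b, done]() { done(Status{true, ""}, a + b); }).detach();
}

// Generic wrapper: `start` receives a completion callback taking
// (Status, Result); CallSync blocks until that callback has been invoked.
template <typename Result, typename AsyncFn>
std::pair<Status, Result> CallSync(AsyncFn&& start) {
  // The promise is shared so it outlives this frame if the callback runs late.
  auto promise = std::make_shared<std::promise<std::pair<Status, Result>>>();
  std::future<std::pair<Status, Result>> future = promise->get_future();
  start([promise](Status status, Result result) {
    promise->set_value(std::make_pair(std::move(status), std::move(result)));
  });
  return future.get();
}
```

A call site would then look like
`CallSync<int>([](std::function<void(Status, int)> done) { AsyncAdd(2, 3, done); })`,
which returns the status and the result as a pair once the callback has fired.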

> Implement a libhdfs(3) compatible API
> -------------------------------------
>                 Key: HDFS-8766
>                 URL: https://issues.apache.org/jira/browse/HDFS-8766
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-8766.HDFS-8707.000.patch
> Add a synchronous API that is compatible with the hdfs.h header used in libhdfs and libhdfs3.
> This will make it possible for projects using libhdfs/libhdfs3 to relink against libhdfspp
> with minimal changes.
> This also provides a pure C interface that can be linked against projects that aren't
> built in C++11 mode for various reasons but use the same compiler.  It also allows many other
> programming languages to access libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier for programs
> built using posix filesystem calls to be modified to access HDFS.

This message was sent by Atlassian JIRA
