hadoop-common-issues mailing list archives

From "Sanjay Radia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6223) New improved FileSystem interface for those implementing new files systems.
Date Tue, 15 Sep 2009 00:06:57 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755259#action_12755259 ]

Sanjay Radia commented on HADOOP-6223:
--------------------------------------

@Doug >I do not yet see a case for rename2.
I understand that Owen presented to you the case for rename2 and option 1 in private discussions
last week and that you were not convinced.

Let me summarize the case for option 1 and rename2 for the benefit of the rest of the community.
Please refer to options 1 and 2 above in this Jira. Note that *both* options
get us to the *same end state*: a new parallel stack where applications call FileContext, which
in turn calls AFS-impls. Option 1 uses the existing FileSystem API and implementation in the
early phase as we migrate from the old stack to the new stack. As I have mentioned above, I have been
going with option 1 instead of option 2; my patches in HADOOP-4952 have been based on option 1. I
did strongly consider option 2 but felt that it raised the risk in this project (details below).

*Does the FileSystem API have to be enhanced to support FileContext?*
Yes. The patches for FileContext (HADOOP-4952) add 3 protected
methods: getInitialWorkingDir(), createAbsPerm() and mkdirsAbsPerm() (btw, in the latest patch
the last two methods were renamed to primitiveCreate() and primitiveMkdirs()).
These 3 methods are all declared protected and hence not visible to applications. Once
we have the full new stack, these methods can be deleted.

*What has this got to do with rename2()?*
It turns out that our rename implementation is broken. Fixing the FileSystem#rename spec would
potentially break applications. Given that we are introducing a new fs API (FileContext), it
has been proposed that we leave the old FileSystem#rename and its spec as is and simply
add a new *protected* method FileSystem#rename2(). Its sole purpose is to support FileContext#rename,
like the other 3 protected methods mentioned above.
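
To make the shape of this proposal concrete, here is a minimal sketch, not the actual patch. The class names (LegacyFileSystem, InMemoryFileSystem, FileContextSketch, Rename2Demo) are hypothetical stand-ins, and the "fixed" semantics shown for rename2 (throw on failure instead of returning false) are purely an assumption for illustration; the real rename2 spec is whatever the Jira settles on. The sketch relies on the fact that a protected method in Java is also visible to classes in the same package, which is how FileContext could call it while applications in other packages could not.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the existing FileSystem class.
abstract class LegacyFileSystem {
    // Old public rename: its spec is left untouched so applications keep working.
    public abstract boolean rename(String src, String dst) throws IOException;

    // New protected hook. ASSUMPTION: the revised behaviour is modelled here as
    // "throw on failure"; the source does not spell out the new semantics.
    protected void rename2(String src, String dst) throws IOException {
        if (!rename(src, dst)) {
            throw new IOException("rename failed: " + src + " -> " + dst);
        }
    }
}

// Toy implementation that tracks paths in a set (illustration only).
class InMemoryFileSystem extends LegacyFileSystem {
    private final Set<String> paths = new HashSet<>();

    void touch(String path) { paths.add(path); }

    boolean exists(String path) { return paths.contains(path); }

    @Override
    public boolean rename(String src, String dst) {
        // Fails if the source is missing or the destination already exists.
        if (!paths.contains(src) || paths.contains(dst)) {
            return false;
        }
        paths.remove(src);
        paths.add(dst);
        return true;
    }
}

// Hypothetical stand-in for FileContext; it lives in the same package,
// so it may call the protected rename2().
class FileContextSketch {
    private final LegacyFileSystem fs;

    FileContextSketch(LegacyFileSystem fs) { this.fs = fs; }

    public void rename(String src, String dst) throws IOException {
        fs.rename2(src, dst);
    }
}

class Rename2Demo {
    public static void main(String[] args) throws IOException {
        InMemoryFileSystem fs = new InMemoryFileSystem();
        fs.touch("/a");
        new FileContextSketch(fs).rename("/a", "/b");
        System.out.println("/b exists: " + fs.exists("/b"));
    }
}
```

Under these assumptions, old callers of rename() see no change, and only the new FileContext path picks up the revised behaviour.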

*Why did you choose to go with option 1 and not option 2?*
Option 1 was easier to get started with because it leveraged the existing FileSystem to the fullest.
AFS, on the other hand, was debated as soon as it started, and option 2 itself was questioned.
I felt that the community needed some time to digest this Jira. Comments from 3 folks is very
little in contrast to the large number of comments in the FileContext Jira. Further, my intuition
told me that there were a number of details to be resolved. The FileSystem design and implementation
are very messy and I didn't want to simply carry forward its design without debate.

Over the weekend, as I explored option 2, my intuition proved correct: here is a list of issues
to be resolved for AFS. While none of them are impossible to solve, they are not trivial either.

* Where should the cache go: in FC or AFS? Is the cache keyed off the config or not? (The
cache in FS seems to be somewhat tied to the config; I think we need to look at that closely.)
The cache has leaked through the FileSystem API; I would like to avoid that for AFS.
* Delete-on-exit: should we raise it to FC or leave it in AFS? There are certain assumptions
made by the current delete-on-exit that seem incorrect and should be revisited.
* What do we do about the public close method?
* Statistics features in FS: where do they go in the new world?

Given the above, I had felt it was wiser to go with option 1 since its only cost is a few
protected methods. Further, even in option 2 these protected methods would have helped us, since they
would have simplified delegation from AFS to FileSystem.

It had always been my goal that as soon as FileContext was committed I would complete
this AFS jira and perhaps even switch from option 1 to option 2 midway if there was sufficient
time.

So far I don't understand the objections to option 1 (and to rename2); protected methods seem
reasonable in this situation. Is this a style issue? If the objections are minor, I feel it
is better to give this AFS jira sufficient time for community discussion and go with option
1. If there are serious objections to option 1 then by all means let's put all the wood behind
the option 2 arrow.

BTW Option 1 would have been completed by this Friday according to our original plan. Option
2 will not be completed by the freeze date on Friday but we have started work on it.


> New improved FileSystem interface for those implementing new files systems.
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-6223
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6223
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Sanjay Radia
>            Assignee: Sanjay Radia
>
> The FileContext API (HADOOP-4952) provides an improved interface for the application writer.
> This lets us simplify the FileSystem API since it will no longer need to deal with notions of default filesystem [ / ], wd, and config
> defaults for blocksize, replication factor etc. Further it will not need the many overloaded methods for create() and open() since
> the FileContext API provides that convenience.
> The FileSystem API can be simplified and can now be restricted to those implementing new file systems.
> This jira proposes that we create new file system API, and deprecate FileSystem API after a few releases.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

