hadoop-common-dev mailing list archives

From "Sanjay Radia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2885) Restructure the hadoop.dfs package
Date Mon, 10 Mar 2008 15:23:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577032#action_12577032 ]

Sanjay Radia commented on HADOOP-2885:

As far as hadoop goes, the interface is fs.FileSystem.
What is the interface of hdfs which implements fs.FileSystem?
* fs.hdfs.DistributedFileSystem
* fs.hdfs.theProtocol

Even though we may consider the above two interfaces to be private, it is worth discussing
which of the two interfaces is hdfs's interface.

For NFS, the wire protocol is the interface.
Proposal 2 would be the most suitable if we consider the HDFS protocol to be the interface.
Proposal 1 would also be okay as long as hdfs supplies 2 jars. Proposal 1 has the advantage
that there can be other impls of the client side wrappers that talk the hdfs protocol
(for example, other wrappers could do client side caching while keeping the protocol the same).
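
To make the "other wrappers over the same protocol" point concrete, here is a minimal sketch. All names in it (ClientProtocol, FileSystemApi, CountingServer, SimpleClient, CachingClient) are hypothetical stand-ins invented for illustration, not the actual hdfs classes or protocol. It only shows the shape of the idea: two client side wrappers expose the same application-facing interface and talk the same protocol, but one adds client side caching.

```java
import java.util.HashMap;
import java.util.Map;

public class ProtocolSketch {
    /** Hypothetical stand-in for the hdfs client protocol (the "wire" interface). */
    interface ClientProtocol {
        String readBlock(String path);
    }

    /** Hypothetical stand-in for fs.FileSystem, the interface apps code against. */
    interface FileSystemApi {
        String read(String path);
    }

    /** Fake in-process server that counts protocol calls so caching is observable. */
    static class CountingServer implements ClientProtocol {
        private final Map<String, String> files = new HashMap<>();
        int calls = 0;

        CountingServer() {
            files.put("/a", "hello");
        }

        public String readBlock(String path) {
            calls++;
            return files.get(path);
        }
    }

    /** Plain wrapper: every read goes over the protocol. */
    static class SimpleClient implements FileSystemApi {
        private final ClientProtocol protocol;

        SimpleClient(ClientProtocol protocol) {
            this.protocol = protocol;
        }

        public String read(String path) {
            return protocol.readBlock(path);
        }
    }

    /** Alternative wrapper: same protocol, plus a client side cache. */
    static class CachingClient implements FileSystemApi {
        private final ClientProtocol protocol;
        private final Map<String, String> cache = new HashMap<>();

        CachingClient(ClientProtocol protocol) {
            this.protocol = protocol;
        }

        public String read(String path) {
            // Only call the protocol on a cache miss.
            return cache.computeIfAbsent(path, protocol::readBlock);
        }
    }

    public static void main(String[] args) {
        CountingServer plainServer = new CountingServer();
        FileSystemApi simple = new SimpleClient(plainServer);
        simple.read("/a");
        simple.read("/a");
        System.out.println("simple client protocol calls: " + plainServer.calls);   // 2

        CountingServer cachedServer = new CountingServer();
        FileSystemApi caching = new CachingClient(cachedServer);
        caching.read("/a");
        caching.read("/a");
        System.out.println("caching client protocol calls: " + cachedServer.calls); // 1
    }
}
```

In this sketch the server side is untouched by the choice of wrapper, which is the sense in which the protocol, rather than the wrapper, would be the stable interface.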

For Posix, libc is the interface. The system calls are like the protocol that libc uses to
talk to the kernel.
Each new version of posix would ship new impls of libc and the system calls. Apps link dynamically
with libc.
In a distributed system, distributing a new wrapper to all clients is hard to do since the
clients are distributed and do not link dynamically with the wrapper.
Jini, for example, provides a way for the clients to pull the new wrapper by means
of dynamic class loading across the wire (there was heated discussion over this in the java [...]).
We have no plans to dynamically load classes across the wire. Nonetheless, the OS view
of its interface is a useful analogy. Proposal 1 would be most suitable for this view.

> Restructure the hadoop.dfs package
> ----------------------------------
>                 Key: HADOOP-2885
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2885
>             Project: Hadoop Core
>          Issue Type: Sub-task
>          Components: dfs
>            Reporter: Sanjay Radia
>            Assignee: Sanjay Radia
>            Priority: Minor
>             Fix For: 0.17.0
>         Attachments: Prototype dfs package.png
> This Jira proposes restructuring the package hadoop.dfs.
> 1. Move all server side and internal protocols (NN-DN etc) to hadoop.dfs.server.*
> 2. Further breakdown of dfs.server.
> - dfs.server.namenode.*
> - dfs.server.datanode.*
> - dfs.server.balancer.*
> - dfs.server.common.* - stuff shared between the various servers
> - dfs.protocol.*  - internal protocol between DN, NN and Balancer etc.
> 3. Client interface:
> - hadoop.dfs.DistributedFileSystem.java
> - hadoop.dfs.ChecksumDistributedFileSystem.java
> - hadoop.dfs.HftpFileSystem.java
> - hadoop.dfs.protocol.* - the client side protocol

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
