From: "Sanjay Radia (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Date: Mon, 10 Mar 2008 12:18:46 -0700 (PDT)
Message-ID: <74229944.1205176726372.JavaMail.jira@brutus>
In-Reply-To: <18157864.1203719359298.JavaMail.jira@brutus>
Subject: [jira] Issue Comment Edited: (HADOOP-2885) Restructure the hadoop.dfs package
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HADOOP-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577032#action_12577032 ]

sanjay.radia edited comment on HADOOP-2885 at 3/10/08 12:16 PM:
----------------------------------------------------------------

As far as Hadoop goes, the interface is fs.FileSystem. What is the interface of HDFS, which implements fs.FileSystem?
* fs.hdfs.DistributedFileSystem
* fs.hdfs.theProtocol

Even though we may consider the above two interfaces to be private, it is worth discussing which of the two is HDFS's interface. (See my note below about whether these two interfaces are considered public or private.)

*Analogy*
For NFS, the wire protocol is the interface. Proposal 2 would be the most suitable if we consider the HDFS protocol to be the interface. Proposal 1 would also be okay as long as HDFS supplies two jars. Proposal 1 has the advantage that there can be other implementations of the client-side wrappers that talk the HDFS protocol (for example, other wrappers could do client-side caching while keeping the protocol the same).

For POSIX, libc is the interface. The system calls are like the protocol that libc uses to talk to the kernel. Each new version of POSIX would ship new implementations of libc and the system calls. Apps link dynamically with libc. In a distributed system, distributing a new wrapper to all clients is hard to do, since the clients are distributed and do not link dynamically with the wrapper. Jini, for example, provides a way for clients to pull the new wrapper by means of dynamic class loading across the wire (there were heated discussions over this in the Java community). We have no plans to dynamically load classes across the wire.
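To ground the libc comparison, here is a minimal sketch (not part of the original comment, and using the era's public API as best as can be recalled) of why fs.FileSystem is the interface as far as client code goes: the application never names DistributedFileSystem or the wire protocol; which concrete implementation it gets is decided by configuration at runtime, much as an application links against libc rather than issuing raw system calls.

{code:java}
// Illustrative sketch only: client code sees the abstract fs.FileSystem,
// never the concrete HDFS classes or the wire protocol.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsClientSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Resolves to DistributedFileSystem, LocalFileSystem, or another
    // implementation based on fs.default.name; invisible to this code.
    FileSystem fs = FileSystem.get(conf);

    Path p = new Path("/tmp/fs-client-sketch.txt");
    FSDataOutputStream out = fs.create(p);
    out.writeUTF("hello, fs.FileSystem");
    out.close();

    FSDataInputStream in = fs.open(p);
    System.out.println(in.readUTF());
    in.close();
  }
}
{code}

With the default configuration this resolves to the local file system; pointing fs.default.name at an hdfs:// URI exercises DistributedFileSystem without changing a line of the client.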
But nonetheless, the OS view of its interface is a useful analogy. Proposal 1 would be most suitable for this view.

BTW, should DistributedFileSystem, DFSClient, and the protocol be public or private interfaces? So far I don't see any reason to make any of these public (although we should make sure that the protocol remains compatible over time).


> Restructure the hadoop.dfs package
> ----------------------------------
>
>                 Key: HADOOP-2885
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2885
>             Project: Hadoop Core
>          Issue Type: Sub-task
>          Components: dfs
>            Reporter: Sanjay Radia
>            Assignee: Sanjay Radia
>            Priority: Minor
>             Fix For: 0.17.0
>
>         Attachments: Prototype dfs package.png
>
>
> This Jira proposes restructuring the package hadoop.dfs.
> 1. Move all server-side and internal protocols (NN-DN, etc.) to hadoop.dfs.server.*
> 2. Further breakdown of dfs.server:
>    - dfs.server.namenode.*
>    - dfs.server.datanode.*
>    - dfs.server.balancer.*
>    - dfs.server.common.* - stuff shared between the various servers
>    - dfs.protocol.* - internal protocol between DN, NN, Balancer, etc.
> 3. Client interface:
>    - hadoop.dfs.DistributedFileSystem.java
>    - hadoop.dfs.ChecksumDistributedFileSystem.java
>    - hadoop.dfs.HftpFileSystem.java
>    - hadoop.dfs.protocol.* - the client-side protocol

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
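A second hedged sketch, also not part of the original message, relating to the public-versus-private question above: applications reach DistributedFileSystem only through the fs.<scheme>.impl configuration mapping, so the class (and everything proposed to move under dfs.server.*) can be relocated or kept non-public without client code ever naming it. The key fs.hdfs.impl and the pre-restructuring class name org.apache.hadoop.dfs.DistributedFileSystem are assumptions based on the hadoop-default.xml of that era.

{code:java}
// Illustrative sketch; assumes a 0.16-era hadoop-core jar on the classpath.
// The fs.<scheme>.impl mapping normally comes from hadoop-default.xml and is
// set here explicitly only to make the indirection visible.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class SchemeBindingSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("fs.hdfs.impl", "org.apache.hadoop.dfs.DistributedFileSystem");

    // FileSystem.get(URI, conf) consults this mapping when handed an hdfs://
    // URI, so the concrete class never has to appear in application code.
    Class<?> impl = conf.getClass("fs.hdfs.impl", null);
    System.out.println("hdfs:// binds to: " + impl.getName());
    System.out.println("is a FileSystem:  " + FileSystem.class.isAssignableFrom(impl));
  }
}
{code}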