Subject: Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk
From: sanjay Radia
In-Reply-To: <4238B028-4B99-418B-AC8F-5A31D6960EA5@gmail.com>
Date: Tue, 13 Feb 2018 18:28:39 -0800
Cc: hdfs-dev, "mapreduce-dev@hadoop.apache.org", "yarn-dev@hadoop.apache.org", "common-dev@hadoop.apache.org"
Message-Id: <88F0C29D-7081-4F2C-B266-8C4FC2EFB3E5@gmail.com>
References: <4238B028-4B99-418B-AC8F-5A31D6960EA5@gmail.com>
To: Yang Weiwei

Sorry, the formatting got messed up by my email client. Here it is again.

Dear Hadoop Community Members,

We had multiple community discussions, a few meetings in smaller groups and also jira discussions with respect to this thread. We express our gratitude for the participation and valuable comments.

The key questions raised were the following:
1) How the new block storage layer and OzoneFS benefit HDFS; we were also asked to chalk out a roadmap towards the goal of a scalable namenode working with the new storage layer.
2) We were asked to provide a security design.
3) There were questions around stability, given that ozone brings in a large body of code.
4) Why can’t they be separate projects forever, or be merged in when production ready?

We have responded to all the above questions with detailed explanations and answers on the jira as well as in the discussions. We believe that should sufficiently address the community’s concerns.

Please see the summary below:

1) The new code base benefits HDFS scaling and a roadmap has been provided.
Summary:
- New block storage layer addresses the scalability of the block layer. We have shown how the existing NN can be connected to the new block layer and its benefits. We have shown 2 milestones; the 1st milestone is much simpler than the 2nd milestone while giving almost the same scaling benefits.
  Originally we had proposed simply milestone 2, and the community felt that removing the FSN/BM lock was a fair amount of work and a simpler solution would be useful.
- We provide a new K-V namespace called Ozone FS with FileSystem/FileContext plugins to allow the users to use the new system. BTW Hive and Spark work very well on KV-namespaces on the cloud. This will facilitate stabilizing the new block layer.
- The new block layer has a new netty based protocol engine in the Datanode which, when stabilized, can be used by the old hdfs block layer. See details below on sharing of code.

2) Stability impact on the existing HDFS code base and code separation. The new block layer and the OzoneFS are in modules that are separate from old HDFS code - currently there are no calls from HDFS into Ozone except for DN starting the new block layer module if configured to do so. It does not add instability (the instability argument has been raised many times). Over time as we share code, we will ensure that the old HDFS continues to remain stable. (For example, we plan to stabilize the new netty based protocol engine in the new block layer before sharing it with HDFS’s old block layer.)

3) In the short term and medium term, the new system and HDFS will be used side-by-side by users: side-by-side usage in the short term for testing, and side-by-side in the medium term for actual production use till the new system has feature parity with old HDFS. During this time, sharing the DN daemon and admin functions between the two systems is operationally important:
- Sharing DN daemon to avoid additional operational daemon lifecycle management.
- Common decommissioning of the daemon and DN: one place to decommission for a node and its storage.
- Replacing failed disks and internal balancing of capacity across disks - this needs to be done for both the current HDFS blocks and the new block-layer blocks.
- Balancer: we would like to use the same balancer and provide a common way to balance and common management of the bandwidth used for balancing.
- Security configuration setup - reuse the existing setup for DNs rather than a new one for an independent cluster.

4) Need to easily share the block layer code between the two systems when used side-by-side. Areas where sharing code is desired over time:
- Sharing the new block layer’s new netty based protocol engine for old HDFS DNs (a long-time sore issue for the HDFS block layer).
- Shallow data copy from old system to new system is practical only if within the same project and daemon; otherwise we have to deal with security settings and coordination across daemons. Shallow copy is useful as customers migrate from old to new.
- Shared disk scheduling in the future, and in the short term a single round robin rather than independent round robins.
While sharing code across projects is technically possible (anything is possible in software), it is significantly harder, typically requiring cleaner public APIs etc. Sharing within a project through internal APIs is often simpler (such as the protocol engine that we want to share).

5) Security design, including a threat model and the solution, has been posted.

6) Temporary separation and merge later: Several of the comments in the jira have argued that we temporarily separate the two code bases for now and then later merge them when the new code is stable:
- If there is agreement to merge later, why bother separating now - there need to be good reasons to separate now. We have addressed the stability and separation of the new code from the existing code above.
- Merging the new code back into HDFS later will be harder.
  ** The code and goals will diverge further.
  ** We will be taking on extra work to split and then extra work to merge.
  ** The issues raised today will be raised all the same then.