Date: Tue, 23 Sep 2014 16:55:54 -0700
Subject: Re: [VOTE] Merge HDFS-6581 to trunk - Writing to replicas in memory.
From: Arpit Agarwal
To: "hdfs-dev@hadoop.apache.org"

I have posted write benchmark results to the Jira.

On Tue, Sep 23, 2014 at 3:41 PM, Arpit Agarwal wrote:

> Hi Andrew, I said "it is not going to be a substantial fraction of memory
> bandwidth". That is certainly not the same as saying it won't be good or
> there won't be any improvement.
>
> Any time you have transfers over RPC or the network stack you will not get
> close to the memory bandwidth even for intra-host transfers.
>
> I'll add some micro-benchmark results to the Jira shortly.
>
> Thanks,
> Arpit
>
> On Tue, Sep 23, 2014 at 2:33 PM, Andrew Wang wrote:
>
>> Hi Arpit,
>>
>> Here is the comment. It was certainly not my intention to misquote anyone.
>>
>> https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14138223&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14138223
>>
>> Quote:
>>
>> It would be nice to see that we could get a substantial fraction of
>> memory bandwidth when writing to a single replica in-memory.
>>
>> The comparison will be interesting but I can tell you without measurement
>> it is not going to be a substantial fraction of memory bandwidth.
>> We are still going through DataTransferProtocol with all the copies and
>> overhead that involves.
>>
>> When the goal is in-memory writes and we are unable to achieve a
>> substantial fraction of memory bandwidth, to me that is "not good
>> performance."
>>
>> I also looked through the subtasks, and AFAICT the only one related to
>> improving this is deferring checksum computation. The benchmarking we did
>> on HDFS-4949 showed that this only really helps when you're down to single
>> copy or zero copies with SCR/ZCR. DTP reads didn't see much of an
>> improvement, so I'd guess the same would be true for DTP writes.
>>
>> I think my above three questions are still open, as well as my question
>> about why we're merging now, as opposed to when the performance of the
>> branch is proven out.
>>
>> Thanks,
>> Andrew
>>
>> On Tue, Sep 23, 2014 at 2:10 PM, Arpit Agarwal wrote:
>>
>> > Andrew, don't misquote me. Can you link the comment where I said
>> > performance wasn't going to be good?
>> >
>> > I will add some preliminary write results to the Jira later today.
>> >
>> > > What's the plan to improve write performance?
>> > I described this in response to your and Colin's comments on the Jira.
>> >
>> > For the benefit of folks not following the Jira, the immediate task we'd
>> > like to get done post-merge is moving checksum computation off the write
>> > path. Also see open subtasks of HDFS-6581 for other planned perf
>> > improvements.
>> >
>> > Thanks,
>> > Arpit
>> >
>> >
>> > On Tue, Sep 23, 2014 at 1:07 PM, Andrew Wang wrote:
>> >
>> > > Hi Arpit,
>> > >
>> > > On HDFS-6581, I asked for write benchmarks on Sep 19th, and you responded
>> > > that the performance wasn't going to be good. However, I thought the
>> > > primary goal of this JIRA was to improve write performance, and write
>> > > performance is listed as the first feature requirement in the design doc.
>> > >
>> > > So, this leads me to a few questions, which I also asked last week on the
>> > > JIRA (I believe still unanswered):
>> > >
>> > > - What's the plan to improve write performance?
>> > > - What kind of performance can we expect after the plan is completed?
>> > > - Can this expected performance be validated with a prototype?
>> > >
>> > > Even with these questions answered, I don't understand the need to merge
>> > > this before the write optimization work is completed. Write perf is listed
>> > > as a feature requirement, so the branch can reasonably be called not
>> > > feature complete until it's shown to be faster.
>> > >
>> > > Thanks,
>> > > Andrew
>> > >
>> > > On Tue, Sep 23, 2014 at 11:47 AM, Jitendra Pandey <jitendra@hortonworks.com>
>> > > wrote:
>> > >
>> > > > +1. I have reviewed most of the code in the branch, and I think it's
>> > > > ready to be merged to trunk.
>> > > >
>> > > >
>> > > > On Mon, Sep 22, 2014 at 5:24 PM, Arpit Agarwal <aagarwal@hortonworks.com>
>> > > > wrote:
>> > > >
>> > > > > HDFS Devs,
>> > > > >
>> > > > > We propose merging the HDFS-6581 development branch to trunk.
>> > > > >
>> > > > > The work adds support to write to HDFS blocks in memory. The target use
>> > > > > case covers applications writing relatively small, intermediate data sets
>> > > > > with low latency. We introduce a new CreateFlag for the existing CreateFile
>> > > > > API. HDFS will subsequently attempt to place replicas of file blocks in
>> > > > > local memory with disk writes occurring off the hot path. The current
>> > > > > design is a simplification of original ideas from Sanjay Radia on
>> > > > > HDFS-5851.
>> > > > >
>> > > > > Key goals of the feature were minimal API changes to reduce application
>> > > > > burden and best effort data durability. The feature is optional and
>> > > > > requires appropriate DN configuration from administrators.
>> > > > >
>> > > > > Design doc:
>> > > > >
>> > > > > https://issues.apache.org/jira/secure/attachment/12661926/HDFSWriteableReplicasInMemory.pdf
>> > > > >
>> > > > > Test plan:
>> > > > >
>> > > > > https://issues.apache.org/jira/secure/attachment/12669452/Test-Plan-for-HDFS-6581-Memory-Storage.pdf
>> > > > >
>> > > > > There are 28 resolved sub-tasks under HDFS-6581, 3 open tasks for
>> > > > > tests+Jenkins issues and 7 open subtasks tracking planned improvements.
>> > > > > The latest merge patch is 3300 lines of changed code of which 1300 lines
>> > > > > are new and updated tests. Merging the branch to trunk will allow HDFS
>> > > > > applications to start evaluating the feature. We will continue work on
>> > > > > documentation, performance tuning and metrics in parallel with the vote
>> > > > > and post-merge.
>> > > > >
>> > > > > Contributors to design and code include Xiaoyu Yao, Sanjay Radia, Jitendra
>> > > > > Pandey, Tassapol Athiapinya, Gopal V, Bikas Saha, Vikram Dixit, Suresh
>> > > > > Srinivas and Chris Nauroth.
>> > > > >
>> > > > > Thanks to Haohui Mai, Colin Patrick McCabe, Andrew Wang, Todd Lipcon, Eric
>> > > > > Baldeschwieler and Vinayakumar B for providing useful feedback on
>> > > > > HDFS-6581, HDFS-5851 and sub-tasks.
>> > > > >
>> > > > > The vote runs for the usual 7 days and will expire at 12am PDT on Sep 30.
>> > > > > Here is my +1 for the merge.
>> > > > >
>> > > > > Regards,
>> > > > > Arpit
>> > > > >
>> > > > > --
>> > > > > CONFIDENTIALITY NOTICE
>> > > > > NOTICE: This message is intended for the use of the individual or entity to
>> > > > > which it is addressed and may contain information that is confidential,
>> > > > > privileged and exempt from disclosure under applicable law.
>> > > > > If the reader of this message is not the intended recipient, you are
>> > > > > hereby notified that any printing, copying, dissemination, distribution,
>> > > > > disclosure or forwarding of this communication is strictly prohibited.
>> > > > > If you have received this communication in error, please contact the
>> > > > > sender immediately and delete it from your system. Thank You.