hadoop-hdfs-dev mailing list archives

From Andrew Wang <andrew.w...@cloudera.com>
Subject Re: [VOTE] Merge HDFS-6581 to trunk - Writing to replicas in memory.
Date Tue, 23 Sep 2014 21:33:54 GMT
Hi Arpit,

Here is the comment. It was certainly not my intention to misquote anyone.

https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14138223&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14138223

Quote:

It would be nice to see that we could get a substantial fraction of
memory bandwidth when writing to a single replica in-memory.

The comparison will be interesting but I can tell you without measurement
it is not going to be a substantial fraction of memory bandwidth. We are
still going through DataTransferProtocol with all the copies and overhead
that involves.

When the goal is in-memory writes and we are unable to achieve a
substantial fraction of memory bandwidth, to me that is "not good
performance."

I also looked through the subtasks, and AFAICT the only one related to
improving this is deferring checksum computation. The benchmarking we did
on HDFS-4949 showed that this only really helps when you're down to a
single copy or zero copies with SCR/ZCR. DTP reads didn't see much of an
improvement, so I'd guess the same would be true for DTP writes.
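[The copies-and-overhead point above can be made concrete with a small experiment: each extra copy on the write path divides achievable application throughput, and checksumming adds per-byte work on top. The sketch below is an editor's illustration, not HDFS code; java.util.zip.CRC32 stands in for the CRC32C that HDFS actually uses, and the packet size and hop counts are arbitrary.]

```java
import java.util.zip.CRC32;

// Rough sketch of why per-byte work on the write path -- extra buffer
// copies plus checksum updates -- caps throughput well below raw memory
// bandwidth. Not HDFS code; CRC32 is a stand-in for HDFS's CRC32C.
public class WritePathCost {

    // One "hop" of the write path: copy the packet, optionally checksum it.
    // Returns the checksum (0 if checksumming is disabled).
    static long copyAndChecksum(byte[] src, byte[] dst, boolean checksum) {
        System.arraycopy(src, 0, dst, 0, src.length);
        if (!checksum) {
            return 0L;
        }
        CRC32 crc = new CRC32();
        crc.update(src, 0, src.length);
        return crc.getValue();
    }

    // Application-visible throughput when every packet traverses `hops`
    // copies; more hops or added checksumming means fewer MB/s delivered.
    static double throughputMBps(int hops, boolean checksum) {
        byte[] src = new byte[8 * 1024 * 1024];   // one 8 MB "packet"
        byte[] dst = new byte[src.length];
        int rounds = 20;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (int h = 0; h < hops; h++) {
                copyAndChecksum(src, dst, checksum);
            }
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return (rounds * (double) src.length / (1024 * 1024)) / seconds;
    }

    public static void main(String[] args) {
        // Absolute numbers are machine-dependent; the ratio is the point.
        System.out.printf("1 copy, no checksum: %.0f MB/s%n", throughputMBps(1, false));
        System.out.printf("3 copies, checksum:  %.0f MB/s%n", throughputMBps(3, true));
    }
}
```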

I think my above three questions are still open, as well as my question
about why we're merging now, as opposed to when the performance of the
branch is proven out.

Thanks,
Andrew

On Tue, Sep 23, 2014 at 2:10 PM, Arpit Agarwal <aagarwal@hortonworks.com>
wrote:

> Andrew, don't misquote me. Can you link the comment where I said
> performance wasn't going to be good?
>
> I will add some preliminary write results to the Jira later today.
>
> > What's the plan to improve write performance?
> I described this in response to your and Colin's comments on the Jira.
>
> For the benefit of folks not following the Jira, the immediate task we'd
> like to get done post-merge is moving checksum computation off the write
> path. Also see open subtasks of HDFS-6581 for other planned perf
> improvements.
>
> Thanks,
> Arpit
>
>
> On Tue, Sep 23, 2014 at 1:07 PM, Andrew Wang <andrew.wang@cloudera.com>
> wrote:
>
> > Hi Arpit,
> >
> > On HDFS-6581, I asked for write benchmarks on Sep 19th, and you responded
> > that the performance wasn't going to be good. However, I thought the
> > primary goal of this JIRA was to improve write performance, and write
> > performance is listed as the first feature requirement in the design doc.
> >
> > So, this leads me to a few questions, which I also asked last week on the
> > JIRA (I believe still unanswered):
> >
> > - What's the plan to improve write performance?
> > - What kind of performance can we expect after the plan is completed?
> > - Can this expected performance be validated with a prototype?
> >
> > Even with these questions answered, I don't understand the need to merge
> > this before the write optimization work is completed. Write perf is
> > listed as a feature requirement, so the branch can reasonably be called
> > not feature complete until it's shown to be faster.
> >
> > Thanks,
> > Andrew
> >
> > On Tue, Sep 23, 2014 at 11:47 AM, Jitendra Pandey <jitendra@hortonworks.com>
> > wrote:
> >
> > > +1. I have reviewed most of the code in the branch, and I think it's
> > > ready to be merged to trunk.
> > >
> > >
> > > On Mon, Sep 22, 2014 at 5:24 PM, Arpit Agarwal <aagarwal@hortonworks.com>
> > > wrote:
> > >
> > > > HDFS Devs,
> > > >
> > > > We propose merging the HDFS-6581 development branch to trunk.
> > > >
> > > > The work adds support for writing HDFS blocks to memory. The target
> > > > use case covers applications writing relatively small, intermediate
> > > > data sets with low latency. We introduce a new CreateFlag for the
> > > > existing CreateFile API. HDFS will subsequently attempt to place
> > > > replicas of file blocks in local memory, with disk writes occurring
> > > > off the hot path. The current design is a simplification of original
> > > > ideas from Sanjay Radia on HDFS-5851.
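["Disk writes occurring off the hot path" is, in general form, a write-back pattern: the write completes against an in-memory copy, and a background thread persists it afterwards. The sketch below is an editor's illustration of that general pattern, not the HDFS-6581 implementation; the class and method names are invented.]

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Generic "lazy persist" write-back pattern: the caller's write completes
// against memory, and the disk write happens later on a background thread.
public class LazyPersistSketch implements AutoCloseable {
    private final ExecutorService persister = Executors.newSingleThreadExecutor();

    // Returns as soon as the data is held in memory; persistence to
    // `target` happens asynchronously ("off the hot path").
    public Future<Path> write(byte[] data, Path target) {
        final byte[] inMemoryReplica = data.clone(); // the hot-path copy
        return persister.submit(() -> {
            Files.write(target, inMemoryReplica);    // deferred disk write
            return target;
        });
    }

    @Override
    public void close() {
        persister.shutdown();
        try {
            persister.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

[In this shape, data that has been acknowledged but not yet persisted is lost on a crash, which is why the proposal promises only best-effort durability.]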
> > > >
> > > > Key goals of the feature were minimal API changes, to reduce the
> > > > burden on applications, and best-effort data durability. The feature
> > > > is optional and requires appropriate DataNode configuration by
> > > > administrators.
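[For concreteness, the DataNode configuration referred to above is plausibly of the following shape. This is a hypothetical hdfs-site.xml fragment: the [RAM_DISK] storage-type tag on dfs.datanode.data.dir and the tmpfs mount point are illustrative assumptions based on the design doc, not confirmed syntax.]

```xml
<!-- Hypothetical illustration: tag a tmpfs mount as RAM_DISK storage
     alongside a regular on-disk directory. Paths are examples only. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/grid/0/hdfs/dn,[RAM_DISK]/mnt/dn-tmpfs</value>
</property>
```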
> > > >
> > > > Design doc:
> > > > https://issues.apache.org/jira/secure/attachment/12661926/HDFSWriteableReplicasInMemory.pdf
> > > >
> > > > Test plan:
> > > > https://issues.apache.org/jira/secure/attachment/12669452/Test-Plan-for-HDFS-6581-Memory-Storage.pdf
> > > >
> > > > There are 28 resolved sub-tasks under HDFS-6581, 3 open tasks for
> > > > test and Jenkins issues, and 7 open subtasks tracking planned
> > > > improvements. The latest merge patch is 3300 lines of changed code,
> > > > of which 1300 lines are new and updated tests. Merging the branch to
> > > > trunk will allow HDFS applications to start evaluating the feature.
> > > > We will continue work on documentation, performance tuning and
> > > > metrics in parallel with the vote and post-merge.
> > > >
> > > > Contributors to design and code include Xiaoyu Yao, Sanjay Radia,
> > > > Jitendra Pandey, Tassapol Athiapinya, Gopal V, Bikas Saha, Vikram
> > > > Dixit, Suresh Srinivas and Chris Nauroth.
> > > >
> > > > Thanks to Haohui Mai, Colin Patrick McCabe, Andrew Wang, Todd Lipcon,
> > > > Eric Baldeschwieler and Vinayakumar B for providing useful feedback
> > > > on HDFS-6581, HDFS-5851 and sub-tasks.
> > > >
> > > > The vote runs for the usual 7 days and will expire at 12am PDT on
> > > > Sep 30. Here is my +1 for the merge.
> > > >
> > > > Regards,
> > > > Arpit
> > > >
> > > > --
> > > > CONFIDENTIALITY NOTICE
> > > > NOTICE: This message is intended for the use of the individual or
> > > > entity to which it is addressed and may contain information that is
> > > > confidential, privileged and exempt from disclosure under applicable
> > > > law. If the reader of this message is not the intended recipient,
> > > > you are hereby notified that any printing, copying, dissemination,
> > > > distribution, disclosure or forwarding of this communication is
> > > > strictly prohibited. If you have received this communication in
> > > > error, please contact the sender immediately and delete it from your
> > > > system. Thank You.
> > >
> > >
> > >
> >
>
