hadoop-hdfs-dev mailing list archives

From Suresh Srinivas <sur...@hortonworks.com>
Subject Re: [VOTE] Merge HDFS-6581 to trunk - Writing to replicas in memory.
Date Wed, 24 Sep 2014 18:13:54 GMT
I am +1 on the merge.

On Wed, Sep 24, 2014 at 11:12 AM, Suresh Srinivas <suresh@hortonworks.com>
wrote:

> Features are developed in a separate branch for two reasons: 1) during
> feature development the branch may not be functional, and 2) the high-level
> approach and design may not yet be clear, and development can continue while
> that is being sorted out. In the case of this feature, clearly (2) is not an
> issue. We have had enough discussion about the approach. I also think this
> branch is ready for merge without rendering trunk non-functional. That
> leaves us with the one objection being raised: the content is not
> complete. This did not prevent the recently merged crypto file system work
> from being merged to trunk. We discussed and decided that it could be merged
> to trunk and that we would finish the remaining must-have work in trunk
> before merging to branch-2. That is a reasonable approach for this feature
> as well. In fact, a lot of bug fixes and improvements are still
> happening on the crypto work even after the branch-2 merge (which is a good
> thing in my opinion; it shows that testing is continuing and the feature is
> still being improved).
>
> As regards the eviction policy, we have had discussion in the jira. I
> disagree that there exists a "usable" or "better" strategy that works well
> for all workloads. I also disagree that an additional policy beyond the
> existing LRU is needed, let alone needed for the trunk merge. New
> eviction policies need to be developed as people experiment with the
> feature. Hence making the eviction policy pluggable is a great way to
> approach it.
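
To make the "pluggable eviction policy" idea concrete, here is a minimal sketch. The `EvictionPolicy` interface and `LruPolicy` class are hypothetical names invented for illustration; they are not the interfaces used in the HDFS-6581 branch. The sketch only assumes that a policy is told about each replica access and is later asked which replica to evict.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;

public class EvictionSketch {
    /** Hypothetical plug-in point (an assumption, not the HDFS-6581 API). */
    interface EvictionPolicy {
        /** Record that the replica for this block was written or read. */
        void onAccess(long blockId);

        /** Remove and return the next replica to evict, or -1 if none is tracked. */
        long chooseVictim();
    }

    /** LRU: the replica that was touched longest ago is evicted first. */
    static final class LruPolicy implements EvictionPolicy {
        // Insertion order doubles as recency order because onAccess re-inserts.
        private final LinkedHashSet<Long> order = new LinkedHashSet<>();

        @Override
        public void onAccess(long blockId) {
            order.remove(blockId); // drop the old position, if any
            order.add(blockId);    // re-add as most recently used
        }

        @Override
        public long chooseVictim() {
            Iterator<Long> it = order.iterator();
            if (!it.hasNext()) {
                return -1L;
            }
            long victim = it.next(); // head of the set is least recently used
            it.remove();
            return victim;
        }
    }

    public static void main(String[] args) {
        EvictionPolicy policy = new LruPolicy();
        policy.onAccess(1);
        policy.onAccess(2);
        policy.onAccess(3);
        policy.onAccess(1); // block 1 becomes most recently used
        System.out.println(policy.chooseVictim()); // prints 2
    }
}
```

With this shape, trying a different strategy means supplying another `EvictionPolicy` implementation without touching the code that calls it.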
>
> There is some testing information posted on the jira, such as
> performance results. But I think we need more details on how this feature
> is being tested. Arpit, can you please post test details? As for me, given
> how much work (both discussion and coding) has gone into it, and given that
> it is a very important feature that allows experimentation with a memory
> tier, I would like to see this available in release 2.6.
>
> On Tue, Sep 23, 2014 at 6:09 PM, Colin McCabe <cmccabe@alumni.cmu.edu>
> wrote:
>
>> This seems like a really aggressive timeframe for a merge.  We still
>> haven't implemented:
>>
>> * Checksum skipping on read and write from lazy persisted replicas.
>> * Allowing mmaped reads from the lazy persisted data.
>> * Any eviction strategy other than LRU.
>> * Integration with cache pool limits (how do HDFS-4949 and lazy
>> persist replicas share memory)?
>> * Eviction from RAM disk via truncation (HDFS-6918)
>> * Metrics
>> * System testing to find out how useful this is, and what the best
>> eviction strategy is.
>>
>> I see why we might want to defer checksum skipping, metrics, allowing
>> mmap, eviction via truncation, and so forth until later.  But I feel
>> like we need to figure out how this will integrate with the memory
>> used by HDFS-4949 before we merge.  I also would like to see another
>> eviction strategy other than LRU, which is a very poor eviction
>> strategy for scanning workloads.  I mentioned this a few times on the
>> JIRA.
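
The point about LRU and scanning workloads can be illustrated with a small simulation. This is a sketch under stated assumptions: `LruScanDemo` and `countHits` are hypothetical names, the cache is a plain access-ordered `LinkedHashMap` rather than anything from HDFS, and the workload is a repeated sequential scan over a fixed set of block ids.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruScanDemo {
    /**
     * Scans block ids 0..distinctBlocks-1 in order, `passes` times, through an
     * LRU cache holding at most `capacity` entries, and counts cache hits.
     */
    static int countHits(final int capacity, int distinctBlocks, int passes) {
        // accessOrder=true makes iteration order least-recently-used first.
        LinkedHashMap<Integer, Boolean> cache =
            new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                    return size() > capacity; // evict the LRU entry when over capacity
                }
            };
        int hits = 0;
        for (int p = 0; p < passes; p++) {
            for (int b = 0; b < distinctBlocks; b++) {
                if (cache.get(b) != null) {
                    hits++;                    // get() also refreshes recency
                } else {
                    cache.put(b, Boolean.TRUE); // miss: load the block
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Working set one block larger than the cache: every access misses.
        System.out.println(countHits(4, 5, 3)); // prints 0
        // Working set fits: every access after the first pass hits.
        System.out.println(countHits(4, 4, 3)); // prints 8
    }
}
```

When the scanned set is even one block larger than the cache, LRU always evicts exactly the block that will be needed next, so the hit count drops to zero.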
>>
>> I'd also like to get some idea of how much testing this has received
>> in a multi-node cluster.  What makes us confident that this is the
>> right time to merge, rather than in a week or two?
>>
>> best,
>> Colin
>>
>>
>> On Tue, Sep 23, 2014 at 4:55 PM, Arpit Agarwal <aagarwal@hortonworks.com>
>> wrote:
>> > I have posted write benchmark results to the Jira.
>> >
>> > On Tue, Sep 23, 2014 at 3:41 PM, Arpit Agarwal <aagarwal@hortonworks.com>
>> > wrote:
>> >
>> >> Hi Andrew, I said "it is not going to be a substantial fraction of memory
>> >> bandwidth". That is certainly not the same as saying it won't be good or
>> >> there won't be any improvement.
>> >>
>> >> Any time you have transfers over RPC or the network stack you will not get
>> >> close to the memory bandwidth even for intra-host transfers.
>> >>
>> >> I'll add some micro-benchmark results to the Jira shortly.
>> >>
>> >> Thanks,
>> >> Arpit
>> >>
>> >> On Tue, Sep 23, 2014 at 2:33 PM, Andrew Wang <andrew.wang@cloudera.com>
>> >> wrote:
>> >>
>> >>> Hi Arpit,
>> >>>
>> >>> Here is the comment. It was certainly not my intention to misquote anyone.
>> >>>
>> >>> https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14138223&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14138223
>> >>>
>> >>> Quote:
>> >>>
>> >>> It would be nice to see that we could get a substantial fraction of
>> >>> memory bandwidth when writing to a single replica in-memory.
>> >>>
>> >>> The comparison will be interesting but I can tell you without measurement
>> >>> it is not going to be a substantial fraction of memory bandwidth. We are
>> >>> still going through DataTransferProtocol with all the copies and overhead
>> >>> that involves.
>> >>>
>> >>> When the goal is in-memory writes and we are unable to achieve a
>> >>> substantial fraction of memory bandwidth, to me that is "not good
>> >>> performance."
>> >>>
>> >>> I also looked through the subtasks, and AFAICT the only one related to
>> >>> improving this is deferring checksum computation. The benchmarking we did
>> >>> on HDFS-4949 showed that this only really helps when you're down to single
>> >>> copy or zero copies with SCR/ZCR. DTP reads didn't see much of an
>> >>> improvement, so I'd guess the same would be true for DTP writes.
>> >>>
>> >>> I think my above three questions are still open, as well as my question
>> >>> about why we're merging now, as opposed to when the performance of the
>> >>> branch is proven out.
>> >>>
>> >>> Thanks,
>> >>> Andrew
>> >>>
>> >>> On Tue, Sep 23, 2014 at 2:10 PM, Arpit Agarwal <aagarwal@hortonworks.com>
>> >>> wrote:
>> >>>
>> >>> > Andrew, don't misquote me. Can you link the comment where I said
>> >>> > performance wasn't going to be good?
>> >>> >
>> >>> > I will add some preliminary write results to the Jira later today.
>> >>> >
>> >>> > > What's the plan to improve write performance?
>> >>> > I described this in response to your and Colin's comments on the Jira.
>> >>> >
>> >>> > For the benefit of folks not following the Jira, the immediate task we'd
>> >>> > like to get done post-merge is moving checksum computation off the write
>> >>> > path. Also see open subtasks of HDFS-6581 for other planned perf
>> >>> > improvements.
>> >>> >
>> >>> > Thanks,
>> >>> > Arpit
>> >>> >
>> >>> >
>> >>> > On Tue, Sep 23, 2014 at 1:07 PM, Andrew Wang <andrew.wang@cloudera.com>
>> >>> > wrote:
>> >>> >
>> >>> > > Hi Arpit,
>> >>> > >
>> >>> > > On HDFS-6581, I asked for write benchmarks on Sep 19th, and you responded
>> >>> > > that the performance wasn't going to be good. However, I thought the
>> >>> > > primary goal of this JIRA was to improve write performance, and write
>> >>> > > performance is listed as the first feature requirement in the design doc.
>> >>> > >
>> >>> > > So, this leads me to a few questions, which I also asked last week on the
>> >>> > > JIRA (I believe still unanswered):
>> >>> > >
>> >>> > > - What's the plan to improve write performance?
>> >>> > > - What kind of performance can we expect after the plan is completed?
>> >>> > > - Can this expected performance be validated with a prototype?
>> >>> > >
>> >>> > > Even with these questions answered, I don't understand the need to merge
>> >>> > > this before the write optimization work is completed. Write perf is listed
>> >>> > > as a feature requirement, so the branch can reasonably be called not
>> >>> > > feature complete until it's shown to be faster.
>> >>> > >
>> >>> > > Thanks,
>> >>> > > Andrew
>> >>> > >
>> >>> > > On Tue, Sep 23, 2014 at 11:47 AM, Jitendra Pandey <jitendra@hortonworks.com>
>> >>> > > wrote:
>> >>> > >
>> >>> > > > +1. I have reviewed most of the code in the branch, and I think it's
>> >>> > > > ready to be merged to trunk.
>> >>> > > >
>> >>> > > >
>> >>> > > > On Mon, Sep 22, 2014 at 5:24 PM, Arpit Agarwal <aagarwal@hortonworks.com>
>> >>> > > > wrote:
>> >>> > > >
>> >>> > > > > HDFS Devs,
>> >>> > > > >
>> >>> > > > > We propose merging the HDFS-6581 development branch to trunk.
>> >>> > > > >
>> >>> > > > > The work adds support to write to HDFS blocks in memory. The target use
>> >>> > > > > case covers applications writing relatively small, intermediate data sets
>> >>> > > > > with low latency. We introduce a new CreateFlag for the existing CreateFile
>> >>> > > > > API. HDFS will subsequently attempt to place replicas of file blocks in
>> >>> > > > > local memory with disk writes occurring off the hot path. The current
>> >>> > > > > design is a simplification of original ideas from Sanjay Radia on
>> >>> > > > > HDFS-5851.
>> >>> > > > >
>> >>> > > > > Key goals of the feature were minimal API changes to reduce application
>> >>> > > > > burden and best effort data durability. The feature is optional and
>> >>> > > > > requires appropriate DN configuration from administrators.
>> >>> > > > >
>> >>> > > > > Design doc:
>> >>> > > > > https://issues.apache.org/jira/secure/attachment/12661926/HDFSWriteableReplicasInMemory.pdf
>> >>> > > > >
>> >>> > > > > Test plan:
>> >>> > > > > https://issues.apache.org/jira/secure/attachment/12669452/Test-Plan-for-HDFS-6581-Memory-Storage.pdf
>> >>> > > > >
>> >>> > > > > There are 28 resolved sub-tasks under HDFS-6581, 3 open tasks for
>> >>> > > > > tests+Jenkins issues and 7 open subtasks tracking planned improvements.
>> >>> > > > > The latest merge patch is 3300 lines of changed code, of which 1300 lines
>> >>> > > > > are new and updated tests. Merging the branch to trunk will allow HDFS
>> >>> > > > > applications to start evaluating the feature. We will continue work on
>> >>> > > > > documentation, performance tuning and metrics in parallel with the vote
>> >>> > > > > and post-merge.
>> >>> > > > >
>> >>> > > > > Contributors to design and code include Xiaoyu Yao, Sanjay Radia, Jitendra
>> >>> > > > > Pandey, Tassapol Athiapinya, Gopal V, Bikas Saha, Vikram Dixit, Suresh
>> >>> > > > > Srinivas and Chris Nauroth.
>> >>> > > > >
>> >>> > > > > Thanks to Haohui Mai, Colin Patrick McCabe, Andrew Wang, Todd Lipcon, Eric
>> >>> > > > > Baldeschwieler and Vinayakumar B for providing useful feedback on
>> >>> > > > > HDFS-6581, HDFS-5851 and sub-tasks.
>> >>> > > > >
>> >>> > > > > The vote runs for the usual 7 days and will expire at 12am PDT on Sep 30.
>> >>> > > > > Here is my +1 for the merge.
>> >>> > > > >
>> >>> > > > > Regards,
>> >>> > > > > Arpit
>> >>> > > > >
>> >>> > > > > --
>> >>> > > > > CONFIDENTIALITY NOTICE
>> >>> > > > > NOTICE: This message is intended for the use of the individual or entity to
>> >>> > > > > which it is addressed and may contain information that is confidential,
>> >>> > > > > privileged and exempt from disclosure under applicable law. If the reader
>> >>> > > > > of this message is not the intended recipient, you are hereby notified that
>> >>> > > > > any printing, copying, dissemination, distribution, disclosure or
>> >>> > > > > forwarding of this communication is strictly prohibited. If you have
>> >>> > > > > received this communication in error, please contact the sender immediately
>> >>> > > > > and delete it from your system. Thank You.
>> >>> > > > >
>> >>> > > >
>> >>> > > >
>> >>> > > >
>> >>> > > > --
>> >>> > > > <http://hortonworks.com/download/>
>> >>> > > >
>> >>> > > >
>> >>> > >
>> >>> >
>> >>> >
>> >>>
>> >>
>> >>
>> >
>>
>
>
>
> --
> http://hortonworks.com/download/
>



-- 
http://hortonworks.com/download/

