hadoop-hdfs-dev mailing list archives

From Andrew Wang <andrew.w...@cloudera.com>
Subject Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
Date Mon, 19 Oct 2015 20:44:50 GMT
I think our plan thus far has been to target this for 3.0. I'm okay with
putting it in branch-2 if we've taken a hard look at compatibility, but I'll
note that 2.8 is already looking like quite a large release, and our release
bandwidth has been focused on the 2.6 and 2.7 maintenance releases. Adding
several hundred more JIRAs to 2.8 might make it too unwieldy to get out the
door. If we bump EC past that, 3.0 might very well be our next release
vehicle. I do plan to revive the 3.0 schedule some time next year. With EC
and JDK8 in a good spot, the only big feature remaining is classpath
isolation.

EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
terms of size and impact it might best belong in a new major release.

Best,
Andrew

On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
vinayakumarb.apache@gmail.com> wrote:

> Does anyone else also think that the feature is ready to go to branch-2
> as well?
>
> It's been more than 2 weeks since EC landed on trunk. IMO it looks quite
> stable since then and ready to go in branch-2.
>
> -Vinay
> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezhang@cloudera.com> wrote:
>
> > Thanks Vinay for capturing the issue and Uma for offering the help.
> >
> > ---
> > Zhe Zhang
> >
> > On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.gangumalla@intel.com> wrote:
> >
> > > Vinay,
> > >
> > >
> > >  I would merge them as part of HDFS-9182.
> > >
> > > Thanks,
> > > Uma
> > >
> > >
> > >
> > > On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakumarb@apache.org> wrote:
> > >
> > > >Hi Andrew,
> > > > > I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
> > > >
> > > > Was this intentional?
> > > >
> > > >Regards,
> > > >Vinay
> > > >
> > > > >On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <andrew.wang@cloudera.com> wrote:
> > > >
> > > > >> Branch has been merged to trunk, thanks again to everyone who worked on
> > > > >> the feature!
> > > >>
> > > > >> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zhezhang@cloudera.com> wrote:
> > > >>
> > > >> > Thanks everyone who has participated in this discussion.
> > > >> >
> > > > >> > With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has
> > > > >> > passed. I will do a final 'git merge' with trunk and work with Andrew
> > > > >> > to merge the branch to trunk. I'll update on this thread when the
> > > > >> > merge is done.
> > > >> >
> > > >> > ---
> > > >> > Zhe Zhang
> > > >> >
> > > > >> > On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi.a.liu@intel.com> wrote:
> > > >> >
> > > >> > > (Change it to binding.)
> > > >> > >
> > > >> > > +1
> > > > >> > > I have been involved in the development and code review on the
> > > > >> > > feature branch. It's a great feature and I think it's ready to merge
> > > > >> > > it into trunk.
> > > >> > >
> > > >> > > Thanks all for the contribution.
> > > >> > >
> > > >> > > Regards,
> > > >> > > Yi Liu
> > > >> > >
> > > >> > >
> > > >> > > -----Original Message-----
> > > >> > > From: Liu, Yi A
> > > >> > > Sent: Friday, September 25, 2015 1:51 PM
> > > >> > > To: hdfs-dev@hadoop.apache.org
> > > > >> > > Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > > >> > >
> > > >> > > +1 (non-binding)
> > > > >> > > I have been involved in the development and code review on the
> > > > >> > > feature branch. It's a great feature and I think it's ready to merge
> > > > >> > > it into trunk.
> > > >> > >
> > > >> > > Thanks all for the contribution.
> > > >> > >
> > > >> > > Regards,
> > > >> > > Yi Liu
> > > >> > >
> > > >> > >
> > > >> > > -----Original Message-----
> > > >> > > From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> > > >> > > Sent: Friday, September 25, 2015 12:21 PM
> > > >> > > To: hdfs-dev@hadoop.apache.org
> > > > >> > > Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > > >> > >
> > > >> > > +1,
> > > >> > >
> > > > >> > > I've been involved starting from design and development of
> > > > >> > > ErasureCoding. I think phase 1 of this development is ready to be
> > > > >> > > merged to trunk. It had come a long way to the current state with
> > > > >> > > significant effort of many Contributors and Reviewers for both
> > > > >> > > design and code.
> > > >> > >
> > > >> > > Thanks Everyone for the efforts.
> > > >> > >
> > > >> > > Regards,
> > > >> > > Vinay
> > > >> > >
> > > > >> > > On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <jing9@apache.org> wrote:
> > > >> > >
> > > >> > > > +1
> > > >> > > >
> > > > >> > > > I've been involved in both development and review on the branch,
> > > > >> > > > and I believe it's now ready to get merged into trunk. Many thanks
> > > > >> > > > to all the contributors and reviewers!
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > -Jing
> > > >> > > >
> > > > >> > > > On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zheng@intel.com> wrote:
> > > >> > > >
> > > >> > > > > Non-binding +1
> > > >> > > > >
> > > > >> > > > > According to our extensive performance tests, striping + ISA-L
> > > > >> > > > > coder based erasure coding not only can save storage, but also
> > > > >> > > > > can increase the throughput of a client or a cluster. It will be
> > > > >> > > > > a great addition to HDFS and its users. Based on the latest
> > > > >> > > > > branch codes, we also observed it's very reliable in the
> > > > >> > > > > concurrent tests. We'll provide the perf test report after it's
> > > > >> > > > > sorted out and hope it helps.
> > > >> > > > > Thanks!
> > > >> > > > >
> > > >> > > > > Regards,
> > > >> > > > > Kai
> > > >> > > > >
> > > >> > > > > -----Original Message-----
> > > >> > > > > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > > >> > > > > Sent: Wednesday, September 23, 2015 8:50 AM
> > > > >> > > > > To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
> > > > >> > > > > Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > > >> > > > >
> > > >> > > > > +1
> > > >> > > > >
> > > > >> > > > > Great addition to HDFS. Thanks all contributors for the nice work.
> > > >> > > > >
> > > >> > > > > Regards,
> > > >> > > > > Uma
> > > >> > > > >
> > > > >> > > > > On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezhang@cloudera.com> wrote:
> > > >> > > > >
> > > >> > > > > >Hi,
> > > >> > > > > >
> > > > >> > > > > >I'd like to propose a vote to merge the HDFS-7285 feature branch
> > > > >> > > > > >back to trunk. Since November 2014 we have been designing and
> > > > >> > > > > >developing this feature under the umbrella JIRAs HDFS-7285 and
> > > > >> > > > > >HADOOP-11264, and have committed approximately 210 patches.
> > > >> > > > > >
> > > > >> > > > > >The HDFS-7285 feature branch was created to support the first
> > > > >> > > > > >phase of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
> > > > >> > > > > >is to significantly reduce storage space usage in HDFS clusters.
> > > > >> > > > > >Instead of always creating 3 replicas of each block with 200%
> > > > >> > > > > >storage space overhead, HDFS-EC provides data durability through
> > > > >> > > > > >parity data blocks. With most EC configurations, the storage
> > > > >> > > > > >overhead is no more than 50%. Based on profiling results of
> > > > >> > > > > >production clusters, we decided to support EC with the striped
> > > > >> > > > > >block layout in the first phase, so that small files can be
> > > > >> > > > > >better handled. This means dividing each logical HDFS file block
> > > > >> > > > > >into smaller units (striping cells) and spreading them on a set
> > > > >> > > > > >of DataNodes in round-robin fashion. Parity cells are generated
> > > > >> > > > > >for each stripe of original data cells. We have made changes to
> > > > >> > > > > >NameNode, client, and DataNode to generalize the block concept
> > > > >> > > > > >and handle the mapping between a logical file block and its
> > > > >> > > > > >internal storage blocks. For further details please see the
> > > > >> > > > > >design doc on HDFS-7285.
> > > > >> > > > > >HADOOP-11264 focuses on providing flexible and high-performance
> > > > >> > > > > >codec calculation support.
> > > >> > > > > >
> > > > >> > > > > >The nightly Jenkins job of the branch has reported several
> > > > >> > > > > >successful runs, and doesn't show new flaky tests compared with
> > > > >> > > > > >trunk. We have posted several versions of the test plan
> > > > >> > > > > >including both unit testing and cluster testing, and have
> > > > >> > > > > >executed most tests in the plan. The most basic functionalities
> > > > >> > > > > >have been extensively tested and verified in several real
> > > > >> > > > > >clusters with different hardware configurations; results have
> > > > >> > > > > >been very stable. We have created follow-on tasks for more
> > > > >> > > > > >advanced error handling and optimization under the umbrella
> > > > >> > > > > >HDFS-8031. We also plan to implement or harden the integration
> > > > >> > > > > >of EC with existing features such as WebHDFS, snapshot, append,
> > > > >> > > > > >truncate, hflush, hsync, and so forth.
> > > >> > > > > >
> > > > >> > > > > >Development of this feature has been a collaboration across many
> > > > >> > > > > >companies and institutions. I'd like to thank J. Andreina,
> > > > >> > > > > >Takanobu Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
> > > > >> > > > > >Maheswara Rao G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin,
> > > > >> > > > > >Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo Nicholas Sze,
> > > > >> > > > > >Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng for
> > > > >> > > > > >their code contributions and reviews. Andrew and Kai Zheng also
> > > > >> > > > > >made fundamental contributions to the initial design. Rui Li,
> > > > >> > > > > >Gao Rui, Kai Sasaki, Kai Zheng and many other contributors have
> > > > >> > > > > >made great efforts in system testing. Many thanks go to Weihua
> > > > >> > > > > >Jiang for proposing the JIRA, and ATM, Todd Lipcon, Silvius Rus,
> > > > >> > > > > >Suresh, as well as many others for providing helpful feedback.
> > > >> > > > > >
> > > > >> > > > > >Following the community convention, this vote will last for 7
> > > > >> > > > > >days (ending September 29th). Votes from Hadoop committers are
> > > > >> > > > > >binding but non-binding votes are very welcome as well. And
> > > > >> > > > > >here's my non-binding +1.
> > > >> > > > > >
> > > >> > > > > >Thanks,
> > > >> > > > > >---
> > > >> > > > > >Zhe Zhang
> > > >> > > > >
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> > >
> >
>
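
For readers of the archive, the storage-overhead figures and the round-robin
striping described in the original vote proposal can be illustrated with a
small sketch. This is not HDFS code. The function names and the (6 data, 3
parity) and (10 data, 4 parity) layouts are assumptions chosen only to make
the arithmetic concrete, and parity generation is left out entirely.

```python
from fractions import Fraction


def replication_overhead(replicas):
    """Extra storage beyond one copy of the data, as a fraction of data size."""
    return Fraction(replicas - 1, 1)


def ec_overhead(data_units, parity_units):
    """Extra storage for parity, as a fraction of data size."""
    return Fraction(parity_units, data_units)


def stripe_cells(data, cell_size, data_units):
    """Split data into cells and assign them round-robin to storage blocks.

    Hypothetical helper: returns one list of cells per internal storage block.
    """
    blocks = [[] for _ in range(data_units)]
    for offset in range(0, len(data), cell_size):
        cell_index = offset // cell_size
        blocks[cell_index % data_units].append(data[offset:offset + cell_size])
    return blocks


if __name__ == "__main__":
    # 3 replicas: two extra copies, i.e. 200% overhead.
    print(float(replication_overhead(3)) * 100)
    # Assumed example layouts: 6+3 gives 50%, 10+4 gives 40%.
    print(float(ec_overhead(6, 3)) * 100)
    print(float(ec_overhead(10, 4)) * 100)
    # Eight bytes, 2-byte cells, 3 storage blocks: the fourth cell wraps around.
    print(stripe_cells(b"abcdefgh", 2, 3))
```

With 2-byte cells over 3 storage blocks, the fourth cell wraps back to the
first block, which is the round-robin placement the proposal describes for
the data cells of each stripe.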
