hadoop-hdfs-dev mailing list archives

From Andrew Wang <andrew.w...@cloudera.com>
Subject Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]
Date Tue, 03 Nov 2015 00:48:43 GMT
If we use an umbrella JIRA to categorize all the ongoing EC work, that will
let us more easily change the target version later. For instance, if we
decide to bump Phase II out of 2.9, then we just need to change the target
version of the Phase II umbrella rather than all the subtasks.

On Mon, Nov 2, 2015 at 4:26 PM, Zheng, Kai <kai.zheng@intel.com> wrote:

> Yeah, so for the issues we recently resolved on trunk and are addressing
> as follow-on tasks in Phase I, we would label them with "erasure coding"
> and maybe also set the target version as "2.9" for the convenience?
>
> -----Original Message-----
> From: Jing Zhao [mailto:jing9@apache.org]
> Sent: Tuesday, November 03, 2015 8:04 AM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285
> (erasure coding) branch to trunk]
>
> +1 for the plan about Phase I & II.
>
> BTW, maybe out of the scope of this thread, just want to mention we should
> either move the jira under HDFS-8031 or update the jira component as
> "erasure-coding" when making further improvement or fixing bugs in EC. In
> this way it will be easier for later backporting EC to 2.9.
>
> On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B
> <vinayakumarb.apache@gmail.com> wrote:
>
> > +1 for the idea.
> > On Nov 3, 2015 07:22, "Zheng, Kai" <kai.zheng@intel.com> wrote:
> >
> > > Sounds good to me. When it's determined to include EC in the 2.9
> > > release, it may be good to have a rough release date as Zhe asked,
> > > so the scope of EC can be worked out accordingly. We still have
> > > quite a few things to do as Phase I follow-on tasks before EC can
> > > be deployed in a production system. Phase II, to develop non-striping
> > > EC for cold data, would possibly be started after that. We might
> > > consider including only Phase I and leaving Phase II for the next
> > > release, according to the rough release date.
> > >
> > > Regards,
> > > Kai
> > >
> > > -----Original Message-----
> > > From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > > Sent: Tuesday, November 03, 2015 5:41 AM
> > > To: hdfs-dev@hadoop.apache.org
> > > Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge
> > > HDFS-7285 (erasure coding) branch to trunk]
> > >
> > > +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we
> > > plan to have 2.8 and 2.9 releases.
> > >
> > > Regards,
> > > Uma
> > >
> > > On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vinodkv@hortonworks.com> wrote:
> > >
> > > >Forking the thread. Started looking at the 2.8 list, various
> > > >features' status, and arrived here.
> > > >
> > > >While I understand the pervasive nature of EC and the need for a
> > > >significant bake-in, moving this to a 3.x release is not a good idea.
> > > >We will surely get a 2.8 out this year and, as needed, I can even
> > > >spend time getting started on a 2.9. OTOH, 3.x is a long way off,
> > > >and given all the incompatibilities there, it would be a while
> > > >before users can get their hands on EC if it were to be only on
> > > >3.x. At best, this may force sites that want EC to backport the
> > > >entire EC feature to older releases; at worst, this will repeat
> > > >the mess of the 0.20 security release forks.
> > > >
> > > >If we think adding this to 2.8 (even if it is switched off) is too
> > > >much risk per our original plan, let's move this to 2.9, thereby
> > > >leaving enough time for stability, integration testing and bake-in,
> > > >and a realistic chance of having it end up on users' clusters soonish.
> > > >
> > > >+Vinod
> > > >
> > > >> On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.wang@cloudera.com> wrote:
> > > >>
> > > >> I think our plan thus far has been to target this for 3.0. I'm
> > > >> okay with putting it in branch-2 if we've given a hard look at
> > > >> compatibility, but I'll note though that 2.8 is already looking
> > > >> like quite a large release, and our release bandwidth has been
> > > >> focused on the 2.6 and 2.7 maintenance releases. Adding another
> > > >> multi-hundred JIRAs to 2.8 might make it too unwieldy to get out
> > > >> the door. If we bump EC past that, 3.0 might very well be our
> > > >> next release vehicle. I do plan to revive the 3.0 schedule some
> > > >> time next year. With EC and JDK8 in a good spot, the only big
> > > >> feature remaining is classpath isolation.
> > > >>
> > > >> EC is also a pretty fundamental change to HDFS. Even if it's
> > > >> compatible, in terms of size and impact it might best belong in a
> > > >> new major release.
> > > >>
> > > >> Best,
> > > >> Andrew
> > > >>
> > > >> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B
> > > >> <vinayakumarb.apache@gmail.com> wrote:
> > > >>
> > > >>> Does anyone else also think the feature is ready to go to
> > > >>> branch-2 as well?
> > > >>>
> > > >>> It's been more than 2 weeks since EC landed on trunk. IMO it looks
> > > >>> quite stable since then and ready to go into branch-2.
> > > >>>
> > > >>> -Vinay
> > > >>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezhang@cloudera.com> wrote:
> > > >>>
> > > >>>> Thanks Vinay for capturing the issue and Uma for offering the help.
> > > >>>>
> > > >>>> ---
> > > >>>> Zhe Zhang
> > > >>>>
> > > >>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma
> > > >>>> <uma.gangumalla@intel.com> wrote:
> > > >>>>
> > > >>>>> Vinay,
> > > >>>>>
> > > >>>>>
> > > >>>>> I would merge them as part of HDFS-9182.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Uma
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakumarb@apache.org> wrote:
> > > >>>>>
> > > >>>>>> Hi Andrew,
> > > >>>>>> I see CHANGES.txt entries not yet merged from
> > > >>>>>> CHANGES-HDFS-EC-7285.txt.
> > > >>>>>>
> > > >>>>>> Was this intentional?
> > > >>>>>>
> > > >>>>>> Regards,
> > > >>>>>> Vinay
> > > >>>>>>
> > > >>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang
> > > >>>>>> <andrew.wang@cloudera.com> wrote:
> > > >>>>>>
> > > >>>>>>> Branch has been merged to trunk, thanks again to everyone
> > > >>>>>>> who worked on the feature!
> > > >>>>>>>
> > > >>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang
> > > >>>>>>> <zhezhang@cloudera.com> wrote:
> > > >>>>>>>
> > > >>>>>>>> Thanks everyone who has participated in this discussion.
> > > >>>>>>>>
> > > >>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this
> > > >>>>>>>> vote has passed. I will do a final 'git merge' with trunk and
> > > >>>>>>>> work with Andrew to merge the branch to trunk. I'll update on
> > > >>>>>>>> this thread when the merge is done.
> > > >>>>>>>>
> > > >>>>>>>> ---
> > > >>>>>>>> Zhe Zhang
> > > >>>>>>>>
> > > >>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A
> > > >>>>>>>> <yi.a.liu@intel.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> (Change it to binding.)
> > > >>>>>>>>>
> > > >>>>>>>>> +1
> > > >>>>>>>>> I have been involved in the development and code review on
> > > >>>>>>>>> the feature branch. It's a great feature and I think it's
> > > >>>>>>>>> ready to merge it into trunk.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks all for the contribution.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Yi Liu
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> -----Original Message-----
> > > >>>>>>>>> From: Liu, Yi A
> > > >>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> > > >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> > > >>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > > >>>>>>>>>
> > > >>>>>>>>> +1 (non-binding)
> > > >>>>>>>>> I have been involved in the development and code review on
> > > >>>>>>>>> the feature branch. It's a great feature and I think it's
> > > >>>>>>>>> ready to merge it into trunk.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks all for the contribution.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Yi Liu
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> -----Original Message-----
> > > >>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
> > > >>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> > > >>>>>>>>> To: hdfs-dev@hadoop.apache.org
> > > >>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > > >>>>>>>>>
> > > >>>>>>>>> +1,
> > > >>>>>>>>>
> > > >>>>>>>>> I've been involved starting from design and development of
> > > >>>>>>>>> ErasureCoding. I think phase 1 of this development is ready
> > > >>>>>>>>> to be merged to trunk. It has come a long way to the current
> > > >>>>>>>>> state with significant effort of many Contributors and
> > > >>>>>>>>> Reviewers for both design and code.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks Everyone for the efforts.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Vinay
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao
> > > >>>>>>>>> <jing9@apache.org> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> +1
> > > >>>>>>>>>>
> > > >>>>>>>>>> I've been involved in both development and review on the
> > > >>>>>>>>>> branch, and I believe it's now ready to get merged into
> > > >>>>>>>>>> trunk. Many thanks to all the contributors and reviewers!
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks,
> > > >>>>>>>>>> -Jing
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai
> > > >>>>>>>>>> <kai.zheng@intel.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Non-binding +1
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> According to our extensive performance tests, striping +
> > > >>>>>>>>>>> ISA-L coder based erasure coding not only can save
> > > >>>>>>>>>>> storage, but also can increase the throughput of a client
> > > >>>>>>>>>>> or a cluster. It will be a great addition to HDFS and its
> > > >>>>>>>>>>> users. Based on the latest branch codes, we also observed
> > > >>>>>>>>>>> it's very reliable in the concurrent tests. We'll provide
> > > >>>>>>>>>>> the perf test report after it's sorted out and hope it
> > > >>>>>>>>>>> helps.
> > > >>>>>>>>>>> Thanks!
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Regards,
> > > >>>>>>>>>>> Kai
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> -----Original Message-----
> > > >>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
> > > >>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> > > >>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
> > > >>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> +1
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the
> > > >>>>>>>>>>> nice work.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Regards,
> > > >>>>>>>>>>> Uma
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezhang@cloudera.com> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
> > > >>>>>>>>>>>> branch back to trunk. Since November 2014 we have been
> > > >>>>>>>>>>>> designing and developing this feature under the umbrella
> > > >>>>>>>>>>>> JIRAs HDFS-7285 and HADOOP-11264, and have committed
> > > >>>>>>>>>>>> approximately 210 patches.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> > > >>>>>>>>>>>> first phase of HDFS erasure coding (HDFS-EC). The
> > > >>>>>>>>>>>> objective of HDFS-EC is to significantly reduce storage
> > > >>>>>>>>>>>> space usage in HDFS clusters. Instead of always creating
> > > >>>>>>>>>>>> 3 replicas of each block with 200% storage space
> > > >>>>>>>>>>>> overhead, HDFS-EC provides data durability through parity
> > > >>>>>>>>>>>> data blocks. With most EC configurations, the storage
> > > >>>>>>>>>>>> overhead is no more than 50%. Based on profiling results
> > > >>>>>>>>>>>> of production clusters, we decided to support EC with the
> > > >>>>>>>>>>>> striped block layout in the first phase, so that small
> > > >>>>>>>>>>>> files can be better handled. This means dividing each
> > > >>>>>>>>>>>> logical HDFS file block into smaller units (striping
> > > >>>>>>>>>>>> cells) and spreading them on a set of DataNodes in
> > > >>>>>>>>>>>> round-robin fashion. Parity cells are generated for each
> > > >>>>>>>>>>>> stripe of original data cells. We have made changes to
> > > >>>>>>>>>>>> NameNode, client, and DataNode to generalize the block
> > > >>>>>>>>>>>> concept and handle the mapping between a logical file
> > > >>>>>>>>>>>> block and its internal storage blocks. For further
> > > >>>>>>>>>>>> details please see the design doc on HDFS-7285.
> > > >>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> > > >>>>>>>>>>>> high-performance codec calculation support.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> The nightly Jenkins job of the branch has reported
> > > >>>>>>>>>>>> several successful runs, and doesn't show new flaky tests
> > > >>>>>>>>>>>> compared with trunk. We have posted several versions of
> > > >>>>>>>>>>>> the test plan including both unit testing and cluster
> > > >>>>>>>>>>>> testing, and have executed most tests in the plan. The
> > > >>>>>>>>>>>> most basic functionalities have been extensively tested
> > > >>>>>>>>>>>> and verified in several real clusters with different
> > > >>>>>>>>>>>> hardware configurations; results have been very stable.
> > > >>>>>>>>>>>> We have created follow-on tasks for more advanced error
> > > >>>>>>>>>>>> handling and optimization under the umbrella HDFS-8031.
> > > >>>>>>>>>>>> We also plan to implement or harden the integration of EC
> > > >>>>>>>>>>>> with existing features such as WebHDFS, snapshot, append,
> > > >>>>>>>>>>>> truncate, hflush, hsync, and so forth.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Development of this feature has been a collaboration
> > > >>>>>>>>>>>> across many companies and institutions. I'd like to thank
> > > >>>>>>>>>>>> J. Andreina, Takanobu Asanuma, Vinayakumar B, Li Bo,
> > > >>>>>>>>>>>> Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi Liu,
> > > >>>>>>>>>>>> Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki,
> > > >>>>>>>>>>>> Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang,
> > > >>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code
> > > >>>>>>>>>>>> contributions and reviews. Andrew and Kai Zheng also made
> > > >>>>>>>>>>>> fundamental contributions to the initial design. Rui Li,
> > > >>>>>>>>>>>> Gao Rui, Kai Sasaki, Kai Zheng and many other
> > > >>>>>>>>>>>> contributors have made great efforts in system testing.
> > > >>>>>>>>>>>> Many thanks go to Weihua Jiang for proposing the JIRA,
> > > >>>>>>>>>>>> and ATM, Todd Lipcon, Silvius Rus, Suresh, as well as
> > > >>>>>>>>>>>> many others for providing helpful feedback.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Following the community convention, this vote will last
> > > >>>>>>>>>>>> for 7 days (ending September 29th). Votes from Hadoop
> > > >>>>>>>>>>>> committers are binding but non-binding votes are very
> > > >>>>>>>>>>>> welcome as well. And here's my non-binding +1.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>> ---
> > > >>>>>>>>>>>> Zhe Zhang
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >
> > >
> > >
> >
>
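
The vote proposal quoted above gives two technical details that a short sketch may make concrete: the storage-overhead arithmetic (200% for 3-way replication versus no more than 50% for most EC configurations) and the round-robin placement of striping cells. The snippet below is only an illustration under stated assumptions. The thread does not name a schema or a cell size, so the RS(6,3) layout and the 64 KB cell are assumptions, and `storage_overhead` and `locate_cell` are hypothetical helper names, not HDFS APIs.

```python
from typing import Tuple


def storage_overhead(data_units: int, redundant_units: int) -> float:
    """Extra storage needed, as a fraction of the original data size."""
    return redundant_units / data_units


def locate_cell(offset: int, cell_size: int, data_units: int) -> Tuple[int, int]:
    """Map a byte offset in a striped logical block to (stripe, data unit).

    Cells are laid out round-robin: cell 0 goes to data unit 0, cell 1 to
    data unit 1, and so on, wrapping to the next stripe after the last unit.
    """
    cell_index = offset // cell_size
    return cell_index // data_units, cell_index % data_units


# 3-way replication: one original copy plus two extra replicas.
replication = storage_overhead(data_units=1, redundant_units=2)

# Assumed RS(6,3) schema: 6 data cells plus 3 parity cells per stripe.
rs_6_3 = storage_overhead(data_units=6, redundant_units=3)

print(f"replication overhead: {replication:.0%}")
print(f"RS(6,3) overhead:     {rs_6_3:.0%}")

# Assumed 64 KB striping cell; find where the 8th cell of a block lands.
CELL_SIZE = 64 * 1024
stripe, unit = locate_cell(offset=7 * CELL_SIZE + 5, cell_size=CELL_SIZE, data_units=6)
print(f"stripe {stripe}, data unit {unit}")
```

Under these assumptions a small file touches only the first few cells of one stripe rather than a whole replicated block, which is the small-file benefit the proposal attributes to the striped layout.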
