hadoop-hdfs-dev mailing list archives

From Haohui Mai <ricet...@gmail.com>
Subject Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]
Date Mon, 02 Nov 2015 23:02:16 GMT
+1 on putting EC on 2.9.

Is it a good time to start the discussion on the issues of releasing 2.8?

~Haohui

On Mon, Nov 2, 2015 at 1:40 PM, Gangumalla, Uma
<uma.gangumalla@intel.com> wrote:
> +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we plan to
> have 2.8 and 2.9 releases.
>
> Regards,
> Uma
>
> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vinodkv@hortonworks.com> wrote:
>
>>Forking the thread. Started looking at the 2.8 list, various features'
>>status and arrived here.
>>
>>While I understand the pervasive nature of EC and a need for a
>>significant bake-in, moving this to a 3.x release is not a good idea. We
>>will surely get a 2.8 out this year and, as needed, I can even spend time
>>getting started on a 2.9. OTOH, 3.x is long ways off, and given all the
>>incompatibilities there, it would be a while before users can get their
>>hands on EC if it were to be only on 3.x. At best, this may force sites
>>that want EC to backport the entire EC feature to older releases, at
>>worst this will repeat the mess of 0.20 security release forks.
>>
>>If we think adding this to 2.8 (even if it is switched off) is too much risk
>>per our original plan, let's move this to 2.9, thereby leaving enough
>>time for stability, integration testing and bake-in, and a realistic
>>chance of having it end up on users' clusters soonish.
>>
>>+Vinod
>>
>>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.wang@cloudera.com> wrote:
>>>
>>> I think our plan thus far has been to target this for 3.0. I'm okay with
>>> putting it in branch-2 if we've given a hard look at compatibility, but
>>> I'll note though that 2.8 is already looking like quite a large release,
>>> and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
>>> releases. Adding another multi-hundred JIRAs to 2.8 might make it too
>>> unwieldy to get out the door. If we bump EC past that, 3.0 might very well
>>> be our next release vehicle. I do plan to revive the 3.0 schedule some time
>>> next year. With EC and JDK8 in a good spot, the only big feature remaining
>>> is classpath isolation.
>>>
>>> EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
>>> terms of size and impact it might best belong in a new major release.
>>>
>>> Best,
>>> Andrew
>>>
>>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
>>> vinayakumarb.apache@gmail.com> wrote:
>>>
>>>> Does anyone else also think that the feature is ready to go to branch-2 as
>>>> well?
>>>>
>>>> It's been more than 2 weeks since EC landed on trunk. IMO it looks quite
>>>> stable since then and ready to go into branch-2.
>>>>
>>>> -Vinay
>>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezhang@cloudera.com> wrote:
>>>>
>>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>>>>
>>>>> ---
>>>>> Zhe Zhang
>>>>>
>>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.gangumalla@intel.com>
>>>>> wrote:
>>>>>
>>>>>> Vinay,
>>>>>>
>>>>>>
>>>>>> I would merge them as part of HDFS-9182.
>>>>>>
>>>>>> Thanks,
>>>>>> Uma
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakumarb@apache.org> wrote:
>>>>>>
>>>>>>> Hi Andrew,
>>>>>>> I see CHANGES.txt entries not yet merged from
>>>>>>> CHANGES-HDFS-EC-7285.txt.
>>>>>>>
>>>>>>> Was this intentional?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vinay
>>>>>>>
>>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <andrew.wang@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Branch has been merged to trunk, thanks again to everyone who worked
>>>>>>>> on the feature!
>>>>>>>>
>>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zhezhang@cloudera.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>>>>
>>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has
>>>>>>>>> passed. I will do a final 'git merge' with trunk and work with Andrew
>>>>>>>>> to merge the branch to trunk. I'll update on this thread when the
>>>>>>>>> merge is done.
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> Zhe Zhang
>>>>>>>>>
>>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi.a.liu@intel.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> (Change it to binding.)
>>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>> I have been involved in the development and code review on the
>>>>>>>>>> feature branch. It's a great feature and I think it's ready to be
>>>>>>>>>> merged into trunk.
>>>>>>>>>>
>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Yi Liu
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Liu, Yi A
>>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>
>>>>>>>>>> +1 (non-binding)
>>>>>>>>>> I have been involved in the development and code review on the
>>>>>>>>>> feature branch. It's a great feature and I think it's ready to be
>>>>>>>>>> merged into trunk.
>>>>>>>>>>
>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Yi Liu
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Vinayakumar B [mailto:vinayakumarb@apache.org]
>>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>
>>>>>>>>>> +1,
>>>>>>>>>>
>>>>>>>>>> I've been involved starting from design and development of
>>>>>>>>>> ErasureCoding. I think phase 1 of this development is ready to be
>>>>>>>>>> merged to trunk. It has come a long way to the current state, with
>>>>>>>>>> significant effort from many contributors and reviewers on both
>>>>>>>>>> design and code.
>>>>>>>>>>
>>>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Vinay
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <jing9@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1
>>>>>>>>>>>
>>>>>>>>>>> I've been involved in both development and review on the branch,
>>>>>>>>>>> and I believe it's now ready to get merged into trunk. Many thanks
>>>>>>>>>>> to all the contributors and reviewers!
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -Jing
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zheng@intel.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Non-binding +1
>>>>>>>>>>>>
>>>>>>>>>>>> According to our extensive performance tests, erasure coding based
>>>>>>>>>>>> on striping and the ISA-L coder not only saves storage, but can
>>>>>>>>>>>> also increase the throughput of a client or a cluster. It will be
>>>>>>>>>>>> a great addition to HDFS and its users. Based on the latest branch
>>>>>>>>>>>> code, we also observed that it's very reliable in the concurrent
>>>>>>>>>>>> tests. We'll provide the perf test report after it's sorted out
>>>>>>>>>>>> and hope it helps.
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Kai
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.gangumalla@intel.com]
>>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-dev@hadoop.apache.org
>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>>
>>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice work.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Uma
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezhang@cloudera.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature branch
>>>>>>>>>>>>> back to trunk. Since November 2014 we have been designing and
>>>>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285 and
>>>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the first
>>>>>>>>>>>>> phase of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
>>>>>>>>>>>>> is to significantly reduce storage space usage in HDFS clusters.
>>>>>>>>>>>>> Instead of always creating 3 replicas of each block with 200%
>>>>>>>>>>>>> storage space overhead, HDFS-EC provides data durability through
>>>>>>>>>>>>> parity data blocks. With most EC configurations, the storage
>>>>>>>>>>>>> overhead is no more than 50%. Based on profiling results of
>>>>>>>>>>>>> production clusters, we decided to support EC with the striped
>>>>>>>>>>>>> block layout in the first phase, so that small files can be
>>>>>>>>>>>>> better handled. This means dividing each logical HDFS file block
>>>>>>>>>>>>> into smaller units (striping cells) and spreading them on a set
>>>>>>>>>>>>> of DataNodes in round-robin fashion. Parity cells are generated
>>>>>>>>>>>>> for each stripe of original data cells. We have made changes to
>>>>>>>>>>>>> NameNode, client, and DataNode to generalize the block concept
>>>>>>>>>>>>> and handle the mapping between a logical file block and its
>>>>>>>>>>>>> internal storage blocks. For further details please see the
>>>>>>>>>>>>> design doc on HDFS-7285.
>>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and high-performance
>>>>>>>>>>>>> codec calculation support.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The nightly Jenkins job of the branch has reported several
>>>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared with
>>>>>>>>>>>>> trunk. We have posted several versions of the test plan including
>>>>>>>>>>>>> both unit testing and cluster testing, and have executed most
>>>>>>>>>>>>> tests in the plan. The most basic functionalities have been
>>>>>>>>>>>>> extensively tested and verified in several real clusters with
>>>>>>>>>>>>> different hardware configurations; results have been very stable.
>>>>>>>>>>>>> We have created follow-on tasks for more advanced error handling
>>>>>>>>>>>>> and optimization under the umbrella HDFS-8031.
>>>>>>>>>>>>> We also plan to implement or harden the integration of EC with
>>>>>>>>>>>>> existing features such as WebHDFS, snapshot, append, truncate,
>>>>>>>>>>>>> hflush, hsync, and so forth.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Development of this feature has been a collaboration across many
>>>>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
>>>>>>>>>>>>> Takanobu Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
>>>>>>>>>>>>> Maheswara Rao G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin,
>>>>>>>>>>>>> Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo Nicholas Sze,
>>>>>>>>>>>>> Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng for
>>>>>>>>>>>>> their code contributions and reviews.
>>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to the
>>>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many
>>>>>>>>>>>>> other contributors have made great efforts in system testing.
>>>>>>>>>>>>> Many thanks go to Weihua Jiang for proposing the JIRA, and ATM,
>>>>>>>>>>>>> Todd Lipcon, Silvius Rus, Suresh, as well as many others for
>>>>>>>>>>>>> providing helpful feedback.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Following the community convention, this vote will last for 7
>>>>>>>>>>>>> days (ending September 29th). Votes from Hadoop committers are
>>>>>>>>>>>>> binding but non-binding votes are very welcome as well. And
>>>>>>>>>>>>> here's my non-binding +1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> Zhe Zhang
