hadoop-common-dev mailing list archives

From 张铎(Duo Zhang) <palomino...@gmail.com>
Subject Re: [DISCUSS] ARM/aarch64 support for Hadoop
Date Tue, 03 Sep 2019 00:49:01 GMT
For HBase, we purged all the protobuf related things from the public API,
and then upgraded to a shaded and relocated version of protobuf. We have
created a repo for this:

https://github.com/apache/hbase-thirdparty

But since the hadoop dependencies still pull in the protobuf 2.5 jars, our
coprocessors are still on protobuf 2.5. We recently opened a discussion on
how to handle upgrading the coprocessors. Glad to see that the hadoop
community is also willing to solve the problem.
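
As a rough illustration of what "shaded and relocated" means here, the
sketch below mimics the class-name rewrite that a relocation step applies
to bytecode references. It is not the hbase-thirdparty build itself: the
class name RelocationSketch, the method relocate, and the relocated prefix
are assumptions made for the example, so check the repository above for
the real package name.

```java
// Sketch only: models the package rename performed by a shading/relocation step.
// RELOCATED_PREFIX is an assumed value for illustration, not taken from the thread.
public class RelocationSketch {
    static final String ORIGINAL_PREFIX = "com.google.protobuf.";
    static final String RELOCATED_PREFIX =
        "org.apache.hbase.thirdparty.com.google.protobuf.";

    // Rewrites a fully qualified class name if it lives under the original prefix;
    // any other class name is returned unchanged.
    static String relocate(String className) {
        if (className.startsWith(ORIGINAL_PREFIX)) {
            return RELOCATED_PREFIX + className.substring(ORIGINAL_PREFIX.length());
        }
        return className;
    }

    public static void main(String[] args) {
        System.out.println(relocate("com.google.protobuf.Message"));
        System.out.println(relocate("org.apache.hadoop.conf.Configuration"));
    }
}
```

Because the relocated classes live under a different package, they can sit
on the same classpath as the protobuf 2.5 jars that the hadoop dependencies
pull in, which is why the two versions do not collide.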

Anu Engineer <aengineer@cloudera.com.invalid> wrote on Tue, Sep 3, 2019 at 1:23 AM:

> +1 for the branch idea. Just FYI, your biggest problem is proving that
> Hadoop and the downstream projects work correctly after you upgrade core
> components like Protobuf.
> So while branching and working on a branch is easy, merging back after you
> upgrade some of these core components is insanely hard. You might want to
> make sure that the community buys into upgrading these components in trunk.
> That way we will get testing, and downstream components will notice when
> things break.
>
> That said, I have lobbied for the upgrade of Protobuf for a really long
> time; I have argued that 2.5 is out of support and that we cannot stay on
> that branch forever, or else we need to take ownership of the Protobuf 2.5
> code base. It has been rightly pointed out to me that while all the
> arguments I make are correct, upgrading Protobuf is a very complicated
> task, and the worst part is that we will not even know what breaks until
> downstream projects pick up these changes and work against us.
>
> If we work off Hadoop version 3 and assume that we have "shading" in
> place for all deployments, it might be possible to get there; it is still
> a daunting task.
>
> So best of luck with the branch approach, but please remember that merging
> back will be hard. Just my 2 cents.
>
> — Anu
>
>
>
>
> On Sun, Sep 1, 2019 at 7:40 PM Zhenyu Zheng <zhengzhenyulixi@gmail.com>
> wrote:
>
> > Hi,
> >
> > Thanks Vinay for bringing this up and thanks Sheng for the idea. A
> > separate branch with its own ARM CI seems a really good idea.
> > By doing this we won't break any of the ongoing development in trunk,
> > and a CI can be a very good way to show what the current problems are
> > and what has been fixed. It will also provide a very good view for
> > contributors who are interested in working on this. We can finally
> > merge the branch back to trunk once the community thinks it is good
> > enough and stable enough. We can donate ARM machines to the existing
> > CI system for the job.
> >
> > I wonder whether this approach is possible?
> >
> > BR,
> >
> > On Thu, Aug 29, 2019 at 11:29 AM Sheng Liu <liusheng2048@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Thanks Vinay for bringing this up. I am a member of the "Openlab"
> > > community mentioned by Vinay. I have been building and testing Hadoop
> > > components on an aarch64 server these days. Besides the missing ARM
> > > platform dependencies (issues #1, #2 and #3) mentioned by Vinay, we
> > > have found other similar issues, such as the "PhantomJS" dependency
> > > package, which is also missing for aarch64.
> > >
> > > To promote ARM support for Hadoop, we have discussed adding an
> > > ARM-specific CI to the Hadoop repo and hope to do so. We are not sure
> > > whether there would be any effect on or conflict with the trunk
> > > branch, so maybe creating an ARM-specific branch for this work is a
> > > better choice. What do you think?
> > >
> > > Hope to hear thoughts from you :)
> > >
> > > BR,
> > > Liu sheng
> > >
> > > Vinayakumar B <vinayakumarb@apache.org> wrote on Tue, Aug 27, 2019 at 5:34 AM:
> > >
> > > > Hi Folks,
> > > >
> > > > ARM has lately become well known for its processing capability and
> > > > has the potential to run big data workloads.
> > > > Many users have been moving to ARM machines due to their low cost.
> > > >
> > > > In the past there were attempts to compile Hadoop on ARM (Raspberry
> > > > Pi) for experimental purposes. Today the ARM architecture is taking
> > > > on some server-side processing as well, so there is (or will be) a
> > > > real need for Hadoop to support the ARM architecture.
> > > >
> > > > There are a bunch of users who are trying to build Hadoop on ARM and
> > > > to add ARM CI to Hadoop, and who are facing issues[1]. Also some
> > > >
> > > > As of today, Hadoop does not compile on ARM due to the issues below,
> > > > found from testing done in openlab in [2].
> > > >
> > > > 1. Protobuf :
> > > > -------------------
> > > >      The Hadoop project (and some downstream projects) is stuck on
> > > > protobuf 2.5.0 for backward-compatibility reasons. Protobuf 2.5.0 is
> > > > no longer maintained by the community. Protobuf 3.x is being widely
> > > > adopted and still provides wire compatibility for proto2 messages,
> > > > but the generated Java code has some compilation issues that can
> > > > cause problems downstream. For this reason the upgrade from protobuf
> > > > 2.5.0 was not taken up.
> > > > From 3.0.0 onwards, Hadoop supports shading of libraries to avoid
> > > > classpath problems in downstream projects.
> > > >     There are patches available to fix compilation in Hadoop, but we
> > > > need to find a way to upgrade protobuf to the latest version and
> > > > still maintain the downstream classpath using the shading feature of
> > > > the Hadoop build.
> > > >
> > > >      There is a Jira for the protobuf upgrade[3] that was created
> > > > even before shading support was added to Hadoop. We now need to
> > > > revisit the Jira and continue exploring possibilities.
> > > >
> > > > 2. leveldbjni:
> > > > ---------------
> > > >     The current leveldbjni used in YARN does not support the ARM
> > > > architecture. We need to check whether any future version supports
> > > > ARM and whether Hadoop can upgrade to that version.
> > > >
> > > >
> > > > 3. hadoop-yarn-csi's dependency 'protoc-gen-grpc-java:1.15.1'
> > > > -------------------------
> > > > 'protoc-gen-grpc-java:1.15.1' does not provide an ARM executable by
> > > > default in the maven repository. The workaround is to build it
> > > > locally and keep it in the local maven repository.
> > > > We need to check whether any future version of
> > > > 'protoc-gen-grpc-java' has an ARM executable and whether
> > > > hadoop-yarn-csi can upgrade to it.
> > > >
> > > >
> > > > Once the compilation issues are solved, there might be many
> > > > native-code issues due to the different architectures.
> > > > So to explore everything, we need to join hands and proceed.
> > > >
> > > >
> > > > Let us discuss and check whether anybody else out there also needs
> > > > Hadoop support on ARM architectures and is ready to lend their hands
> > > > and time to this work.
> > > >
> > > >
> > > > [1] https://issues.apache.org/jira/browse/HADOOP-16358
> > > > [2] https://issues.apache.org/jira/browse/HADOOP-16358?focusedCommentId=16904887&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16904887
> > > > [3] https://issues.apache.org/jira/browse/HADOOP-13363
> > > >
> > > > -Vinay
> > > >
> > >
> >
>
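
Item 3 in Vinay's message says that 'protoc-gen-grpc-java:1.15.1' has no
ARM executable in the maven repository. Native tools like this are
published as per-platform artifacts selected by a classifier. The sketch
below is hypothetical and not part of the Hadoop build: it assumes the
os-maven-plugin naming convention, in which the JVM's os.arch value
"aarch64" is normalized to "aarch_64", to show which classifier a build on
an ARM server would request. The class and method names are invented for
illustration.

```java
import java.util.Locale;

// Sketch only: approximates how a platform classifier is derived from JVM properties.
// The normalization rules are an assumed subset of the os-maven-plugin convention.
public class ClassifierSketch {
    // Maps a raw os.arch value to the normalized architecture name.
    static String normalizeArch(String osArch) {
        String a = osArch.toLowerCase(Locale.ROOT);
        if (a.equals("amd64") || a.equals("x86_64")) {
            return "x86_64";
        }
        if (a.equals("aarch64")) {
            return "aarch_64";
        }
        return a; // anything else is passed through unchanged in this sketch
    }

    // Maps a raw os.name value to the normalized operating system name.
    static String normalizeOs(String osName) {
        String n = osName.toLowerCase(Locale.ROOT);
        if (n.startsWith("linux")) {
            return "linux";
        }
        if (n.startsWith("mac")) {
            return "osx";
        }
        if (n.startsWith("windows")) {
            return "windows";
        }
        return n.replace(' ', '_');
    }

    // Builds the "<os>-<arch>" classifier used to pick a platform-specific artifact.
    static String classifier(String osName, String osArch) {
        return normalizeOs(osName) + "-" + normalizeArch(osArch);
    }

    public static void main(String[] args) {
        // Prints the classifier for the machine this runs on.
        System.out.println(
            classifier(System.getProperty("os.name"), System.getProperty("os.arch")));
    }
}
```

If no artifact is published under the resulting classifier, dependency
resolution fails, which is where the workaround of building the tool
locally and installing it into the local maven repository comes in.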
