hadoop-hdfs-dev mailing list archives

From Ayush Saxena <ayush...@gmail.com>
Subject Re: [DISCUSS] ARM/aarch64 support for Hadoop
Date Wed, 04 Sep 2019 08:39:10 GMT
Thanks, Vinay, for the initiative. It makes sense to add support for different architectures.

+1 for the branch idea.
Good luck!

-Ayush

> On 03-Sep-2019, at 6:19 AM, 张铎(Duo Zhang) <palomino219@gmail.com> wrote:
> 
> For HBase, we purged all the protobuf related things from the public API,
> and then upgraded to a shaded and relocated version of protobuf. We have
> created a repo for this:
> 
> https://github.com/apache/hbase-thirdparty
> 
> But since the Hadoop dependencies still pull in the protobuf 2.5 jars, our
> coprocessors are still on protobuf 2.5. Recently we opened a discussion
> on how to handle upgrading the coprocessors. Glad to see that the
> Hadoop community is also willing to solve the problem.
> 
> On Tue, Sep 3, 2019 at 1:23 AM Anu Engineer <aengineer@cloudera.com.invalid> wrote:
> 
>> +1 for the branch idea. Just FYI, your biggest problem is proving that
>> Hadoop and the downstream projects work correctly after you upgrade core
>> components like Protobuf.
>> So while branching and working on a branch is easy, merging back after you
>> upgrade some of these core components is insanely hard. You might want to
>> make sure that the community buys into upgrading these components in trunk.
>> That way we will get testing, and downstream components will notice when
>> things break.
>> 
>> That said, I have lobbied for the upgrade of Protobuf for a really long
>> time. I have argued that 2.5 is out of support and that we cannot stay on
>> that branch forever, or else we need to take ownership of the Protobuf 2.5
>> code base. It has been rightly pointed out to me that while all the
>> arguments I make are correct, upgrading Protobuf is a very complicated
>> task, and the worst part is that we will not even know what breaks until
>> downstream projects pick up these changes and work against us.
>> 
>> If we work off Hadoop version 3 and assume that we have "shading" in
>> place for all deployments, it might be possible to get there; it is still
>> a daunting task.
>>
>> So best of luck with the branch approach, but please remember that merging
>> back will be hard. Just my 2 cents.
>> 
>> — Anu
>> 
>> 
>> 
>> 
>> On Sun, Sep 1, 2019 at 7:40 PM Zhenyu Zheng <zhengzhenyulixi@gmail.com>
>> wrote:
>> 
>>> Hi,
>>> 
>>> Thanks Vinay for bringing this up, and thanks Sheng for the idea. A
>>> separate branch with its own ARM CI seems like a really good idea.
>>> By doing this we won't break any of the ongoing development in trunk,
>>> and a CI can be a very good way to show what the current problems are
>>> and what has been fixed. It will also provide a very good view for
>>> contributors who are interested in working on this. We can merge the
>>> branch back to trunk once the community thinks it is good enough and
>>> stable enough. We can donate ARM machines to the existing CI system for
>>> the job.
>>>
>>> I wonder if this approach is possible?
>>> 
>>> BR,
>>> 
>>> On Thu, Aug 29, 2019 at 11:29 AM Sheng Liu <liusheng2048@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Thanks Vinay for bringing this up. I am a member of the "Openlab"
>>>> community mentioned by Vinay. I have been building and testing Hadoop
>>>> components on an aarch64 server these days. Besides the missing ARM
>>>> platform dependencies (issues #1, #2 and #3 mentioned by Vinay), other
>>>> similar issues have been found; for example, the "PhantomJS" dependency
>>>> package is also missing for aarch64.
>>>>
>>>> To promote ARM support for Hadoop, we have discussed adding an
>>>> ARM-specific CI to the Hadoop repo. We are not sure whether this would
>>>> have any effect on, or conflict with, the trunk branch, so maybe
>>>> creating an ARM-specific branch for this work is a better choice. What
>>>> do you think?
>>>> 
>>>> Hope to hear thoughts from you :)
>>>> 
>>>> BR,
>>>> Liu sheng
>>>> 
>>>> On Tue, Aug 27, 2019 at 5:34 AM Vinayakumar B <vinayakumarb@apache.org> wrote:
>>>> 
>>>>> Hi Folks,
>>>>> 
>>>>> ARM has lately become well known for its processing capability and has
>>>>> the potential to run big data workloads.
>>>>> Many users have been moving to ARM machines because of their low cost.
>>>>>
>>>>> In the past there were attempts to compile Hadoop on ARM (Raspberry Pi)
>>>>> for experimental purposes. Today the ARM architecture is taking on some
>>>>> server-side processing as well, so there is, or soon will be, a real
>>>>> need for Hadoop to support the ARM architecture.
>>>>> 
>>>>> A number of users are trying to build Hadoop on ARM and to add ARM CI
>>>>> to Hadoop, and are facing issues [1]. Also some
>>>>>
>>>>> As of today, Hadoop does not compile on ARM because of the issues below,
>>>>> found in testing done in Openlab [2].
>>>>> 
>>>>> 1. Protobuf :
>>>>> -------------------
>>>>> The Hadoop project (and some downstream projects) is stuck on protobuf
>>>>> 2.5.0 for backward compatibility reasons. Protobuf 2.5.0 is no longer
>>>>> maintained by the community, while protobuf 3.x is being widely and
>>>>> actively adopted. Protobuf 3.x still provides wire compatibility for
>>>>> proto2 messages, but there are some compilation issues in the generated
>>>>> Java code, which can cause problems downstream. For this reason the
>>>>> protobuf upgrade from 2.5.0 was not taken up.
>>>>> From 3.0.0 onwards, Hadoop supports shading of libraries to avoid
>>>>> classpath problems in downstream projects.
>>>>> There are patches available to fix compilation in Hadoop, but we need
>>>>> to find a way to upgrade protobuf to the latest version and still
>>>>> maintain the downstream classpath using the shading feature of the
>>>>> Hadoop build.
>>>>> 
>>>>> There is a Jira for the protobuf upgrade [3], created even before
>>>>> shading support was added to Hadoop. We now need to revisit that Jira
>>>>> and continue exploring the possibilities.
>>>>> 
>>>>> 2. leveldbjni:
>>>>> ---------------
>>>>> The current leveldbjni used in YARN does not support the ARM
>>>>> architecture. We need to check whether any future version supports ARM
>>>>> and whether Hadoop can upgrade to that version.
>>>>> 
>>>>> 
>>>>> 3. hadoop-yarn-csi's dependency 'protoc-gen-grpc-java:1.15.1'
>>>>> -------------------------
>>>>> 'protoc-gen-grpc-java:1.15.1' does not provide an ARM executable by
>>>>> default in the Maven repository. The workaround is to build it locally
>>>>> and keep it in the local Maven repository.
>>>>> We need to check whether any future version of 'protoc-gen-grpc-java'
>>>>> ships an ARM executable and whether hadoop-yarn-csi can upgrade to it.
>>>>> 
>>>>> 
>>>>> Once the compilation issues are solved, there may be many native-code
>>>>> issues caused by the different architecture.
>>>>> So to explore everything, we need to join hands and proceed.
>>>>> 
>>>>> 
>>>>> Let us discuss and check whether anybody else out there also needs
>>>>> Hadoop support on ARM architectures and is ready to lend their hands
>>>>> and time to this work.
>>>>> 
>>>>> 
>>>>> [1] https://issues.apache.org/jira/browse/HADOOP-16358
>>>>> [2] https://issues.apache.org/jira/browse/HADOOP-16358?focusedCommentId=16904887&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16904887
>>>>> [3] https://issues.apache.org/jira/browse/HADOOP-13363
>>>>> 
>>>>> -Vinay
>> 
