impala-dev mailing list archives

From "Nishidha Panpaliya" <>
Subject Re: Contributions to Cloudera Impala
Date Fri, 19 Aug 2016 10:53:40 GMT

Hi Jim,

Thanks a lot for your response.

Please see my comments inline.

Adding Zhi Zhi, our colleague from China, to the thread.


From:	Sudarshan Jagadale/Austin/Contr/IBM
To:	Jim Apple <>
Cc:	"dev@impala" <>, Manish
            Patil/Austin/Contr/IBM@IBMUS, Silvius Rus <>,
            Valencia Serrao/Austin/Contr/IBM@IBMUS, Nishidha
Date:	08/18/2016 10:45 AM
Subject:	Re: Contributions to Cloudera Impala

Hi Jim,


Thank you for your inputs...

Thanks and Regards
Sudarshan Jagadale
Power Open Source Solutions

From:	Jim Apple <>
To:	"dev@impala" <>
Cc:	Silvius Rus <>, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, Manish
            Patil/Austin/Contr/IBM@IBMUS, Valencia
Date:	08/17/2016 09:43 PM
Subject:	Re: Contributions to Cloudera Impala

> I'm glad to tell you that we are able to build and test Impala on Ubuntu
> linux ppc64le with the great support from the Cloudera Community.


> Our next action is to upstream all our changes to Cloudera Impala.


Cloudera has donated Impala to the Apache Software Foundation (aka
"ASF"). Cloudera now contributes to the project, and the project is
managed by the Impala community.

> With
> this, our plan is to start building the latest Impala on Power8, as we had
> been porting quite an old version (code from cdh5-trunk branch till 23rd
> 2016). Since then, I know many changes have happened that
> are yet to be ported, especially the Kudu stuff.

Yes, there have been many changes. One is that Impala is now hosted on
ASF-owned git. Please see
[Nishidha] I've read a few pages from this Confluence. Indeed, very useful
to start with.

>    We know we need a signed CLA to start contributing. We have
>    initiated the process and hope to get it done soon.

I think the right thing to do here is use the Apache CLAs. See

[Nishidha] We'll start with this.

>    While we wait for the CLA to be signed, we would start porting the
>    changes from the last 5 months. So, I wanted to know which tag/branch
>    we should take up for this.

This is a question we could all discuss together, and it might end up
being a decision made by the Project Management Committee (PMC).

This is a big question about how Apache Impala will evolve. Our bylaws state:

"Significant, pervasive features may be developed in a speculative
branch of the repository. The PMC may grant commit rights on the
branch to its consistent contributors for the duration of the
initiative. Branch committers are responsible for shepherding their
feature into an active release and do not cast binding votes or vetoes
in the project."

So perhaps this should happen on a separate branch?

One question the community should also consider, IMHO, is whether the
community will have sufficient resources to maintain a working ppc64le
codebase indefinitely into the future.
[Nishidha] We found two new source code URLs, one mentioned in Confluence and another to
Commit-wise both look the same, though the former has "wip" in the URL.
Please suggest which URL should be forked and worked upon. We didn't know if
we could work directly on a separate branch on Apache's Impala. We thought of
forking it into our repo first and then working on it.

> Working on cdh5-trunk will put us into an unending loop of
>    porting, as it is being modified every day. We are thinking of creating
>    a branch from the cdh5.8.0-release tag and working on it. Please
>    advise us on the best way to do this.

Since Impala is now developed on Apache infrastructure, we have
switched branching schemas. Our main branch is now "master". We do not
have any release branches yet.
[Nishidha] Okay. So, after the CLA, can we work directly on Apache's Impala,
or will we need to fork it into our repository, create a new branch from
master, and then generate PRs/Gerrit code reviews from it?

>    Verifying all the changes on x86 platforms ourselves here will also be
>    time-consuming and add potential delays in upstreaming. So, we were
>    wondering if we could get a job on Cloudera's continuous integration
>    which would simply fetch our branch, verify it on all the supported
>    platforms, and do all the required checks. I'm not sure if this is
>    feasible, but it's just a thought. Any other suggestions to foster this
>    activity would be appreciated.

We are working on making a publicly-available CI setup, but we aren't done yet.

Do you have a CI setup and x86-64 machines that your CI workers can run on?
[Nishidha] Which CI do you have, or are you working on? We can set up Jenkins
here and can get x86-64 machines too. What is the expected timeline for your
CI to be publicly available?

>    For every pull request, which basic sanity tests are required to
>    pass? Do you test all BE, FE, End-to-End tests, Custom cluster

Patches are sent to gerrit for review. Before they are merged, all
tests must pass in "core" (but not "exhaustive") mode.
[Nishidha] Sure.

If I were in your shoes, I might take the following steps:

1. Start a discussion on dev@ about whether a new branch is the right
way to develop.

2. Work out long-term maintenance plans and commitments and CI plans.

3. Do the arduous work of rebasing on a recent HEAD.
[Nishidha] Thanks for this suggestion. Yes, definitely, I would like to
follow this.
1. Would you suggest that I start a new thread for the branch discussion,
although I created a gist in for the
2. For long-term maintenance plans, commitments and CI plans, I would start
a separate thread once we get clarity on above.
3. We would definitely need to rebase on a recent HEAD to submit changes
upstream. (This would again be challenging, as Kudu, the newly added
dependency, is even tougher to build on Power.) We have also started looking
at building our own native toolchain of Impala prerequisites for Power.
