hawq-dev mailing list archives

From Zhanwei Wang <wan...@apache.org>
Subject Re: libhdfs3 development is still going on outside of ASF
Date Thu, 15 Sep 2016 06:38:10 GMT
Hi Roman

I think I have said enough about the benefits and drawbacks of merging two independent projects
together.
Let me propose an approach and see whether it can make both the ASF and libhdfs3’s users happy. I need
your advice.


Is it possible to have two git repositories in the ASF for the HAWQ incubator project? If it is possible,
I propose to solve the libhdfs3 issue like this.

1) Create a new git repository in the ASF and push all of libhdfs3’s code and branches from GitHub
to the ASF.
2) Make libhdfs3’s GitHub repository a read-only mirror of the ASF repository. We may need to
transfer ownership of the GitHub repository from Pivotal to the ASF on GitHub.
3) HAWQ keeps only the stable version of the libhdfs3 code, or just a Git reference to it.


In this way, we keep libhdfs3 independent and keep all of its pull requests, wiki, issues and
history. Most importantly, libhdfs3 can follow ASF rules and processes. People can file pull
requests on GitHub; changes are committed to the ASF repository and eventually mirrored to GitHub.

 
Any comments?


Best Regards

Zhanwei Wang
wangzw@apache.org



> On Sep 15, 2016, at 2:19 PM, Zhanwei Wang <wangzw@apache.org> wrote:
> 
>> Open source is about community first.
> 
> Good point Kyle. I strongly agree with you!
> 
> But unfortunately it seems no one in this thread cares about libhdfs3’s community (users)
> except me. The frustration of libhdfs3’s users is being actively ignored, and its
> repository is about to be deleted.
> 
> 
> So let’s set the tone of this thread.
> 
> If we remove libhdfs3’s repository or make it read-only:
>  a. What benefits do we get for BOTH HAWQ and libhdfs3’s users?
>  b. What are the drawbacks for BOTH HAWQ and libhdfs3’s users?
> 
> 
> 
> The following is my answer.
> 
> a. Benefit: For HAWQ, it seems the ASF governs its property under ASF rules.  For libhdfs3’s
> users, none.
> 
> b. Drawback: For HAWQ, irrelevant commits will come into HAWQ’s commit log. JIRA issues
> and pull requests will be filed in HAWQ but will not be related to HAWQ.  Furthermore, a commit in libhdfs3
> may break HAWQ, and that is hard to debug; I have experienced it often enough. It is important to
> use the stable version of libhdfs3; the HAWQ code should keep only the stable version of libhdfs3.
> 
>    For libhdfs3’s users, they have to ask questions in HAWQ’s community. They have
> to clone all of HAWQ to build libhdfs3 and contribute.
> 
> Let’s think about this more. How do we schedule a release of libhdfs3 while HAWQ is under development?
> Should we branch HAWQ for libhdfs3’s release? Should we merge libhdfs3’s pull requests
> while we are releasing HAWQ? Do we have to sync the release processes of HAWQ and libhdfs3, and
> how?
> 
> Maybe we should involve libhdfs3’s users in this thread. But unfortunately
> they are not on HAWQ’s mailing list. See, this is another big issue. Discussing dropping libhdfs3’s
> repository on HAWQ’s mailing list without libhdfs3’s users involved seems odd. Imagine this:
> one day the repository you are working with is gone and you did not even know about this discussion.
> 
> If anyone wants to discuss whether we should drop libhdfs3’s repository, the better place
> is libhdfs3’s repository.
> 
> In general, merging two independent projects together introduces more trouble than benefit.
> 
> To be clear, I’m not against ASF rules. I deeply understand their importance.
> Is there any way to keep HAWQ and libhdfs3 separate and make both the ASF and libhdfs3’s users
> happy? Just as Kyle said, “HOW” is more important.
> 
> @Roman, your mentoring is important.
> 
> 
> Any comments?
> 
> 
> Best Regards
> 
> Zhanwei Wang
> wangzw@apache.org
> 
> 
> 
>> On Sep 15, 2016, at 12:54 PM, Kyle Dunn <kdunn@pivotal.io> wrote:
>> 
>> Chiming in here only as a casual but concerned observer.
>> 
>> Open source is about community first. If the logistics around "where"
>> libhdfs3 lives rather than the much more important issue of "how" it lives
>> are the focus here, I think we've missed the real issue.
>> 
>> For what it's worth, I concur with others, let's move it to HAWQ
>> exclusively and move on to addressing the community, starting with the
>> decision being made and how/where future contributions can be made.
>> 
>> My brief scan of libhdfs3 shows numerous open pull requests (with
>> apparently useful contributions) and several loose ends "issues". We need
>> to communicate effectively to these contributors whether those PRs and
>> issues are valuable and relevant. This type of engagement is what OSS
>> projects live and die by. We need to be better, starting with libhdfs3,
>> into HAWQ, and beyond.
>> 
>> "Open source isn't someone else's job" - it's everyone's job. I'm
>> challenging everyone with commit responsibility on repos to value community
>> input (both code and issues) as highly as your own backlog. Pay it forward
>> and maybe the community will start shrinking your backlog unexpectedly.
>> 
>> 
>> -Kyle
>> 
>> On Wed, Sep 14, 2016, 21:33 Lei Chang <chang.lei.cn@gmail.com> wrote:
>> 
>>> 
>>> There was a short discussion before, when we moved libhdfs3 to the HAWQ repo.
>>> 
>>> http://mail-archives.apache.org/mod_mbox/incubator-hawq-dev/201602.mbox/%3cCAE44UQe1xgcVOC76T_mgVbgGbR=Lx=XUBPVw18ZK4iZ3euCH+g@mail.gmail.com%3e
>>> I think it makes sense to keep libhdfs3 only in HAWQ repo to simplify
>>> Apache build and releases in current phase. This is what we have done in
>>> the past. But it looks like not everyone is on the same page.
>>> Cheers
>>> Lei
>>> 
>>> On Thu, Sep 15, 2016 at 11:12 AM +0800, "Greg Chase" <greg@gregchase.com>
>>> wrote:
>>> 
>>> 
>>> It's fine if libhdfs3 is under a third-party license, and is treated that way.
>>> 
>>> However, why does Apache HAWQ want to be dependent on some strange 3rd
>>> party library with no transparency?
>>> 
>>> We are having enough difficulties just getting our first release out.
>>> 
>>> Is there a compelling reason why we need to keep up with the independently
>>> developed libhdfs3 project?  Are they willing to make necessary changes so
>>> that they are compatible with ASF's strict-for-a-good-reason policies?
>>> 
>>> Can we fork hdfs3 for Apache HAWQ's purposes in Apache?
>>> 
>>> If any libhdfs3 committers are also part of Apache HAWQ, perhaps you can
>>> shed some light on the viability of this as an independent project since I
>>> only see 4 contributors.
>>> 
>>> -Greg
>>> 
>>> On Wed, Sep 14, 2016 at 7:54 PM, Hong Wu  wrote:
>>> 
>>>> In my opinion, it is reasonable to transfer the third-party repo of
>>>> libhdfs3 entirely into HAWQ, not only for the convenience of the HAWQ build, but
>>>> also for the sake of the ASF project. So for the HAWQ project, I am with
>>>> Roman.
>>>> 
>>>> But my concern is the current users of libhdfs3 and all the pull requests,
>>>> wiki docs and issues. Another uncertain aspect from my perspective is that
>>>> although HAWQ could not run without libhdfs3, libhdfs3 could be used in
>>>> other open source projects; that might have been the real point of making
>>>> libhdfs3 open source in the first place.
>>>> 
>>>> In summary, if it is really against the spirit of an ASF project for HAWQ, a
>>>> suggested way might be marking the original libhdfs3 repo as a legacy repo
>>>> instead of removing it.
>>>> 
>>>> Best
>>>> Hong
>>>> 
>>>> 2016-09-15 10:04 GMT+08:00 Zhanwei Wang :
>>>> 
>>>>> Currently libhdfs3’s official code is not the same as the copy in HAWQ. Some new
>>>>> code has not been copied into HAWQ.  I do not think code changes to libhdfs3
>>>>> should follow HAWQ’s commit process, because many changes are not related to
>>>>> HAWQ.
>>>>> 
>>>>> From the HAWQ side, I suggest keeping the stable version of its third-party
>>>>> libraries and copying new libhdfs3 code only when it is necessary.
>>>>> 
>>>>> libhdfs3 was open-sourced years before HAWQ entered incubation, with its own
>>>>> separate authorization. So in my opinion it is a third party, and it
>>>>> actually was a third party before HAWQ entered incubation. And HAWQ is not the only
>>>>> user.
>>>>> 
>>>>> 
>>>>> 
>>>>> Best Regards
>>>>> 
>>>>> Zhanwei Wang
>>>>> wangzw@apache.org
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Sep 15, 2016, at 9:35 AM, Roman Shaposhnik  wrote:
>>>>>> 
>>>>>> On Wed, Sep 14, 2016 at 6:29 PM, Zhanwei Wang wrote:
>>>>>>> Hi Roman
>>>>>>> 
>>>>>>> libhdfs3 works as a third-party library of HAWQ. Just for the convenience of the HAWQ release
>>>>>>> process, we copy its code into HAWQ.  The reason is that HAWQ used to depend on
>>>>>>> a specific version of libhdfs3, and libhdfs3 is only distributed as source code and the build process is complicated.
>>>>>> 
>>>>>> I actually don't buy this argument. libhdfs3 is not an optional
>>>>>> dependency for HAWQ like ORCA is (for example). Without libhdfs3 it's pretty tough to
>>>>>> imagine HAWQ. As such the code base needs to be governed as part of the ASF project,
>>>>>> not a random GitHub dependency.
>>>>>> 
>>>>>> IOW, let me ask you this: were all the changes that went into the libhdfs3
>>>>>> that is part of HAWQ discussed and reviewed via the ASF development process, or did you just
>>>>>> import them from time to time, as this comment suggests:
>>>>>>  https://issues.apache.org/jira/browse/HAWQ-1046?focusedCommentId=15489669&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15489669
>>>>>> ?
>>>>>> 
>>>>>>> I do not think we have any reason to shut down a third party’s official repository.
>>>>>> 
>>>>>> You say 3rd party as though it's not just you guys maintaining it on the side.
>>>>>> 
>>>>>>> We also copy the Google Test source code into HAWQ, just as we did for libhdfs3.
>>>>>> 
>>>>>> But this is very different. You don't do any development (certainly
>>>>>> you don't do any
>>>>>> non-trivial development) of that code.
>>>>>> 
>>>>>>> libhdfs3 is open source under the Apache License version 2, just the same as HAWQ. So I believe there is no license issue.
>>>>>> 
>>>>>> You're correct. There's no licensing issue but there's a pretty significant
>>>>>> governance issue.
>>>>>> 
>>>>>> Thanks,
>>>>>> Roman.
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>> *Kyle Dunn | Data Engineering | Pivotal*
>> Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
> 

