hive-dev mailing list archives

From Travis Crawford <traviscrawf...@gmail.com>
Subject Re: [DISCUSS] HCatalog becoming a subproject of Hive
Date Fri, 14 Dec 2012 01:31:47 GMT
Thanks for reviving this thread. Reviewing the comments, everyone seems
to agree HCatalog makes sense as a Hive subproject. I think that's
great news for the Hadoop community.

The discussion seems to have turned to one of committer permissions. I
agree with the Hive folks' sentiment that it's something that must be
earned. That said, I've found it challenging at times getting patches
into Hive that would help earn taking on a hive committer
responsibility.

Proposal: if a couple hive committers can volunteer to be hcat
shepherds, we can work with the shepherds when making hive changes in
a timely manner. Conversely, we can help shepherd any hive committers
who are interested in working more with hcat. There are certainly
benefits to cross-committership, and this approach could help both
groups build a history of meaningful contributions and earn the
privilege & responsibility of being committers.

Thoughts?

--travis



On Thu, Dec 13, 2012 at 11:59 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> I was initially hesitant about hcatalog, mostly because I imagined we would
> end up in a spot very similar to this.
>
> Namely, the hcatalog folks are interested in making a metastore to support
> pig, hive, and map reduce. However I get the impression that many in hive
> do not care much to have a metastore that caters to everyone. Their needs
> are only based on what hive needs. Which I believe is the wrong way to look
> at this situation.
>
> I thought to reply to this thread because I have been following this Jira:
> https://issues.apache.org/jira/browse/HIVE-3752
>
> On a high level I do not like this duplication of effort and code. If hive
> is compatible with hcatalog I do not see why we put off merging the two at
> all. Hive users would get an immediate benefit if Hive used hcatalog with
> no apparent downside. Meanwhile we are putting this off and staying in this
> awkward transition phase.
>
> Personally, I do not have a problem being a hive committer and not having
> hcatalog commit. None of the hive work I have done has ever touched the
> metastore. Also, of the thousands of jiras and features we have added, only a
> small portion require metastore changes.
>
> As long as a couple active users have commit on hive and the suggested
> hcatalog subproject I do not think not having commit will be a roadblock in
> moving hive forward.
>
>
> On Mon, Dec 3, 2012 at 6:22 PM, Alan Gates <gates@hortonworks.com> wrote:
>
>> I am not sure where we are on this discussion.  So far those who have
>> chimed in seemed generally positive (Namit, Edward, Clark, Alexander).
>>  Namit and I have different visions for what the committership might look
>> like, so I'd like to hear from other Hive PMC members what their view is on
>> this.  I have to say from an HCatalog perspective the proposition is much
>> less attractive without some commit rights.
>>
>> On a related note, people should be aware of these threads in the
>> Incubator list:
>>
>> http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAGU5spdWHNtJxgQ8f%3DnPEXx9xNLjyjOYaFfnSw4EyAjgm1c46w%40mail.gmail.com%3E
>>
>>
>> http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAKQbXgDZj_zMj4qSodXjMHV7xQZxpcY1-35cvq959YKLNd6tJQ%40mail.gmail.com%3E
>>
>> For those not inclined to read all the mails in the threads I will
>> summarize (though I urge all PMC members of Hive and PPMC members of HCat
>> to read both mail threads because this is highly relevant to what we are
>> discussing).  There are two salient points in these threads:
>>
>> 1) It is not wise to build a subproject that is distinct from the main
>> project in the sense that it has separate community members interested in
>> it.  Bertrand, Arun, Chris Mattman, and Greg Stein all spoke against this,
>> and all are long time Apache contributors with a lot of experience.  They
>> were all of the opinion that it was reasonable for one project to release
>> separate products.
>>
>> 2) It is not wise to have committers that have access to parts of a
>> project but not others.  Greg and Bertrand argued (and Arun seemed to
>> imply) that splitting up committer lists by sections of the code did not
>> work out well.
>>
>> These insights cause me to question what we mean by subproject.  I had
>> originally envisioned something that looked like Pig and Hive did when they
>> were subprojects of Hadoop.  But this violates both 1 and 2 above.  Given
>> this input from many of the "wise old timers" of Apache I think we should
>> consider what we mean when we say subproject and how tightly we are willing
>> to integrate these projects.  Personally I think it makes sense to continue
>> to pursue integration, as I think HCat is really a set of interfaces on top
>> of Hive and it makes sense to coalesce those into one project.  I guess
>> this would mean HCat becomes just another set of jars that Hive releases
>> when it releases, rather than a stand alone entity.  But I'm curious to
>> hear what others think.
>>
>> Alan.
>>
>> On Nov 14, 2012, at 10:22 PM, Namit Jain wrote:
>>
>> > The same criteria should be applied to all Hive committers. Only a
>> > committer should be able to commit code.
>> > I don't think we should bend this rule. Metastore is not a separate
>> > project, but an integral part of hive.
>> >
>> > -namit
>> >
>> >
>> > On 11/12/12 10:32 PM, "Alan Gates" <gates@hortonworks.com> wrote:
>> >
>> >> I would suggest looking over the patch history of HCat committers.  I
>> >> think most of them have already contributed a number of patches to the
>> >> metastore.  All are certainly aware of how to run Hive unit tests and
>> >> have an understanding of how Hive works.  So I don't think it's fair to
>> >> say they would be unsafe with access to the metastore.  And the Hive PMC
>> >> is there to assure this does not happen.  If there are issues I am sure
>> >> they can deal with them.
>> >>
>> >> Alan.
>> >>
>> >>
>> >> On Nov 6, 2012, at 8:06 PM, Namit Jain wrote:
>> >>
>> >>> Alan, that would not be a good idea. Metastore code is part of hive
>> >>> code,
>> >>> and it
>> >>> would be safer if only Hive committers had commit access to that.
>> >>>
>> >>>
>> >>> On 11/6/12 11:25 PM, "Alan Gates" <gates@hortonworks.com> wrote:
>> >>>
>> >>>>
>> >>>> On Nov 4, 2012, at 8:35 PM, Namit Jain wrote:
>> >>>>
>> >>>>> I like the idea of Hcatalog becoming a Hive sub-project. The
>> >>>>> enhancements/bugs in the serde/metastore areas can indirectly
>> >>>>> benefit the hive community, and it will be easier for the fix to be
>> >>>>> in one place. Having said that, I don't see serde/metastore
>> >>>>> moving out of hive into a separate component. Things are tied too
>> >>>>> closely together. I am assuming that no new committers would
>> >>>>> be automatically added to Hive as part of this, and both Hive and
>> >>>>> HCatalog will continue to have their own committers.
>> >>>>
>> >>>> One thing in this we'd like to discuss is the HCatalog committers
>> >>>> having commit access to the metastore sections of Hive code.  That
>> >>>> doesn't mean it has to move into HCatalog's code base.  But more and
>> >>>> more the fixes and changes we're doing in HCatalog are really in
>> >>>> Hive's metastore.  So we believe it would make sense to give HCat
>> >>>> committers access to that component as well as HCat.
>> >>>>
>> >>>> Alan.
>> >>>>
>> >>>>>
>> >>>>> Thanks,
>> >>>>> -namit
>> >>>>>
>> >>>>>
>> >>>>> On 11/3/12 2:22 AM, "Alan Gates" <gates@hortonworks.com> wrote:
>> >>>>>
>> >>>>>> Hello Hive community.  It is time for HCatalog to graduate from the
>> >>>>>> Apache Incubator.  Given the heavy dependence of HCatalog on Hive the
>> >>>>>> HCatalog community agreed it made sense to explore graduating from
>> >>>>>> the Incubator to become a subproject of Hive (see
>> >>>>>> http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201209.mbox/%3C08C40723-8D4D-48EB-942B-8EE4327DD84A%40hortonworks.com%3E
>> >>>>>> and
>> >>>>>> http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201210.mbox/%3CCABN7xTCRM5wXGgJKEko0PmqDXhuAYpK%2BD-H57T29zcSGhkwGQw%40mail.gmail.com%3E
>> >>>>>> ).  To help both communities understand what HCatalog is and hopes
>> >>>>>> to become we also developed a roadmap that summarizes HCatalog's
>> >>>>>> current features, planned features, and other possible features
>> >>>>>> under discussion:
>> >>>>>> https://cwiki.apache.org/confluence/display/HCATALOG/HCatalog+Roadmap
>> >>>>>>
>> >>>>>> So we are now approaching you to see if there is agreement in the
>> >>>>>> Hive community that HCatalog graduating into Hive would make sense.
>> >>>>>>
>> >>>>>> Alan.
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>>
