hive-dev mailing list archives

From Alan Gates <>
Subject Re: [DISCUSS] HCatalog becoming a subproject of Hive
Date Mon, 03 Dec 2012 23:22:16 GMT
I am not sure where we are on this discussion.  So far those who have chimed in seemed generally
positive (Namit, Edward, Clark, Alexander).  Namit and I have different visions for what the
committership might look like, so I'd like to hear from other Hive PMC members what their
view is on this.  I have to say from an HCatalog perspective the proposition is much less
attractive without some commit rights.

On a related note, people should be aware of these threads in the Incubator list:

For those not inclined to read all the mails in the threads I will summarize (though I urge
all PMC members of Hive and PPMC members of HCat to read both mail threads because this is
highly relevant to what we are discussing).  There are two salient points in these threads:

1) It is not wise to build a subproject that is distinct from the main project in the sense
that it has separate community members interested in it.  Bertrand, Arun, Chris Mattmann, and
Greg Stein all spoke against this, and all are long time Apache contributors with a lot of
experience.  They were all of the opinion that it was reasonable for one project to release
separate products.

2) It is not wise to have committers who have access to some parts of a project but not others.
Greg and Bertrand argued (and Arun seemed to imply) that splitting up committer lists by
sections of the code did not work out well.

These insights cause me to question what we mean by subproject.  I had originally envisioned
something that looked like Pig and Hive did when they were subprojects of Hadoop.  But this
violates both 1 and 2 above.  Given this input from many of the "wise old timers" of Apache
I think we should consider what we mean when we say subproject and how tightly we are willing
to integrate these projects.  Personally I think it makes sense to continue to pursue integration,
as I think HCat is really a set of interfaces on top of Hive and it makes sense to coalesce
those into one project.  I guess this would mean HCat becomes just another set of jars that
Hive releases when it releases, rather than a standalone entity.  But I'm curious to hear
what others think.  


On Nov 14, 2012, at 10:22 PM, Namit Jain wrote:

> The same criteria should be applied to all Hive committers. Only a
> committer should be able to commit code.
> I don't think we should bend this rule. Metastore is not a separate
> project, but an integral part of hive.
> -namit
> On 11/12/12 10:32 PM, "Alan Gates" <> wrote:
>> I would suggest looking over the patch history of HCat committers.  I
>> think most of them have already contributed a number of patches to the
>> metastore.  All are certainly aware of how to run Hive unit tests and
>> have an understanding of how Hive works.  So I don't think it's fair to
>> say they would be unsafe with access to the metastore.  And the Hive PMC
>> is there to assure this does not happen.  If there are issues I am sure
>> they can deal with them.
>> Alan.
>> On Nov 6, 2012, at 8:06 PM, Namit Jain wrote:
>>> Alan, that would not be a good idea. Metastore code is part of hive
>>> code,
>>> and it
>>> would be safer if only Hive committers had commit access to that.
>>> On 11/6/12 11:25 PM, "Alan Gates" <> wrote:
>>>> On Nov 4, 2012, at 8:35 PM, Namit Jain wrote:
>>>>> I like the idea of Hcatalog becoming a Hive sub-project. The
>>>>> enhancements/bugs in the serde/metastore areas can indirectly
>>>>> benefit the hive community, and it will be easier for the fix to be in
>>>>> one
>>>>> place. Having said that, I don't see serde/metastore
>>>>> moving out of hive into a separate component. Things are tied too
>>>>> closely
>>>>> together. I am assuming that no new committers would
>>>>> be automatically added to Hive as part of this, and both Hive and
>>>>> HCatalog
>>>>> will continue to have their own committers.
>>>> One thing in this we'd like to discuss is the HCatalog committers
>>>> having
>>>> commit access to the metastore sections of Hive code.  That doesn't
>>>> mean
>>>> it has to move into HCatalog's code base.  But more and more the fixes
>>>> and changes we're doing in HCatalog are really in Hive's metastore.  So
>>>> we believe it would make sense to give HCat committers access to that
>>>> component as well as HCat.
>>>> Alan.
>>>>> Thanks,
>>>>> -namit
>>>>> On 11/3/12 2:22 AM, "Alan Gates" <> wrote:
>>>>>> Hello Hive community.  It is time for HCatalog to graduate from the
>>>>>> Apache Incubator.  Given the heavy dependence of HCatalog on Hive, the
>>>>>> HCatalog community agreed it made sense to explore graduating from
>>>>>> the
>>>>>> Incubator to become a subproject of Hive (see
>>>>>> 9.
>>>>>> mb
>>>>>> ox/ and
>>>>>> 0.
>>>>>> mb
>>>>>> ox/%3CCABN7xTCRM5wXGgJKEko0PmqDXhuAYpK%2BD-H57T29zcSGhkwGQw%40mail.gma
>>>>>> il
>>>>>> .c
>>>>>> om%3E ).  To help both communities understand what HCatalog is and
>>>>>> hopes
>>>>>> to become we also developed a roadmap that summarizes HCatalog's
>>>>>> current
>>>>>> features, planned features, and other possible features under
>>>>>> discussion:
>>>>>> So we are now approaching you to see if there is agreement in the
>>>>>> Hive
>>>>>> community that HCatalog graduating into Hive would make sense.
>>>>>> Alan.
