incubator-hcatalog-user mailing list archives

From Travis Crawford <>
Subject Re: [DISCUSS] Graduating from the Incubator
Date Mon, 01 Oct 2012 18:54:49 GMT
Hey all -

Thanks for getting this discussion started, Alan. I agree there's been
great progress towards graduation and it's time to discuss next steps.

The biggest unknown is Hive team thoughts about making the MetaStore
and SerDe components reusable for other processing frameworks. This
will really help guide the conversation, and I think we need to
include the Hive folks early in this discussion.

From the user perspective, HCatalog makes a ton of sense because users
can write jobs in Hive/Pig/MR/etc. across all their data, and it's all
in the same metadata service.

If the Hive folks are on board with this vision, I agree becoming a
Hive subproject makes sense, and we can help make it happen. Future
work might include stabilizing the serde/metastore public interfaces,
trimming dependencies in those components, and potentially refactoring
the "ql" package to use an hcatalog-hive-adapter like the other
processing frameworks do.

Any thoughts on asking the Hive team for their views before a graduation
vote? We definitely want their input before potentially voting to
become a subproject.


On Sat, Sep 29, 2012 at 12:21 PM, Alan Gates <> wrote:
> Every 3 months HCatalog reports its progress to the Apache Incubator PMC.  Reviewing
both of the last two reports, members of the IPMC have commented that HCatalog appears to
be ready for graduation.  So, I want to start the conversation on graduating HCatalog.  I
agree with the IPMC reviewers that we are ready to graduate.  We have active committers from
three separate organizations (Hortonworks, Twitter, Yahoo).  We have made two releases demonstrating
our ability to push out software.  We have 3 ASF members, so we should be suitably versed
in the Apache way.
> The big question for HCatalog is graduate to where.  Since HCatalog is tightly integrated
with Hive, one path that makes sense is for HCatalog to become a Hive subproject.  When HCatalog
first started the Hive community expressed interest in making it a subproject of Hive (see
below for more on why that did not happen).  I do not believe the Hive community is going
to move its metastore and serde code into a separate HCatalog project and depend on it in
the near future.  Given that, I think it makes sense for us to explore becoming a part of Hive.
 The other option is for HCatalog to become a top level project (TLP) that continues to depend
heavily on Hive.
> More information for those with patience and time to read long emails.  I have had some
side conversations with others in the community about graduating already, and they asked a
few questions I thought it would be good to include the answers to here.  I also wanted to
go over the general graduation process so people know what's ahead.  Finally I'll give a little
history of the project.
> Questions:
> Q. As a Hive subproject do we have to move into their source tree,
> coordinate releases, etc? Basically – how would this affect the source
> tree?
> A. We would move into their source tree, but above trunk and branches, so that we would
still have our own tree.  Subprojects release separately from the main project.  Their releases
are approved by the PMC of the project.  I do not believe we would need to change our package
structure (i.e. we could still be org.apache.hcatalog; at least pig never moved away from
> Q. As a Hive subproject would committership be affected? Would
> Hive/HCat committers be merged so everyone can commit in both places?
> Or stay same as today?
> A. Subprojects have their own committer lists.  Generally these committers are not granted
karma in the main projects.  Main project committers likewise are not granted karma in subprojects.
 PMC members can commit anywhere.  I would propose that all active HCatalog committers would
become subproject committers, though we'd need to formally vote on that as part of our graduation.
 We would also want to negotiate with the Hive team to get some rights to commit in the metastore
and serde sections of the Hive code since we do a lot of work there.  Who becomes part of
the Hive PMC would need to be negotiated with the Hive team.
> Q. If HCatalog becomes a subproject of Hive could it still become its own TLP someday?
> A. Yes, subprojects can be spun out to be TLPs.  Pig went from the Incubator to being
a subproject of Hadoop to being a TLP.  Hive started as a subproject of Hadoop and went to
TLP.  We could also choose at some point in the future to absorb HCatalog fully into Hive
so it would be part of the main project.
> Mechanics:
> For a complete discussion of this see
 What follows is a summary with points pertinent to us.
> Step one is the discussion which this email starts.
> Step two will be a vote to determine if we want to graduate and where to.  We are going
to be pushed by the Incubator to graduate, so votes against graduation would need to be accompanied
by strong evidence and reasoning that we are not ready.  This vote will be open to all, but
only the votes of the PPMC (podling project management committee) and mentors will be binding.
 You can find a list of PPMC members and mentors at
> If the result of step two is that we vote to graduate to TLP then the next step will
be to draft a graduation resolution and submit it to the Incubator PMC.  Assuming they approve
it that resolution would then be presented to the board which would vote on whether to instantiate
us as a TLP.
> If the result of step two is that we vote to graduate into a subproject of Hive we will
need to begin discussions with the Hive PMC on whether they are interested in having us as
a subproject.  This would include determining a committership and PMC membership arrangement
that was acceptable to the Hive and HCatalog communities.  The HCatalog PPMC, the Hive PMC,
and the Incubator PMC would then all need to vote to approve the graduation.
> History:
> When we were first starting HCatalog some in the Hive community felt strongly that it
should start as a subproject of Hive.  Some in the Pig community were uncomfortable with this.
 As it turned out, the point was moot as Apache requires any code that comes from one or more
organizations to begin in the Incubator.  As HCat was a mixing of Hive code and Yahoo code
we interpreted that to mean it needed to be in the Incubator.  But we agreed to leave open
the question of whether HCat would be a TLP or a Hive subproject and revisit it at graduation.
> Alan.
