incubator-general mailing list archives

From Mohammad Nour El-Din <nour.moham...@gmail.com>
Subject Re: [VOTE] Accept Howl as an Incubator Project
Date Wed, 02 Mar 2011 00:05:43 GMT
1- If anyone on the initial committers list has not yet filed an ICLA,
you should start that process [1].
2- Mentors should begin the podling process steps [2].

[1] http://www.apache.org/dev/new-committers-guide.html
[2] http://incubator.apache.org/guides/mentor.html

On Mon, Feb 28, 2011 at 7:20 PM, Alan Gates <gates@yahoo-inc.com> wrote:
> With 9 binding +1 votes and no -1s, the vote passes.  I'll take the naming
> issue back to the Howl contributors to discuss what we want to do about it.
>
> What's the next step now?
>
> Alan.
>
> On Feb 22, 2011, at 4:20 PM, Alan Gates wrote:
>
>> I would like to call a vote on accepting Howl as an Incubator
>> project.  The proposal is available at
>> http://wiki.apache.org/incubator/HowlProposal.
>> You can see the discussion from the proposal thread at
>> http://tinyurl.com/5w7y9p9.
>>
>> Alan.
>>
>> ----------------------
>>
>> Abstract
>> Howl is a table and storage management service for data created using
>> Apache Hadoop.
>>
>>
>> Proposal
>> The vision of Howl is to provide table management and storage
>> management layers for Apache Hadoop. This includes:
>>
>>        • Providing a shared schema and data type mechanism.
>>        • Providing a table abstraction so that users need not be concerned
>> with where or how their data is stored.
>>        • Providing interoperability across data processing tools such as
>> Pig, Map Reduce, Streaming, and Hive.
>>
>> Background
>> Data processors using Apache Hadoop have a common need for table
>> management services. The goal of a table management service is to
>> track data that exists in a Hadoop grid and present that data to users
>> in a tabular format. Such a table management service needs to provide
>> a single input and output format to users so that individual users
>> need not be concerned with the storage formats that are chosen for
>> particular data sets. As part of having a single format, the data will
>> need to be described by one type of schema and have a single datatype
>> system.
>>
>> Additionally, users should be free to choose the best tools for their
>> use cases. The Hadoop project includes Map Reduce, Streaming, Pig, and
>> Hive, and additional tools exist such as Cascading. Each of these
>> tools has users who prefer it, and there are use cases best addressed
>> by each of these tools. Two users on the same grid who need to share
>> data should not be constrained to use the same tool but rather should
>> be free to choose the best tool for their use case. A table management
>> service that presents data in the same way to all of the tools can
>> alleviate this problem by providing interfaces to each of the data
>> processing tools.
>>
>> There are also a few other features a table management service should
>> provide, such as notification of when data arrives.
>>
>> A couple of developers at Yahoo! started the project. It is based on
>> the Hive MetaStore component. There is a good amount of interest in such
>> a service from Yahoo!, Facebook, LinkedIn, and others. We
>> are therefore proposing to place Howl in the Apache incubator and to
>> build an open source community around it.
>>
>>
>> Rationale
>> There is a strong need for a table management service, especially for
>> large grids with petabytes of data, and where the data volume is
>> increasing by the day. Hadoop users need to find data to read and have
>> a place to store their data. Currently users must understand the
>> location of data to read, the storage format, compression techniques
>> used, etc. To write data they need to understand where on HDFS their
>> data belongs, the best compression format to use, how their data
>> should be serialized, etc.
>>
>> Most users do not want to be concerned with these issues. They want
>> these managed for them.
>>
>> Being an Apache open source project will greatly benefit Howl by
>> giving it access to the large community that currently uses Hadoop
>> and the other products built around Hadoop (like Pig, Hive, etc.).
>> Users of the Hadoop ecosystem can influence Howl’s roadmap and
>> contribute to it. Looking at it another way, we believe having Howl
>> as part of the Hadoop ecosystem will be a great benefit to the
>> current Hadoop/Pig/Hive community too.
>>
>>
>> Current Status
>>
>> Meritocracy
>> Our intent with this incubator proposal is to start building a diverse
>> developer community around Howl following the Apache meritocracy
>> model. We have wanted to make the project open source and encourage
>> contributors from multiple organizations from the start. We plan to
>> provide plenty of support to new developers and to quickly recruit
>> those who make solid contributions to committer status.
>>
>>
>> Community
>> Howl is currently being used by developers at Yahoo!, and LinkedIn
>> and Facebook have expressed interest. Yahoo! also
>> plans to deploy the current version of Howl in production soon. We
>> hope to extend the user and developer base further in the future. The
>> current developers and users are all interested in building a solid
>> open source community around Howl.
>>
>> To work towards an open source community, we have started using the
>> GitHub issue tracker and mailing lists at Yahoo! for development
>> discussions within our group.
>>
>>
>> Core Developers
>> Howl is currently being developed by four engineers from Yahoo! -
>> Devaraj Das, Ashutosh Chauhan, Sushanth Sowmyan, and Mac Yang. All the
>> engineers have deep expertise in Hadoop and the Hadoop Ecosystem in
>> general.
>>
>>
>> Alignment
>> The ASF is a natural host for Howl given that it is already the home
>> of Hadoop, Pig, HBase, Cassandra, and other emerging cloud software
>> projects. Howl was designed to support Hadoop from the beginning in
>> order to solve data management challenges in Hadoop clusters. Howl
>> complements the existing Apache cloud computing projects by providing
>> a unified way to manage data.
>>
>>
>> Known Risks
>>
>> Orphaned Products
>> The core developers plan to work full time on the project. There is
>> very little risk of Howl getting orphaned since large companies like
>> Yahoo! are planning to deploy this in their production Hadoop
>> clusters. We believe we can build an active developer community around
>> Howl (companies like Facebook and LinkedIn have also expressed
>> interest).
>>
>>
>> Inexperience with Open Source
>> All of the core developers are active users and followers of open
>> source. Devaraj Das is an Apache Hadoop committer and Apache Hadoop
>> PMC member, and has experience with the Apache infrastructure and
>> development process. Ashutosh Chauhan is an Apache Pig committer and
>> Apache Pig PMC member. Sushanth Sowmyan and Mac Yang have contributed
>> to the Apache Hive and Apache Chukwa projects.
>>
>>
>> Homogeneous Developers
>> The current core developers are all from Yahoo! However, we hope to
>> establish a developer community that includes contributors from
>> several corporations, and we are starting to work towards this with
>> Facebook and LinkedIn.
>>
>>
>> Reliance on Salaried Developers
>> Currently, the developers are paid to do work on Howl. However, once
>> the project has a community built around it, we expect to get
>> committers and developers from outside the current core developers.
>> Companies like Yahoo! are invested in Howl being a solution to the
>> data management problem in Hadoop clusters, and that is not likely to
>> change.
>>
>>
>> Relationships with Other Apache Products
>> Howl is going to be used by users of Hadoop, Pig, and Hive. See
>> section Initial Source below for more information about Howl's
>> relationship to Hive.
>>
>>
>> An Excessive Fascination with the Apache Brand
>> While we respect the reputation of the Apache brand and have no doubts
>> that it will attract contributors and users, our interest is primarily
>> to give Howl a solid home as an open source project following an
>> established development model. We have also given reasons in the
>> Rationale and Alignment sections.
>>
>>
>> Documentation
>> Information about Howl can be found at
>> http://wiki.apache.org/pig/Howl. The following sources may be useful
>> to start with:
>>
>>        • The GitHub site: https://github.com/yahoo/howl
>>        • The roadmap: http://wiki.apache.org/pig/HowlJournal
>>
>>
>> Initial Source
>> Howl has been under development since Summer 2010 by a team of
>> engineers at Yahoo!. It is currently hosted on GitHub under an Apache
>> license at https://github.com/yahoo/howl.
>>
>> The initial development of Howl has consisted of:
>>
>>        • maintaining a branch of the entire Hive codebase
>>        • getting Howl-related patches committed to Hive
>>        • developing Howl-specific plugins and wrappers to customize Hive
>> behavior
>>
>> At runtime, Howl executes Hive code for metastore and CLI+DDL,
>> disabling anything related to Hadoop map/reduce execution. It also
>> makes use of the RCFile storage format contained in Hive.
>>
>> This approach was taken as a first step in order to validate the
>> required functionality and get a production version working. However,
>> in the long term, maintaining a clone of Hive is undesirable. One
>> possible resolution is to factor the metastore+CLI+DDL components out
>> of Hive and move them into Howl (making Hive dependent on Howl).
>> Another possible resolution is to remove the copy of Hive from Howl
>> and do the build/release engineering necessary to make Howl depend on
>> Hive. As part of the incubation process, we plan to work towards
>> resolution of these issues.
>>
>>
>> External Dependencies
>> The dependencies all have Apache-compatible licenses.
>>
>>
>> Cryptography
>> Not applicable.
>>
>>
>> Required Resources
>>
>> Mailing Lists
>>        • howl-private for private PMC discussions (with moderated
>> subscriptions)
>>        • howl-dev
>>        • howl-commits
>>        • howl-user
>>
>> Subversion Directory
>> https://svn.apache.org/repos/asf/incubator/howl
>>
>>
>> Issue Tracking
>> JIRA Howl (HOWL)
>>
>>
>> Other Resources
>> The existing code already has unit tests, so we would like a Hudson
>> instance to run them whenever a new patch is submitted. This can be
>> added after project creation.
>>
>>
>> Initial Committers
>>        • Devaraj Das
>>        • Ashutosh Chauhan
>>        • Sushanth Sowmyan
>>        • Mac Yang
>>        • Paul Yang
>>        • Alan Gates
>>
>> A CLA is already on file for Sushanth.
>>
>>
>> Affiliations
>>        • Devaraj Das (Yahoo!)
>>        • Ashutosh Chauhan (Yahoo!)
>>        • Sushanth Sowmyan (Yahoo!)
>>        • Mac Yang (Yahoo!)
>>        • Paul Yang (Facebook)
>>        • Alan Gates (Yahoo!)
>>
>> Sponsors
>>
>> Champion
>> Owen O’Malley
>>
>>
>> Nominated Mentors
>>        • Olga Natkovich (Pig PMC member and Apache VP for Pig)
>>        • Alan Gates (Pig PMC member)
>>        • John Sichi (Hive PMC member)
>>
>> Sponsoring Entity
>> We are requesting the Incubator to sponsor this project.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>
>



-- 
Thanks
- Mohammad Nour
  Author of (WebSphere Application Server Community Edition 2.0 User Guide)
  http://www.redbooks.ibm.com/abstracts/sg247585.html
- LinkedIn: http://www.linkedin.com/in/mnour
- Blog: http://tadabborat.blogspot.com
----
"Life is like riding a bicycle. To keep your balance you must keep moving"
- Albert Einstein

"Writing clean code is what you must do in order to call yourself a
professional. There is no reasonable excuse for doing anything less
than your best."
- Clean Code: A Handbook of Agile Software Craftsmanship

"Stay hungry, stay foolish."
- Steve Jobs


