incubator-general mailing list archives

From Alexei Fedotov <alexei.fedo...@gmail.com>
Subject Re: [PROPOSAL] Gora to enter Incubator
Date Tue, 14 Sep 2010 08:16:55 GMT
This is how a person like me perceives the project. ORM... Mmm... The
acronym resembles CRM. Aha, this is CRM for stores which sell these
colum things! :-)

P.S. I've got the point. Later.



On Tue, Sep 14, 2010 at 11:33 AM, Tommaso Teofili
<tommaso.teofili@gmail.com> wrote:
> +1 (not binding)
> Tommaso
>
> 2010/9/13 Mohammad Nour El-Din <nour.mohammad@gmail.com>
>
>> +1 (Not binding)
>>
>> On Mon, Sep 13, 2010 at 4:46 PM, Mattmann, Chris A (388J)
>> <chris.a.mattmann@jpl.nasa.gov> wrote:
>> > My +1 to this proposal, but we certainly need at least one more mentor.
>> > If you’re interested, please sign up.
>> >
>> > Thanks!
>> >
>> > Cheers,
>> > Chris
>> >
>> >
>> >
>> > On 9/13/10 6:10 AM, "Enis Soztutar" <enis.soz.nutch@gmail.com> wrote:
>> >
>> > Hi all,
>> >
>> > We would like to announce the Proposal for Gora, an ORM for Colum Stores,
>> > for the Apache Incubation. We believe that Gora can find a nice home at
>> > Apache.
>> >
>> > Wiki of the proposal can be found at
>> > http://wiki.apache.org/incubator/GoraProposal
>> >
>> > The proposal is as below.
>> >
>> >
>> > = Gora Proposal for Apache Incubation =
>> >
>> > == Abstract ==
>> > Gora is an ORM framework for column stores such as Apache HBase and
>> > Apache Cassandra, with a specific focus on Hadoop.
>> >
>> > == Proposal ==
>> > Although there are various excellent ORM frameworks for relational
>> > databases, data modeling in NoSQL data stores differs profoundly from
>> > that of their relational cousins. Moreover, data-model-agnostic
>> > frameworks such as JDO are not sufficient for use cases where one needs
>> > the full power of the data models in column stores. Gora fills this gap
>> > by giving the user an easy-to-use ORM framework with data-store-specific
>> > mappings and built-in Apache Hadoop support.
>> >
>> > The overall goal for Gora is to become the standard data representation
>> > and persistence framework for big data. The roadmap of Gora can be
>> > grouped as follows.
>> >
>> >  * Data Persistence : Persisting objects to column stores such as HBase,
>> > Cassandra and Hypertable; key-value stores such as Voldemort and Redis;
>> > SQL databases such as MySQL and HSQLDB; and flat files in the local file
>> > system or Hadoop HDFS.
>> >  * Data Access : An easy-to-use, Java-friendly common API for accessing
>> > the data regardless of its location.
>> >  * Indexing : Persisting objects to Lucene and Solr indexes, and
>> > accessing/querying the data with the Gora API.
>> >  * Analysis : Accessing and analyzing the data through adapters for
>> > Apache Pig, Apache Hive and Cascading.
>> >  * MapReduce support : Out-of-the-box and extensive MapReduce (Apache
>> > Hadoop) support for data in the data store.
>> >
>> > == Background ==
>> > ORM stands for Object-Relational Mapping. It is a technology that
>> > abstracts the persistence layer (mostly relational databases) so that
>> > plain domain-level objects can be used without the cumbersome effort of
>> > saving and loading the data to and from the database. Gora differs from
>> > current solutions in that:
>> >  * Gora is specifically focused on NoSQL data stores, but also has
>> > limited support for SQL databases.
>> >  * The main use case for Gora is to access/analyze big data using Hadoop.
>> >  * Gora uses Avro for bean definition, not bytecode enhancement or
>> > annotations.
>> >  * Object-to-data-store mappings are backend specific, so that the full
>> > data model can be utilized.
>> >  * Gora is simple, since it ignores complex SQL mappings.
>> >  * Gora will support persistence, indexing and analysis of data, using
>> > Pig, Lucene, Hive, etc.
>> >
>> > == Rationale ==
>> > ORM frameworks are nothing new. But with the explosion of data, now
>> > generated in terabytes and even petabytes, NoSQL data stores are gaining
>> > ever-increasing popularity. Coupled with the limited support for the
>> > already-proven Apache Hadoop in current ORM frameworks, this created a
>> > need for a new project.
>> >
>> > Gora is currently hosted at GitHub. However, Gora has ties to the ASF in
>> > many ways. As detailed in the Proposal section, Gora will be a high-level
>> > client for many Apache projects and subprojects, including Hadoop
>> > (Common, HDFS and MapReduce), HBase, Cassandra, Avro, Lucene, Solr, Pig
>> > and Hive. Gora already uses Hadoop, HBase, Cassandra and Avro. Moreover,
>> > Gora started its life inside the Apache Nutch project, and Nutch trunk
>> > now uses Gora as a library. Furthermore, the initial set of committers
>> > are all ASF members. Therefore, we think that Apache will be an
>> > excellent home for Gora.
>> >
>> > == Initial Goals ==
>> > Initial goals for Gora can be summarized as:
>> >  * Iron out the remaining issues with HBase, Cassandra and SQL support.
>> >  * Make the first release before the end of the year.
>> >  * Improve documentation
>> >  * Support for Cascading
>> >
>> > == Current Status ==
>> > === Meritocracy ===
>> > Current commit rights belong to the initial list of committers, four of
>> > whom are also ASF members. All the developers have extensive experience
>> > with Apache projects. We honor the meritocracy policy of the ASF.
>> >
>> > === Community ===
>> > Gora’s community mostly overlaps with those of Nutch, Hadoop, HBase,
>> > Avro and Cassandra. We have a small community for now (5 initial
>> > committers, 18 people tracking the project at GitHub), but have been
>> > piggybacking on the Nutch community for a while. If Gora is accepted
>> > into the Apache Incubator, we expect more traction. Moreover, with the
>> > increasing popularity of NoSQL databases, we expect more users.
>> >
>> > === Core Developers ===
>> > Gora grew out of an initial code base written inside Apache Nutch by
>> > Doğacan Güney. Enis Söztutar then refactored and re-architected the
>> > project out of Nutch. Later, Julien Nioche, Andrzej Bialecki and Doğacan
>> > ported Nutch to use the newly formed project, and Sertan Alkan joined
>> > after that. Doğacan and Julien are Nutch PMC members, Andrzej is the
>> > Nutch PMC chair, and Enis is an Apache Hadoop PMC member.
>> >
>> > === Alignment ===
>> > As discussed in the second paragraph of the Rationale section, all of
>> > the current developers are Apache people, and four of them are PMC
>> > members, which shows that we have some experience with the Apache way.
>> > Moreover, Gora is tightly related to many Apache projects: Nutch,
>> > Hadoop, HBase, Cassandra, Avro, Pig, Hive and Lucene, to name a few.
>> > Gora started its life inside Nutch, and Nutch trunk now uses Gora to
>> > persist web crawl data to HBase, Cassandra and MySQL, which makes Gora a
>> > critical component of Nutch.
>> >
>> > == Known Risks ==
>> > === Orphaned Products ===
>> > Most of the development depends on Enis and Doğacan for now. Both of
>> > them intend to continue Gora development. However, we also acknowledge
>> > that more core developers are needed for the project to be truly
>> > successful. The general strategy for acquiring more developers will be
>> > to acquire more users, and to encourage users to be active in the
>> > community and develop patches. Moreover, the next release of Nutch,
>> > planned before the end of 2010, has extensive Gora support. We expect
>> > more interest from the Nutch community, and we will continue to post
>> > Gora announcements to the Hadoop, HBase and Cassandra mailing lists.
>> >
>> > === Inexperience with Open Source ===
>> > We believe that all of the developers have extensive open source
>> > experience. Four of the initial committers are Apache members. The
>> > codebase has also been open source since April 2010. We also have some
>> > documentation, wiki pages, an issue tracker and a dev mailing list.
>> >
>> > === Homogeneous Developers ===
>> > We have a semi-distributed development environment in which Doğacan,
>> > Enis and Sertan share the same office, while Andrzej and Julien are
>> > independent. As we acquire more developers, we expect development to
>> > become more heterogeneous.
>> >
>> > === Reliance on Salaried Developers ===
>> > Gora development has been supported by the [[ant.com]] search engine as
>> > contract work. It is expected that this contract will continue in the
>> > future. However, even without sponsors, we are committed to continuing
>> > Gora development, since we believe in the technology it brings and in
>> > its vital role in Nutch and in our other closed-source projects.
>> >
>> > === Relationships with Other Apache Products ===
>> > Gora will be tightly related to lots of Apache projects:
>> >
>> >  * Nutch : Apache Nutch was home to Gora’s initial code base. Now, Nutch
>> > trunk uses Gora as a library. The next release of Nutch, planned before
>> > the end of 2010, will use Gora’s first release.
>> >  * Hadoop : Gora has extensive support for Hadoop MapReduce. Gora defines
>> > all the necessary data structures for working with Hadoop. Data stored
>> > in column-oriented data stores can be analyzed with Gora using Hadoop.
>> >  * Avro : Gora uses and extends Avro. Data beans in Gora are defined
>> > using Avro schemas and compiled into Java code with the extended version
>> > of the Avro compiler. Avro is also used for data serialization.
>> >  * HBase : Gora supports HBase as a persistence backend.
>> >  * Cassandra : Gora supports Cassandra as a persistence backend.
>> >  * Lucene/Solr : Gora intends to support Lucene/Solr as a persistence and
>> > indexing backend.
>> >  * Pig : Gora intends to support Pig for data analysis.
>> >  * Hive : Gora intends to support Hive for data analysis.
>> >
>> > === An Excessive Fascination with the Apache Brand ===
>> > Gora is a natural fit for Apache due to its current committers and
>> > dependent projects.
>> >
>> > == Documentation ==
>> >  * The project is currently hosted at http://github.com/enis/gora/.
>> >  * Wiki pages can be found at http://wiki.github.com/enis/gora/.
>> >  * List of issues can be found at  http://github.com/enis/gora/issues/.
>> >  * Current web address: http://groups.google.com/group/gora-dev.
>> >  * Current email address: gora-dev@googlegroups.com.
>> >
>> > == Initial Source ==
>> > The initial source was developed as a patch to the Apache Nutch project.
>> > But the storage abstraction layer was orthogonal to the web crawler, and
>> > we decided to extract it into a separate project with much wider goals.
>> > Thus Gora, as a project, was born. The initial code was developed by Enis
>> > and Doğacan with ant.com’s sponsorship.
>> >
>> > The code can be found at http://github.com/enis/gora/.
>> >
>> > == External Dependencies ==
>> > External dependencies excluding Apache projects are as follows
>> >  * JDOM - http://jdom.org/ -  Apache-style license
>> >  * SQL Builder - http://openhms.sourceforge.net/sqlbuilder/ - Artistic
>> > License, LGPL. SQL Builder is intended to be removed from the source
>> > for technical reasons anyway.
>> >  * HSQLDB - http://hsqldb.org/ - BSD-style license
>> >  * JUnit - http://junit.org - Common Public License 1.0
>> >  * SLF4J - http://www.slf4j.org/ - MIT License
>> >  * Google Guava Libraries - http://code.google.com/p/guava-libraries/ -
>> > Apache License 2.0
>> >
>> >
>> > == Required Resources ==
>> >
>> > === Mailing Lists ===
>> >
>> >  * gora-private (with moderated subscriptions)
>> >  * gora-dev
>> >  * gora-commits
>> >
>> > === Subversion Directory ===
>> >
>> >  * [[http://svn.apache.org/repos/asf/incubator/gora]]
>> >
>> > === Issue Tracking ===
>> >  * JIRA (GORA)
>> >
>> > === Other Resources ===
>> > We need a wiki at http://wiki.apache.org. Currently, we have a wiki at
>> > GitHub. Since there are not many pages there, we can manually move the
>> > pages to the wiki at wiki.apache.org.
>> >
>> > == Initial Committers ==
>> >
>> > Name              Email                        Affiliation    Timezone
>> > Enis Söztutar     enis [at] apache.org         Konneka        +3
>> > Doğacan Güney     dogacan [at] apache.org      Konneka        +3
>> > Sertan Alkan      sertanalkan [at] gmail.com   Konneka        +3
>> > Julien Nioche     jnioche [at] apache.org      DigitalPebble  +1
>> > Andrzej Bialecki  ab [at] apache.org           Sigram
>> >
>> >
>> > === Affiliations ===
>> > All of the parties are affiliated with open source consulting shops.
>> > Most of the development was sponsored by ant.com; however, we expect
>> > that the amount of volunteer work will increase and that more developers
>> > will come on board.
>> >
>> > == Sponsors ==
>> >
>> > === Champion ===
>> >  * Chris Mattmann (mattmann AT apache DOT org)
>> >
>> > === Nominated Mentors ===
>> >  * Chris Mattmann (mattmann AT apache DOT org)
>> >  * Andrzej Bialecki (ab AT apache DOT org )
>> >
>> > === Sponsoring Entity ===
>> > Apache Incubator. Successful graduation can result in Gora becoming
>> > either a TLP or a subproject of Hadoop, since most of the community is
>> > projected to overlap.
>> >
>> >
>> >
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > Chris Mattmann, Ph.D.
>> > Senior Computer Scientist
>> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> > Office: 171-266B, Mailstop: 171-246
>> > Email: Chris.Mattmann@jpl.nasa.gov
>> > WWW:   http://sunset.usc.edu/~mattmann/
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > Adjunct Assistant Professor, Computer Science Department
>> > University of Southern California, Los Angeles, CA 90089 USA
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >
>> >
>>
>>
>>
>> --
>> Thanks
>> - Mohammad Nour
>>   Author of (WebSphere Application Server Community Edition 2.0 User Guide)
>>   http://www.redbooks.ibm.com/abstracts/sg247585.html
>> - LinkedIn: http://www.linkedin.com/in/mnour
>> - Blog: http://tadabborat.blogspot.com
>> ----
>> "Life is like riding a bicycle. To keep your balance you must keep moving"
>> - Albert Einstein
>>
>> "Writing clean code is what you must do in order to call yourself a
>> professional. There is no reasonable excuse for doing anything less
>> than your best."
>> - Clean Code: A Handbook of Agile Software Craftsmanship
>>
>> "Stay hungry, stay foolish."
>> - Steve Jobs
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>> For additional commands, e-mail: general-help@incubator.apache.org
>>
>>
>
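
The quoted proposal says that data beans are defined with Avro schemas and that object-to-data-store mappings are backend specific. A minimal sketch of that idea follows, written in Python for brevity. Everything in it is hypothetical: the WebPage record, its fields, the HBASE_MAPPING layout and the to_cells helper are invented for illustration and are not Gora's API or its mapping format. Only the general shape of the schema (a record with named, typed fields and optional defaults) follows the Avro specification.

```python
import json

# An Avro record schema for a hypothetical bean. Only the general
# shape (type/name/namespace/fields) comes from the Avro specification;
# the record and field names are made up for this sketch.
WEBPAGE_SCHEMA = {
    "type": "record",
    "name": "WebPage",
    "namespace": "example.beans",
    "fields": [
        {"name": "url", "type": "string"},
        {"name": "content", "type": ["null", "bytes"], "default": None},
        {"name": "fetchTime", "type": "long", "default": 0},
    ],
}

# A hypothetical backend-specific mapping for a column store:
# each bean field is assigned a (column family, qualifier) pair.
HBASE_MAPPING = {
    "url": ("meta", "url"),
    "content": ("data", "raw"),
    "fetchTime": ("meta", "ts"),
}


def to_cells(bean, schema, mapping):
    """Translate a bean (a plain dict) into column-store cells.

    Returns a dict keyed by (family, qualifier). Fields missing from
    the bean fall back to the schema default; a field with neither a
    value nor a default raises KeyError.
    """
    cells = {}
    for field in schema["fields"]:
        name = field["name"]
        if name in bean:
            value = bean[name]
        elif "default" in field:
            value = field["default"]
        else:
            raise KeyError(f"missing required field: {name}")
        cells[mapping[name]] = value
    return cells


page = {"url": "http://example.org/", "fetchTime": 1284452215}
cells = to_cells(page, WEBPAGE_SCHEMA, HBASE_MAPPING)
print(json.dumps(WEBPAGE_SCHEMA, indent=2))
print(cells)
```

Under these assumptions, a different backend would reuse the same schema and supply only a different mapping (for example, SQL column names instead of family/qualifier pairs), which is the separation the proposal describes.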


