incubator-general mailing list archives

From Julien Vermillard <jvermill...@gmail.com>
Subject Re: [VOTE] Accept Blur into the Apache Incubator
Date Sun, 22 Jul 2012 19:45:34 GMT
+1

On Sun, Jul 22, 2012 at 4:40 PM, Sajeevan Achuthan
<achuthan.sajeevan@gmail.com> wrote:
> +1
>
> On 22 July 2012 14:40, Doug Cutting <cutting@gmail.com> wrote:
>
>> +1
>>
>> Doug
>> On Jul 20, 2012 9:43 AM, "Aaron McCurry" <amccurry@gmail.com> wrote:
>>
>> > I would like to call a vote for accepting Blur for incubation in the
>> > Apache Incubator. The full proposal is available below.
>> >
>> > Please cast your vote:
>> >
>> > [ ] +1, bring Blur into Incubator
>> > [ ] +0, I don't care either way
>> > [ ] -1, do not bring Blur into Incubator, because...
>> >
>> > This vote will be open for 72 hours and only votes from the Incubator
>> > PMC are binding.
>> >
>> > Thank you for your consideration!
>> >
>> > Aaron
>> >
>> > http://wiki.apache.org/incubator/BlurProposal
>> >
>> > = Blur Proposal =
>> >
>> > == Abstract ==
>> > Blur is a search platform capable of searching massive amounts of data
>> > in a cloud computing environment. Blur leverages several existing
>> > Apache projects, including Apache Lucene, Apache Hadoop, Apache
>> > !ZooKeeper and Apache Thrift.  Both bulk and near real time (NRT)
>> > updates are possible with Blur.  Bulk updates are accomplished using
>> > Hadoop Map/Reduce and NRT updates are performed through direct Thrift calls.
>> >
>> > == Proposal ==
>> > Blur is an open source search platform capable of querying massive
>> > amounts of data at incredible speeds. Rather than using the flat,
>> > document-like data model used by most search solutions, Blur allows
>> > you to build rich data models and search them in a semi-relational
>> > manner similar to joins while querying a relational database. Using
>> > Blur, you can get precise search results against terabytes of data at
>> > Google-like speeds.  Blur leverages multiple open source projects
>> > including Hadoop, Lucene, Thrift and !ZooKeeper to create an
>> > environment where structured data can be transformed into an index
>> > that runs on a Hadoop cluster.  Blur uses the power of Map/Reduce for
>> > bulk indexing into Blur.  Server failures are handled automatically by
>> > using !ZooKeeper for cluster state and HDFS for index storage.
>> >
>> > == Background ==
>> > Blur was created by Aaron !McCurry in 2010. Blur was developed to
>> > solve the challenges in dealing with searching huge quantities of data
>> > that the traditional RDBMS solutions could not cope with while still
>> > providing JOIN-like capabilities to query the data.  Several other
>> > open source projects have implemented aspects of this design including
>> > elasticsearch, Katta and Apache Solr.
>> >
>> > == Rationale ==
>> > There is a need for a distributed search capability within the Hadoop
>> > ecosystem. Currently, there are no other search solutions that
>> > natively leverage HDFS and the failover features of Hadoop in the same
>> > manner as the Blur project. The communities we expect to be most
>> > interested in such a project are government, health care, and other
>> > industries where scalability is a concern. We have made much progress
>> > in developing this project over the past 2 years and believe both the
>> > project and the interested communities would benefit from this work
>> > being openly available and having open development.  In future
>> > versions of Blur the API will more closely follow the APIs provided
>> > in Lucene so that systems that already use Lucene can more easily
>> > scale with Blur. Blur can be viewed as a query execution engine that
>> > Lucene-based solutions can utilize when scale becomes an issue.
>> >
>> > == Initial Goals ==
>> > The initial goals of the project are:
>> >  * To migrate the Blur codebase, issue tracking and wiki from
>> > github.com and integrate the project with the ASF infrastructure.
>> >  * Add new committers to the project and grow the community in "The
>> > Apache Way".
>> >
>> > == Current Status ==
>> >
>> > === Meritocracy ===
>> > Blur was initially developed by Aaron !McCurry in June 2010.  Since
>> > then Blur has continued to evolve with the support of a small
>> > development team at Near Infinity.  As a part of the Apache Software
>> > Foundation, the Apache Blur team intends to strongly encourage the
>> > community to help with and contribute to the project.  Apache Blur
>> > will actively seek potential committers and help them become familiar
>> > with the codebase.
>> >
>> > === Community ===
>> > A small community has developed around Blur and several project teams
>> > are currently using Blur for their big data search capability. The
>> > source code is currently available on GitHub and there is a dedicated
>> > website (blur.io) that provides an overview of the project. Blur has
>> > been shared with several members of the Apache community and has been
>> > presented at the Bay Area HUG (see
>> > http://www.meetup.com/hadoop/events/20109471/).
>> >
>> > === Core Developers ===
>> > The current developers are employed by Near Infinity Corporation, but
>> > we anticipate interest developing among other companies.
>> >
>> > === Alignment ===
>> > Blur is built on top of a number of Apache projects: Hadoop, Lucene,
>> > !ZooKeeper, and Thrift. It builds with Maven.  During the course of
>> > Blur development, a couple of patches have been committed back to the
>> > Lucene project, including LUCENE-2205 and LUCENE-2215.  Due to the
>> > strong relationship with the aforementioned Apache projects, the
>> > incubator is a good match for Blur.
>> >
>> > == Known Risks ==
>> >
>> > === Orphaned Products ===
>> > There is only a small risk of being orphaned. The customers that
>> > currently use Blur are committed to improving the codebase of the
>> > project due to its fulfilling needs not addressed by any other
>> > software. In addition, one customer is providing financial support to
>> > further develop Blur given its importance to mission-critical
>> > projects.
>> >
>> > === Inexperience with Open Source ===
>> > The codebase has been treated internally as an open source project
>> > since its beginning, and Near Infinity has extensive experience
>> > developing and releasing open source projects
>> > (http://www.nearinfinity.com/products/open_source). We do not
>> > anticipate difficulty in operating under the Apache Way.
>> >
>> > === Homogeneous Developers ===
>> > Current developers are all employed by Near Infinity but we are
>> > actively seeking contributors from different companies and would
>> > welcome their participation.
>> >
>> > === Reliance on Salaried Developers ===
>> > Blur was originally created by Aaron !McCurry as a personal project
>> > and he remains the primary contributor.  Currently, Aaron’s employer
>> > (Near Infinity) fully supports his continued participation with paid,
>> > dedicated time to work on Blur. All other current developers are paid
>> > by Near Infinity to work on Blur as well.
>> >
>> > === Relationships with Other Apache Products ===
>> > Blur dependencies:
>> >
>> >  * Apache Hadoop
>> >  * Apache Lucene
>> >  * Apache !ZooKeeper
>> >  * Apache Thrift
>> >  * Apache log4j
>> >
>> > === Apache Brand ===
>> > Our interest in releasing this code as an Apache project is due to its
>> > strong relationship with other Apache projects (Blur depends on
>> > Hadoop, Lucene, !ZooKeeper, and Thrift) and to its uniqueness within
>> > the Hadoop ecosystem.
>> >
>> > == Documentation ==
>> > Current documentation can be found at http://blur.io and
>> > https://github.com/nearinfinity/blur.
>> >
>> > == Initial Source ==
>> > Blur has been in development since summer 2010. The core codebase
>> > consists of about 29,000 lines of code (about 10,000 if the generated
>> > RPC code is not included), mainly Java.
>> >
>> > == Source and Intellectual Property Submission Plan ==
>> > Blur core code, examples, documentation, and training materials will
>> > be submitted by Near Infinity Corporation.
>> >
>> > == External Dependencies ==
>> >  * concurrentlinkedhashmap - Apache 2.0 License -
>> > http://code.google.com/p/concurrentlinkedhashmap/
>> >
>> > == Cryptography ==
>> > none
>> >
>> > == Required Resources ==
>> >  * Mailing Lists
>> >    * blur-private
>> >    * blur-dev
>> >    * blur-commits
>> >    * blur-user
>> >  * Git Repository
>> >    * https://git-wip-us.apache.org/repos/asf/blur.git
>> >  * Issue Tracking
>> >    * JIRA
>> >  * Continuous Integration
>> >    * Jenkins
>> >  * Web
>> >    * http://incubator.apache.org/blur/wiki at http://wiki.apache.org
>> > or http://cwiki.apache.org
>> >
>> > == Initial Committers ==
>> >  * Aaron !McCurry (aaron.mccurry at nearinfinity dot com)
>> >  * Scott Leberknight (scott.leberknight at nearinfinity dot com)
>> >  * Ryan Gimmy (ryan.gimmy at nearinfinity dot com)
>> >  * Tim Williams (twilliams at apache dot org)
>> >  * Patrick Hunt (phunt at apache dot org)
>> >  * Doug Cutting (cutting at apache dot org)
>> >
>> > == Affiliations ==
>> >  * Aaron !McCurry, Near Infinity
>> >  * Scott Leberknight, Near Infinity
>> >  * Ryan Gimmy, Near Infinity
>> >  * Patrick Hunt, Cloudera
>> >  * Doug Cutting, Cloudera
>> >
>> > == Sponsors ==
>> >  * Champion: Patrick Hunt
>> >
>> > == Nominated Mentors ==
>> >  * Tim Williams  (twilliams at apache dot org)
>> >  * Doug Cutting (cutting at apache dot org)
>> >  * Patrick Hunt (phunt at apache dot org)
>> >
>> > == Sponsoring Entity ==
>> >  * Apache Incubator
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>> > For additional commands, e-mail: general-help@incubator.apache.org
>> >
>> >
>>


