Date: Fri, 3 Oct 2008 09:56:16 -0700 (PDT)
From: Otis Gospodnetic
Subject: Re: [Vote] accept Droids into incubation
To: general@incubator.apache.org
Message-ID: <76990.32854.qm@web50310.mail.re2.yahoo.com>

+1

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Thorsten Scherler
> To: Incubator
> Sent: Thursday, October 2, 2008 4:00:41 PM
> Subject: [Vote] accept Droids into incubation
>
> Please vote on accepting Droids into incubation.
>
> The proposal can be found at:
> http://wiki.apache.org/incubator/DroidsProposal
>
> The text of the proposal
>
> = Droids, an intelligent standalone robot framework =
>
> === Abstract ===
>
> Droids aims to be an intelligent standalone robot framework that lets
> users create new droids (robots) and extend existing ones.
>
> === Proposal ===
>
> As a standalone robot framework, Droids will offer infrastructure code
> for creating new robots and extending existing ones. In the future it
> will also offer a web-based administration application to manage and
> control the different droids, which will communicate with this
> application.
>
> Droids makes it easy to extend an existing robot or write a new one
> from scratch that automatically seeks out relevant online information
> based on the user's specifications. Thanks to its flexible design, it
> can directly reuse any custom business logic written in Java.
>
> In the long run it should become an umbrella for specialized droids
> hosted as sub-projects. An ultimate goal is to integrate an artificial
> intelligence that can control a swarm of droids and actively plan for
> and react to different tasks.
>
> === Background ===
>
> The initial idea for the Droids project was voiced in February 2007 by
> Thorsten Scherler, mainly out of personal curiosity, and was developed
> as a Labs project.
> The background of his work was that Cocoon trunk (2.2) no longer
> provided a crawler and Forrest was based on it, meaning we could not
> update until we found a crawler replacement. While getting more
> involved in Solr and Nutch, he saw the demand for a generic standalone
> crawler.
>
> For the first version he took Nutch and ripped out and modified its
> plugin/extension framework. The second version, however, was no longer
> based on it and used Spring instead. The main reason was that Spring
> has become a standard and helped make Droids as extensible as
> possible.
>
> Soon the first plugins and sample droids were added to the code base.
>
> === Rationale ===
>
> There is ever more demand for tools that perform certain tasks
> automatically. Search engines such as Nutch are normally focused on
> specific functionality rather than on extensibility. Furthermore, they
> are mainly focused on crawling, that is, requesting certain pages and
> extracting links to other pages, which in our opinion is only one
> small area for automated robots. While there are a number of existing
> crawler libraries for various tasks, each of them comes with a custom
> API, and there is no generic interface for automatically determining
> which crawler (droid) to use for a specific task.
>
> The Droids project attempts to remove this duplication of effort. We
> believe that by pooling the efforts of multiple projects we will be
> able to create a generic robot framework that exceeds the capabilities
> and quality of the custom solutions of any single project. The focus
> of Droids is not a single crawler but rather a set of reusable
> components that custom droids (robots) can use to automate certain
> tasks. An intelligent standalone robot framework project will provide
> common ground not only for the developers of crawlers but also for the
> developers of any other automated application (robot) libraries.
>
> === Initial Goals ===
>
> The initial goals of the proposed project are:
>
>  * Viable community around the Droids codebase
>  * Active relationships and possible cooperation with related projects
>    and communities (e.g. reusing Tika for text extraction)
>  * Generic robot API for crawling, extracting structured text content
>    and/or new tasks, filtering tasks, and handling the content
>  * Flexible extension and plugin development to create a wide range of
>    functionality
>  * Fuel the development of various droids and bring the current
>    wget-style crawler to a state-of-the-art level
>
> == Current Status ==
>
> === Meritocracy ===
>
> All the initial committers are familiar with the meritocracy
> principles of Apache, and have already worked on the various source
> codebases. We will also follow the normal meritocracy rules with other
> potential contributors.
>
> === Community ===
>
> There is not yet a clear Droids community. Instead we have a number of
> people and related projects with an understanding that an intelligent
> standalone robot framework project would best serve everyone's
> interests. The primary goal of the incubating project is to build a
> self-sustaining community around this shared vision.
>
> === Core Developers ===
>
> The initial set of developers comes from various backgrounds, with
> different but compatible needs for the proposed project.
>
> === Alignment ===
>
> As a generic robot framework, Droids will likely be widely used by
> various open source and commercial projects, both together with and
> independent of other Apache tools. Apache projects like Cocoon, Lenya
> and Forrest are potential candidates for using different droids as an
> embedded component.
>
> == Known Risks ==
>
> === Orphaned products ===
>
> So far only one company is known to use Droids in a production
> environment; however, various Apache committers have expressed
> constant interest in a generic robot framework.
> For many potential users the existing tools are too complicated or
> too focused on a specific use case, which will help Droids gain a
> bigger user base.
>
> Once the project gets started, we can quickly bring the wget-style
> droids to the feature level of existing tools through plugin
> development that reuses code from the sources mentioned below. After
> that, we believe we can quickly grow the developer and user
> communities based on the benefits that a generic framework offering
> reusable plugins and different droids has over custom alternatives.
>
> === Inexperience with Open Source ===
>
> All the initial developers have worked on open source before and many
> are committers and PMC members within other Apache projects.
>
> === Homogenous Developers ===
>
> The initial developers come from a variety of backgrounds and with a
> variety of needs for the proposed toolkit.
>
> === Reliance on Salaried Developers ===
>
> Some of the developers are paid to develop certain functionality for
> this project, but the proposed project is not the primary task for
> anyone.
>
> === Relationships with Other Apache Products ===
>
> TBN
>
> === An Excessive Fascination with the Apache Brand ===
>
> All of us are familiar with Apache and we have participated in Apache
> projects as contributors, committers, and PMC members. We feel that
> the Apache Software Foundation is a natural home for a project like
> this.
>
> == Documentation ==
>
> The main documentation is distributed with the code:
>
>  * [http://svn.apache.org/viewvc/labs/droids/trunk/docs/ Docu]
>  * [http://people.apache.org/~thorsten/droids/ DocuDeployed]
>
> == Initial Source ==
>
> Droids will start with the code base that has been developed in the
> Apache Labs project:
>
>  * [http://svn.apache.org/viewvc/labs/droids/trunk/ code base]
>
> == Source and Intellectual Property Submission Plan ==
>
> All seed code and other contributions will be handled through the
> normal Apache contribution process.
>
> We will also contact other related efforts for possible cooperation
> and contributions.
>
> == External Dependencies ==
>
> Droids will mainly depend on the Spring core distribution.
>
> == Cryptography ==
>
> Droids itself will not use cryptography, but it is possible that some
> of the external libraries will include cryptographic code to handle
> different features.
>
> == Required Resources ==
>
> Mailing lists
>
>  * droids-dev@incubator.apache.org
>  * droids-commits@incubator.apache.org
>  * droids-private@incubator.apache.org
>
> Subversion Directory
>
>  * https://svn.apache.org/repos/asf/incubator/droids
>
> Issue Tracking
>
>  * JIRA Droids (DROIDS)
>
> Other Resources
>
>  * none
>
> == Initial Committers ==
>
> || '''Name''' || '''Email''' || '''CLA''' ||
> || Thorsten Scherler || thorsten at apache dot org || yes ||
> || Ryan !McKinley || ryan at apache dot org || yes ||
> || Grant Ingersoll || gsingers at apache dot org || yes ||
> || Oleg Kalnichevski || olegk at apache dot org || yes ||
>
> == Affiliations ==
>
> || '''Name''' || '''Affiliation''' ||
> || Thorsten Scherler || Freelancer ||
>
> == Sponsors ==
>
> Champion
>
>  * Grant Ingersoll
>
> Nominated Mentors
>
>  * Ross Gardler
>  * Paul Fremantle
>  * Grant Ingersoll
>
> Sponsoring Entity
>
>  * [http://hc.apache.org/ Apache HttpComponents]
>  * [http://lucene.apache.org/ Apache Lucene]
>
> --
> Thorsten Scherler thorsten.at.apache.org
> Open Source Java consulting, training and solutions
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org