labs-commits mailing list archives

From thors...@apache.org
Subject svn commit: r607064 - in /labs/droids/trunk: LICENSE.txt README.TXT ivy.xml tools/
Date Thu, 27 Dec 2007 13:06:20 GMT
Author: thorsten
Date: Thu Dec 27 05:06:19 2007
New Revision: 607064

URL: http://svn.apache.org/viewvc?rev=607064&view=rev
Log:
Initial resources copied from old branch

Added:
    labs/droids/trunk/LICENSE.txt
      - copied unchanged from r604178, labs/droids/branch/nutch/LICENSE.txt
    labs/droids/trunk/README.TXT
      - copied, changed from r604178, labs/droids/branch/nutch/README.TXT
    labs/droids/trunk/ivy.xml
      - copied unchanged from r604178, labs/droids/branch/nutch/ivy.xml
    labs/droids/trunk/tools/
      - copied from r604178, labs/droids/branch/nutch/tools/

Copied: labs/droids/trunk/README.TXT (from r604178, labs/droids/branch/nutch/README.TXT)
URL: http://svn.apache.org/viewvc/labs/droids/trunk/README.TXT?p2=labs/droids/trunk/README.TXT&p1=labs/droids/branch/nutch/README.TXT&r1=604178&r2=607064&rev=607064&view=diff
==============================================================================
--- labs/droids/branch/nutch/README.TXT (original)
+++ labs/droids/trunk/README.TXT Thu Dec 27 05:06:19 2007
@@ -27,12 +27,12 @@
  Droids aims to be an intelligent standalone robot
  framework that allows to create robots as plugins, which can automatically seeks out
  relevant online information based on the user's specifications. For the core I took
- nutch, ripped out and modified the awesome plugin/extension framework. Droids makes
- it very easy to extend robots or write a new one. The fist implementation is
- crawler-x-m02y07 - a simple crawler which is easily extendable by plugins. If a
- project/app needs special processing for a crawled url one can write some plugins and
- use an existing crawler to implement the functionality or one can write a new crawler
- which is very easy.
+ formally nutch, ripped out and modified the awesome plugin/extension framework. 
+ 
+ Anyhow this version will not be based on this framework but using Spring instead. 
+ The main reason is that Spring has become a standard.
+ 
+ Droids makes it very easy to extend robots or write a new one. 
 
  Why was it created?
  -------------------
@@ -42,34 +42,11 @@
  anymore till we found a crawler replacement. Getting more involved in
  Solr and Nutch I see request for a generic standalone crawler. 
  
- How does the first implementation crawler-x-m02y07 looks like?
- --------------------------------------------------------------
- I wrote some proof of concept plugins that make up crawler-x-m02y07 to 
- - crawl an url 
- - extract links (only <a/> ATM) via a parse-html plugin
- - merge them with the queue
- - save or print out the crawled pages.
- 
- Why crawler-x-m02y07?
- ---------------------
- Droids tries to be a framework for different droids. 
- The first implementation is a "crawler" with the name "x"
- first archived in the second "m"onth of the "y"ear 20"07"
   
  Requirements
  ************
-* Apache Ant version 1.6.5
-** copy ./tools/ivy/i
+* Apache Ant version 1.7.0
 * JDK 1.5 or higher
-** If using JDK 1.5: 
-** cd lib/ 
-** wget http://www.ibiblio.org/maven2/stax/stax-api/1.0/stax-api-1.0.jar 
-
- Running
- *******
- Build: ant
- Initial url: echo "droids.initial.url=http://localhost/index.html">build.properties
- Run: ant crawl
  
  HEADSUP
  *******
@@ -77,8 +54,8 @@
  The parse-html plugin assumes that the incoming stream is valid xml!
  You will need to adjust the urlfilters to limit loops. 
  
- Links
- -----
+ Links / related projects
+ -------------------------
  http://lucene.apache.org/nutch/ - Nutch web-search software
  http://www.robotstxt.org/wc/robots.html - The Web Robots Pages
  http://www.few.vu.nl/~andreas/programming/webcrawler/index.html - How to write a multi-threaded webcrawler



