lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: A way to download URLs and index better ?
Date Sun, 17 Jan 2010 02:37:40 GMT
Hello,

Use Droids; it's much simpler than Nutch or Heritrix:

http://incubator.apache.org/droids/
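
For a list of only about 200 URLs, slow downloads usually come from fetching pages one at a time. The sketch below is not the Droids API; it is a minimal, assumed illustration using only JDK classes (Java 11+ for java.net.http) that downloads a list of URLs on a fixed thread pool and returns the page bodies keyed by URL. The names ParallelFetcher, fetchAll and httpGet are hypothetical, and the fetch function is passed in so it can be swapped for another downloader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: INPUT is a list of URLs, OUTPUT is a map of URL -> page content.
final class ParallelFetcher {

    // One shared client; it is safe to use from several threads.
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Downloads every URL on a pool of `threads` workers.
    // URLs whose download fails are logged and left out of the result.
    static Map<String, String> fetchAll(List<String> urls,
                                        Function<String, String> fetchOne,
                                        int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit all downloads first so they run concurrently.
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (String url : urls) {
                Callable<String> task = () -> fetchOne.apply(url);
                pending.put(url, pool.submit(task));
            }
            // Then collect the results in the original URL order.
            Map<String, String> pages = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> entry : pending.entrySet()) {
                try {
                    pages.put(entry.getKey(), entry.getValue().get());
                } catch (ExecutionException e) {
                    System.err.println("Skipping " + entry.getKey() + ": " + e.getCause());
                }
            }
            return pages;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while downloading", e);
        } finally {
            pool.shutdownNow();
        }
    }

    // Plain HTTP GET returning the response body as a string.
    static String httpGet(String url) {
        try {
            HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
            return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching " + url, e);
        }
    }
}
```

A call such as ParallelFetcher.fetchAll(urls, ParallelFetcher::httpGet, 8) would return the downloaded pages, which can then be added to a Lucene index one document per URL. The pool size of 8 is an arbitrary starting point; tune it to the target sites and to what they tolerate.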

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
> From: Phan The Dai <thienthanhomenh@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Sat, January 16, 2010 2:20:47 AM
> Subject: A way to download URLs and index better ?
> 
> Hi everyone, please help me with this question:
> I need to download some web pages from a list of URLs (about 200 links) and
> then index them with Lucene.
> This list is not fixed, because it depends on how my process is defined.
> Currently, in my web application, I wrote a class for downloading, but its
> download time is too long.
> 
> Please recommend a Java library suited to my situation for optimizing
> downloading.
> Examples would be very welcome (INPUT: list of URLs; OUTPUT: web page
> content, or an indexed repository).
> Thank you very much.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

