lucene-java-user mailing list archives

From "lucene@libero.it" <luc...@libero.it>
Subject Re: HTML parser
Date Sun, 21 Apr 2002 12:47:17 GMT
Hi Otis,

thanks for your reply. I have been looking for Spindle and MoJo for 2 
hours but I haven't found anything.

Can you help me? Where can I find them?

Thanks for your help and time


Laura


  

> Laura,
> 
> Search the lucene-user and lucene-dev archives for things like:
> crawler
> spider
> spindle
> lucene sandbox
> 
> Spindle is something you may want to look at, as is MoJo (not mentioned
> on lucene lists, use Google).
> 
> Otis
> 
> > Did someone solve the problem of recursively spidering web pages?
> 
> > > >While trying to research the same thing, I found the following...here's a
> > > >good example of link extraction.....
> > > 
> > > Try http://www.quiotix.com/opensource/html-parser
> > > 
> > > It's easy to write a Visitor which extracts the links; it should take
> > > about ten lines of code.
> 
> 
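For readers of the archive, here is a minimal sketch of the link-extraction idea quoted above. It does not use the Quiotix parser (its Visitor API is not shown in this thread); instead it assumes the callback-based HTML parser that ships with the JDK (javax.swing.text.html), which follows a similar pattern of overriding a handler method per element. The class name LinkExtractor and the method extractLinks are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper for illustration; not part of Lucene or the Quiotix parser.
class LinkExtractor {

    // Returns the href value of every <a> start tag, in document order.
    static List<String> extractLinks(String html) {
        final List<String> links = new ArrayList<>();

        // The callback plays the role of the "Visitor": the parser calls
        // handleStartTag for each opening tag it encounters.
        HTMLEditorKit.ParserCallback collector = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };

        try {
            // The third argument tells the parser to ignore charset declarations,
            // which is safe here because the input is already a Java String.
            new ParserDelegator().parse(new StringReader(html), collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return links;
    }
}
```

A recursive spider could call extractLinks on each fetched page, resolve the results against the page's URL, and queue any address it has not visited yet; fetching, URL resolution, and politeness limits are left out of this sketch.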