lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: a good html & script parser
Date Sun, 26 Sep 2004 22:24:51 GMT
I use NekoHTML and am happy with it.  I don't know about that
particular case.

Otis
http://www.simpy.com/ - full-text search your bookmarks

--- Chris Fraschetti <fraschetti@gmail.com> wrote:

> Most of the HTML parsers I can find on the web handle only the <tag>
> syntax and forget about the { code } syntax that occurs in a lot of
> web pages.
> 
> Is there a good library that returns the plain text of an HTML doc
> string and eliminates more than simply the <tag> occurrences?
> 
> -- 
> ___________________________________________________
> Chris Fraschetti
> e fraschetti@gmail.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 
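
The question quoted above asks for text extraction that drops more than the tags themselves, in particular the { code } bodies of script and style blocks. A minimal sketch of that idea follows. It does not use NekoHTML; it swaps in Python's standard-library html.parser purely for illustration, and the names TextExtractor and html_to_text are hypothetical. It has not been tested against badly malformed markup.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping the bodies of <script> and <style>."""

    # Elements whose content is code rather than readable text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # text fragments found outside skipped elements

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this hook.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join fragments with spaces, then collapse runs of whitespace.
        # Note: this inserts a space at every tag boundary, even mid-word.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the text of an HTML string without script/style bodies."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = (
        "<html><head><style>p { color: red }</style>"
        "<script>function f() { return 1; }</script></head>"
        "<body><p>Hello <b>world</b></p></body></html>"
    )
    print(html_to_text(page))  # prints: Hello world
```

The same approach should carry over to a DOM-producing parser such as NekoHTML: walk the tree, skip the script and style elements, and concatenate the remaining text nodes.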

