lucene-java-user mailing list archives

From Halácsy Péter <halacsy.pe...@axelero.com>
Subject RE: HTML Analyzer & filter
Date Tue, 16 Apr 2002 15:29:50 GMT


> -----Original Message-----
> From: David Black [mailto:black@apple.com]
> Sent: Tuesday, April 16, 2002 5:07 PM
> To: lucene-user@jakarta.apache.org
> Subject: HTML Analyzer & filter 
> 
> 
> Not to seem too lazy but I was just beginning to write an HTML Filter 
> and Analyzer and thought..."gee, I bet someone has done this 
> already".  
> Are there any Apache/GPL HTML filters out there as a part of another 
> project or that anyone on this list would be willing to contribute.
> 
> Thanks
> 
> 


I'm afraid I don't quite understand what you are after. If you just want to parse HTML files,
I suggest you look at the javax.swing.text.html package. I have used it to extract text and
some metadata from HTML files.
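
For illustration, here is a minimal sketch of that approach. It assumes the parser callback API in javax.swing.text.html (HTMLEditorKit.ParserCallback driven by ParserDelegator). The class name HtmlTextExtractor and its methods are made up for this example, and treating the <title> element as the "metadata" is only an assumption about what one might want to pull out.

```java
import java.io.IOException;
import java.io.Reader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the title and the remaining text of an HTML page.
public class HtmlTextExtractor extends HTMLEditorKit.ParserCallback {
    private final StringBuilder title = new StringBuilder();
    private final StringBuilder text = new StringBuilder();
    private boolean inTitle = false;

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = true;
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = false;
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        if (inTitle) {
            title.append(data);
        } else {
            // Separate text chunks with a space so words from adjacent
            // elements do not run together.
            if (text.length() > 0) {
                text.append(' ');
            }
            text.append(data);
        }
    }

    public String getTitle() {
        return title.toString();
    }

    public String getText() {
        return text.toString();
    }

    // Parses the HTML from the reader and returns the filled-in extractor.
    public static HtmlTextExtractor parse(Reader in) throws IOException {
        HtmlTextExtractor callback = new HtmlTextExtractor();
        // The third argument (ignoreCharSet = true) tells the parser not to
        // fail on a charset declaration that differs from the reader's.
        new ParserDelegator().parse(in, callback, true);
        return callback;
    }
}
```

Calling HtmlTextExtractor.parse(new FileReader("page.html")) and then getText() should give a string you can put into a Lucene field; getTitle() can go into a separate field. Whitespace handling is up to the parser, so you may want to normalize the result before indexing.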

peter

--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

