tomcat-users mailing list archives

From: Mark Benussi
Subject: [OT Friday] Parse HTML file to underlying text
Date: Sat, 03 Sep 2005 08:24:47 GMT
I know I missed the Friday deadline but...


Does anyone have any recommendations for parsing HTML? I use Lucene, and its
example comes with its own HTML parser, but I was wondering whether anyone has
used an existing project, or whether there is some built-in functionality in an
Apache lib to convert

<p>Hello <i>World</i></p>

to

Hello World


Your thoughts are appreciated.
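
For reference, here is a minimal sketch of the kind of conversion I mean. It is not an Apache lib; it uses the HTML parser bundled with the JDK's Swing classes (ParserDelegator with an HTMLEditorKit.ParserCallback). The class name HtmlToText and the method extractText are made up for illustration, and I have not checked how this parser treats whitespace, entities or script blocks beyond a simple case like the one above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper (name chosen for illustration): collects the text
// nodes reported by the JDK's Swing HTML parser and drops the markup.
class HtmlToText {

    static String extractText(String html) throws IOException {
        final StringBuilder text = new StringBuilder();

        // handleText is called once per run of character data between tags.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                text.append(data);
            }
        };

        try (Reader reader = new StringReader(html)) {
            // The third argument tells the parser to ignore any charset
            // declared in the document; the input is already a String.
            new ParserDelegator().parse(reader, callback, true);
        }
        return text.toString();
    }
}
```

Calling HtmlToText.extractText("<p>Hello <i>World</i></p>") should hand back the two words without the tags. For indexing with Lucene, the returned string could then be fed to a text field, though a dedicated parser is probably more robust on messy real-world pages.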
