lucene-java-user mailing list archives

From John Moylan <jo...@rte.ie>
Subject Re: HTMLDocument
Date Mon, 02 Feb 2004 09:41:34 GMT
Another easy-to-use HTML parser is HTMLparser.sf.net.

John

On Sun, 2004-02-01 at 11:19, lucene@nitwit.de wrote:
> Hi!
> 
> Is there an HTMLDocument implementation out there? The one in the demo package of Lucene
> does not handle non-well-formed HTML files (what about NekoHTML?) and seems to
> have some other limitations and bugs as well. And why isn't it part of the
> distribution instead of a demo package?
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
-- 
John Moylan
----------------------
ePublishing
Radio Telefis Eireann,
Montrose House,
Donnybrook,
Dublin 4,
Eire
t:+353 1 2083564
e:john.moylan@rte.ie




