xml-general mailing list archives

From Davanum Srinivas <d...@yahoo.com>
Subject Re: Looking for tools/ideas for filtering HTML
Date Fri, 16 Nov 2001 21:22:21 GMT
Use JTidy - http://sourceforge.net/projects/jtidy/
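
For illustration, here is a minimal sketch of the kind of filter described in the quoted message below. JTidy is a separate download, so this sketch swaps in the lenient HTML parser that ships with the JDK (javax.swing.text.html.parser.ParserDelegator) rather than JTidy itself. It tolerates a <P> with no </P> in the same spirit. The class name HtmlTextFilter, the method extractText, and the sample page are hypothetical, and "the block near the bottom" is assumed to be the last non-empty text chunk.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the text of an HTML page and drops the tags.
class HtmlTextFilter {

    // Returns the non-empty text chunks in document order.
    static List<String> extractText(Reader html) throws IOException {
        final List<String> chunks = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Called for character data only; tags are never passed here.
                String text = new String(data).trim();
                if (!text.isEmpty()) {
                    chunks.add(text);
                }
            }
        };
        // The third argument tells the parser to ignore charset directives.
        new ParserDelegator().parse(html, callback, true);
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        // Sample input with unbalanced <P> tags (assumed, not from a real page).
        String page = "<html><body><h1>Title</h1>"
                + "<P>First block<P>Last block of text</body></html>";
        List<String> chunks = extractText(new StringReader(page));
        // Assumption: the wanted block is the last text chunk on the page.
        System.out.println(chunks.get(chunks.size() - 1));
    }
}
```

With JTidy the overall shape would be similar: parse the page into a cleaned-up tree first, then walk it and keep only the text nodes.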


--- "Jaquiss, Robert" <RJaquiss@nfb.org> wrote:
> Hello:
>      I have just joined this list, and am also a beginning Java
> programmer. I apologize if this is not a suitable question for this
> list. I need to write a filter for HTML pages. My goal is to read an
> HTML page, throwing away all the HTML code and just keeping a block of
> text that occurs near the bottom of the page. The HTML tags are liable
> to be unbalanced. There will be a <P> but no </P>. I found a sample
> program that used the SAXparser, but SAXparser doesn't seem to handle
> unbalanced tags. Ideas/comments would be appreciated.  Thank you.
>     Regards
>    Robert Jaquiss

Davanum Srinivas - http://jguru.com/dims/

