commons-user mailing list archives

From Alfredo Ledezma Melendez <alfredo.melen...@mail.telcel.com>
Subject RE: Use Digester against a htm file
Date Mon, 07 Aug 2006 21:03:29 GMT
Using an XML parser to process HTML won't work unless the HTML is
well-formed.
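
As a rough illustration of why this fails, here is a minimal sketch using only the JDK's built-in SAX parser (the same kind of parser Digester sits on top of). The class name WellFormedCheck and the hand-cleaned markup are assumptions made for the example; they are not part of Digester or of the original question.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a SAX parser accepts the markup.
class WellFormedCheck {

    static boolean isWellFormed(String markup) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // DefaultHandler rethrows fatal errors, so malformed input
            // surfaces as a SAXException (usually a SAXParseException).
            parser.parse(new InputSource(new StringReader(markup)),
                         new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not well-formedness issues.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The fragment from the question: <ol> is never closed.
        String fromQuestion = "<li>\n    <ol>Item X\n</li>\n"
                + "<p>This is the text I want to insert into a database\n<br>\n";

        // An assumed hand-cleaned equivalent: one root, every tag closed.
        String cleaned = "<html><body><ul><li>Item X</li></ul>"
                + "<p>This is the text I want to insert into a database<br/></p>"
                + "</body></html>";

        System.out.println(isWellFormed(fromQuestion)); // prints false
        System.out.println(isWellFormed(cleaned));      // prints true
    }
}
```

Digester would hit the same parse error on the first input, which is why the HTML has to be cleaned up before Digester sees it.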

To handle this with Digester, first convert your HTML into a well-formed
document (there are Java libraries that do this). The same topic came up
recently (last week, I believe), and some members recommended such
tools.

NekoHTML
http://java-source.net/open-source/html-parsers/nekohtml


Regards,
____________________________________________
Alfredo Ledezma Meléndez.
Gerencia Implantación S.A.P.
Supervisor Técnico WEB-ABAP
Radiomóvil DIPSA, S. A. de C. V.
Lago Alberto No. 366, Col. Anáhuac, C.P. 11320
México D.F.

> -----Original Message-----
> From: Fabian Sergio de Rosa [mailto:fderosa@gmail.com]
> Sent: Monday, August 07, 2006 03:41 p.m.
> To: Jakarta Commons Users List
> Subject: Re: Use Digester against a htm file
> 
> I don't know if HTML is compatible with SAX, and Digester uses SAX to
> parse XML. But if you try, you will know.
> I recommend that you use XML, though, because the HTML format isn't
> strict and is mostly oriented toward displaying information.
> 
> 2006/8/7, Marcos Hass W <marcoshass@gmail.com>:
> >
> > Hi all,
> >
> > I've been using Digester for regular XML files and now I have a
> > different use case: I need to feed a database from an .htm file.
> > Is it possible to use Digester against an .htm file? I mean a file
> > that doesn't have all tags closed, for example:
> >
> > <li>
> >     <ol>Item X
> > </li>
> > <p>This is the text I want to insert into a database
> > <br>
> >
> >
> > Thank you very much
> > Marcos
> >
> >

