commons-user mailing list archives

From "Alessio Pace" <alessio.p...@gmail.com>
Subject Re: Use Digester against a htm file
Date Mon, 21 Aug 2006 07:46:06 GMT
Hi,
I recommend CyberNeko too.

You can use it as described here:
http://www.mail-archive.com/j-users@xerces.apache.org/msg00631.html

It's pretty simple.
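
To show why a cleanup step is needed before Digester sees the file, here is a minimal sketch that uses only the JDK's own SAX parser (no CyberNeko, no Digester). The class name WellFormedCheck, the wrapping <html> root element and the hand-corrected FIXED string are my own assumptions for illustration; they are not the output of any particular tool.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Marcos's snippet, wrapped in a root element (assumption): <ol>, <p> and <br> are never closed.
    static final String BROKEN =
        "<html><li><ol>Item X</li>"
        + "<p>This is the text I want to insert into a database<br></html>";

    // Hand-corrected, well-formed equivalent (assumption, written by hand for this example).
    static final String FIXED =
        "<html><ul><li>Item X</li></ul>"
        + "<p>This is the text I want to insert into a database<br/></p></html>";

    // Returns true if a plain SAX parser can read the markup to the end.
    static boolean isWellFormed(String markup) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(markup)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Well-formedness violations are reported as fatal SAX errors.
            return false;
        } catch (Exception e) {
            // ParserConfigurationException or IOException: not a markup problem.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("broken: " + isWellFormed(BROKEN));
        System.out.println("fixed:  " + isWellFormed(FIXED));
    }
}
```

Digester sits on top of SAX, so it would hit the same fatal error on the first string. A cleanup parser such as CyberNeko goes in front of it so that Digester only sees input like the second string.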

Regards,
-- 
Alessio Pace.
http://www.jroller.com/page/alessiopace


On 8/7/06, Alfredo Ledezma Melendez <alfredo.melendez@mail.telcel.com>
wrote:
>
> Using an XML parser to process HTML won't work unless the HTML is
> well-formed.
>
> To handle this with Digester, first convert your HTML into a well-formed
> document (there are Java libraries that do this). The same topic came up
> recently (last week, I think) and some members recommended tools for it.
>
> NekoParser
> http://java-source.net/open-source/html-parsers/nekohtml
>
>
> Regards,
> ____________________________________________
> Alfredo Ledezma Meléndez.
> Gerencia Implantación S.A.P.
> Supervisor Técnico WEB-ABAP
> Radiomóvil DIPSA, S. A. de C. V.
> Lago Alberto No. 366, Col. Anáhuac, C.P. 11320
> México D.F.
>
> > -----Original Message-----
> > From: Fabian Sergio de Rosa [mailto:fderosa@gmail.com]
> > Sent: Monday, August 07, 2006 03:41 p.m.
> > To: Jakarta Commons Users List
> > Subject: Re: Use Digester against a htm file
> >
> > I don't know whether HTML is compatible with SAX, and Digester uses SAX to
> > parse XML. If you try it, you will find out.
> > I recommend using XML instead, because the HTML format isn't strict and
> > is mostly oriented toward displaying information.
> >
> > 2006/8/7, Marcos Hass W <marcoshass@gmail.com>:
> > >
> > > Hi all,
> > >
> > > I've been using Digester for regular XML files and now I have a different
> > > use case: I need to feed a database from an .htm file.
> > > Is it possible to use Digester against an .htm file? I mean a file that
> > > doesn't have all its tags closed, for example:
> > >
> > > <li>
> > >     <ol>Item X
> > > </li>
> > > <p>This is the text I want to insert into a database
> > > <br>
> > >
> > >
> > > Thank you very much
> > > Marcos
> > >
> > >
>
>
>
>