hc-dev mailing list archives

From sebb <seb...@gmail.com>
Subject Re: Getting FormElements from HTTP Response
Date Thu, 10 Aug 2006 12:42:28 GMT
But if you really do want or need to analyse HTML, have a look at
the htmlparser project on SourceForge. There are other HTML parsers
too, e.g. Tidy.

S.
On 10/08/06, Ortwin Glück <odi@odi.ch> wrote:
> Errol,
>
> No, there is no such functionality in the API. HttpClient is a transport
> library and does not look at the content. Whether it's binary or HTML,
> HttpClient doesn't care.
>
> I also believe that "HTML screen scraping" is not the way to code
> interfaces between machines. Webservices were invented for this purpose.
>
> Cheers
>
> Ortwin
>
> Errol Dalgic wrote:
> > Hi there,
> >
> > I've written several bots to extract the form elements and values from HTML
> > pages utilising HttpClient and the java regex pattern/matcher api. I was
> > wondering whether there were any generic methods within the HttpClient API
> > which would allow me to perform this task, i.e. retrieve all the name and
> > value pairs from the response? I am sure this would be useful and I am
> > happy to share what I have written so far.
> >
> > Thanks,
> > Errol
> >
> >
> > Errol Dalgic
> > Programmer Analyst
> > ________________________________________
> > eSolutions - Consumer Operations Group
> > SingTel Optus Pty Ltd
> > errol.dalgic@optus.com.au
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
> >
>
> --
> [web]  http://www.odi.ch/
> [blog] http://www.odi.ch/weblog/
> [pgp]  key 0x81CF3416
>        finger print F2B1 B21F F056 D53E 5D79 A5AF 02BE 70F5 81CF 3416
>
