commons-user mailing list archives

From Adrian Sutton <adrian.sut...@ephox.com>
Subject RE: [httpclient] Parse response for NameValuePairs?
Date Sun, 06 Apr 2003 22:59:17 GMT
Hi John,
I never even thought of that meaning for name value pairs. :)

Essentially, the way I'd approach this is to get an HTML parser (JTidy, for
example; start at http://tidy.sf.net/ and I'm pretty sure there's a link
there) and use that.  This is outside the scope of HttpClient, since we wash
our hands of all processing once we have the response from the server for
you.  That said, we are beginning a collection of useful utilities that are
outside the scope of HttpClient but are commonly used with it, and this
would fit that description.  If you do wind up writing it and are kind
enough to donate it under the Apache license, that would be greatly
appreciated.

Our application makes heavy use of JTidy for all kinds of weird and
wonderful stuff related to HTML, so I can probably help along those lines
even though I've never had to extract INPUT tag elements (yet). :)

I'd recommend you start with Tidy's parseDOM method, which returns an
org.w3c.dom.Document object.  Then iterate over each element in the tree
recursively, looking for either any INPUT element or any INPUT element with
a TYPE="radio" attribute, depending on your requirements.  Extract the name
and value from each matching element and store them somewhere for later
processing.  One key thing to note is that JTidy is a nasty port of the C
Tidy implementation and does not support international characters properly
(particularly double byte characters).  There is a patch available somewhere
that fixes this, though, and as long as the name and value don't contain
double byte characters, it won't matter that the rest of the HTML gets
corrupted anyway.
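
A minimal sketch of that recursive walk, under some loud assumptions: it
does not call JTidy at all.  The JDK's own XML parser stands in for
Tidy's parseDOM, so the sample markup has to be well-formed, and the
attribute names are looked up in lowercase.  The class and method names
(InputExtractor, collect, parse) are made up for illustration.  With a
Document from a real HTML parser, only the collect method should matter.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class InputExtractor {

    /**
     * Walks the tree recursively and appends {name, value} for every
     * INPUT element of the given type (pass null to accept any type).
     */
    static void collect(Node node, String type, List<String[]> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element el = (Element) node;
            if ("input".equalsIgnoreCase(el.getTagName())
                    && (type == null || type.equalsIgnoreCase(el.getAttribute("type")))) {
                // getAttribute returns "" when the attribute is missing
                out.add(new String[] { el.getAttribute("name"), el.getAttribute("value") });
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), type, out);
        }
    }

    /** Hypothetical stand-in for an HTML parser: accepts well-formed markup only. */
    static Document parse(String markup) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><form>"
                + "<input type=\"radio\" name=\"xxx\" value=\"yyy\"/>"
                + "<input type=\"text\" name=\"q\" value=\"\"/>"
                + "</form></body></html>";
        List<String[]> radios = new ArrayList<>();
        collect(parse(page), "radio", radios);
        for (String[] pair : radios) {
            System.out.println(pair[0] + "=" + pair[1]);
        }
    }
}
```

Because the walk only looks at element nodes, an INPUT tag that sits
inside an HTML comment is skipped without any extra work.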

A less robust but probably simpler and faster solution would be to do
simple string parsing on the HTML.  You'd then have to worry about whether
the element was commented out, whether it was inside the form you're
interested in, whether it was in a textarea, or (my personal worst
nightmare) whether the "HTML" was completely invalid (and believe me, you'd
be amazed at how bad HTML can be and still display correctly in a browser).
Tidy deals with invalid HTML really well, which is why I recommend using
it.
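
To make the trade-off concrete, here is a rough sketch of the string
parsing route using java.util.regex.  The class and method names
(NaiveInputScan, attr, scan) are invented for this example, and it only
understands double-quoted attribute values.  It is deliberately naive:
it has no idea about comments, forms, or textareas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveInputScan {
    // Any <input ...> tag, regardless of where it appears in the text.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    /** Returns the double-quoted value of the named attribute, or null if absent. */
    static String attr(String tag, String name) {
        Matcher m = Pattern.compile("\\s" + name + "\\s*=\\s*\"([^\"]*)\"",
                Pattern.CASE_INSENSITIVE).matcher(tag);
        return m.find() ? m.group(1) : null;
    }

    /** Collects {name, value} for every radio INPUT tag found in the raw text. */
    static List<String[]> scan(String html) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = INPUT_TAG.matcher(html);
        while (m.find()) {
            String tag = m.group();
            if ("radio".equalsIgnoreCase(attr(tag, "type"))) {
                pairs.add(new String[] { attr(tag, "name"), attr(tag, "value") });
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The first tag is commented out, but the scan cannot tell.
        String html = "<!-- <INPUT TYPE=\"radio\" NAME=\"old\" VALUE=\"1\"> -->"
                + "<INPUT TYPE=\"radio\" NAME=\"xxx\" VALUE=\"yyy\">";
        for (String[] p : scan(html)) {
            System.out.println(p[0] + "=" + p[1]);
        }
    }
}
```

Run against that sample, the scan reports both the commented-out tag and
the live one, which is exactly the kind of false hit a real parser avoids.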

Hope that helps, let me know if there's anything else I can help with.

Adrian Sutton, Software Engineer
Ephox Corporation
www.ephox.com


-----Original Message-----
From: John Burke [mailto:johnburke@earthlink.net]
Sent: Saturday, 5 April 2003 9:44 AM
To: Jakarta Commons Users List
Subject: Re: [httpclient] Parse response for NameValuePairs?


Hello Adrian,
I'm using httpclient to automate the process of using an online
reservation system.  This is the first project I have used it for,
so I am grasping at some new concepts.  At one point in the
script, I execute a GET method and receive a few kilobytes from the
server.  Buried in there is an <INPUT TYPE="radio" NAME="xxx"
VALUE="yyy"> tag.  There may be more than one, but I have to respond
with my choice in the next POST method if I want to continue the
reservation process.  I thought the GetMethod class might have a
method that culls the server response for these gems and returns
a set of NameValuePair instances.  It would make building
the POST method a little easier for the newbies, but I may just be lazy.
How would you handle this?
Thanks for your response.
John

On Thursday, Apr 3, 2003, at 22:48 America/New_York, Adrian Sutton wrote:

> Hi John,
> I take it you wanted to get the name and value of all the headers
> returned in the response.  You can use the getResponseHeaders() method
> in any HttpMethod, which will return an array of Headers.
>
> Each header can contain multiple values, though, so you'd have to
> iterate over the headers and then over the values for each header.
> Something like:
>
> Header[] headers = method.getResponseHeaders();
> for (int i = 0; i < headers.length; i++) {
> 	String headerName = headers[i].getName();
> 	HeaderElement[] elements = headers[i].getValues();
> 	for (int j = 0; j < elements.length; j++) {
> 		HeaderElement el = elements[j];
> 		// At this stage you have the header and its value.
> 		// See below for information on some "funky" headers
> 	}
> }
>
> Some headers can contain multiple values within the header value; in
> particular, cookie headers do this.  You can repeat the pattern above to
> iterate over the parameters of the HeaderElement to get a name value
> pair of each element, if that's what you want.  It really depends on
> what level of detail you need to go to.
>
> Why were you wanting to do this?
>
> Adrian Sutton, Software Engineer
> Ephox Corporation
> www.ephox.com
>
>
> -----Original Message-----
> From: John Burke [mailto:johnburke@earthlink.net]
> Sent: Friday, 4 April 2003 1:17 PM
> To: commons-user@jakarta.apache.org
> Subject: [httpclient] Parse response for NameValuePairs?
>
>
> Hi, I've looked through the API docs but didn't find what I wanted.
> I'm wondering if there is a method that will find and return all
> NameValuePairs from a given response?  If not, can anyone please suggest
> a few lines of code?  Thanks.
> John
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-user-help@jakarta.apache.org
>
>
John Burke
Booz | Allen | Hamilton Inc.
(732) 935-5120
burke_john@bah.com



