hc-dev mailing list archives

From "Rob Tice" <rob.t...@k-int.com>
Subject RE: uri problems
Date Thu, 01 May 2003 17:07:05 GMT
Hi Mike

The example that you have tried doesn't actually show my problem
(probably because I didn't explain it properly :)).

To restate: I can create the method and execute it fine. But when I
subsequently call method.getURI() (after executing that method) it fails
with an 'escaped query not valid' exception.
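
For anyone following the thread, the kind of check behind this exception can be pictured roughly as below. This is only a sketch under assumptions, not the HttpClient source: the class QueryCheckSketch and its methods are hypothetical names, and the default character list is an RFC 2396 style approximation of what the library's uric set holds. The relaxed set mirrors the change quoted further down in this thread (adding '^' and the space character).

```java
import java.util.BitSet;

// Hypothetical sketch; not the real org.apache.commons.httpclient.URI code.
final class QueryCheckSketch {

    // Approximation (assumption) of the characters allowed in an escaped query:
    // alphanumerics, RFC 2396 marks, reserved characters, and '%' for escapes.
    static BitSet defaultAllowed() {
        BitSet allowed = new BitSet(256);
        for (char c = 'a'; c <= 'z'; c++) allowed.set(c);
        for (char c = 'A'; c <= 'Z'; c++) allowed.set(c);
        for (char c = '0'; c <= '9'; c++) allowed.set(c);
        for (char c : "-_.!~*'();/?:@&=+$,%".toCharArray()) allowed.set(c);
        return allowed;
    }

    // Same set plus '^' and space, as in the change quoted in this thread.
    static BitSet relaxedAllowed() {
        BitSet allowed = new BitSet(256);
        allowed.or(defaultAllowed());
        allowed.set('^');
        allowed.set(0x20);
        return allowed;
    }

    // Returns true only if every character of the query is in the allowed set.
    static boolean isValid(String query, BitSet allowed) {
        for (int i = 0; i < query.length(); i++) {
            if (!allowed.get(query.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With these definitions a query containing '^' fails the default check and passes the relaxed one, which is the behaviour difference being discussed here.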

Regards


Rob





-----Original Message-----
From: Michael Becke [mailto:becke@u.washington.edu] 
Sent: 01 May 2003 14:48
To: Commons HttpClient Project
Subject: Re: uri problems

Rob,

I tried to reproduce this error but was not successful.  Here's what I 
tried:

     String[] uris = {
         "http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN1&p=cd",
         "http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN1&pt=ca",
     };

     for (int i = 0; i < uris.length; i++) {
         GetMethod get = new GetMethod(uris[i]);
         URI uri = new URI(uris[i].toCharArray());

         System.out.println(uri.getHost());
         System.out.println(uri.getPath());
         System.out.println(uri.getQuery());
         System.out.println(uri.getURI());
         System.out.println(uri.getEscapedURI());
         System.out.println(get.getURI());
     }

And I received the following output:

www.nhs.uk
/localnhsservices/gp/return_gp_surgery.asp
pid=5MJ*M88035&sid=5MJ*B70 0HN1&p=cd
http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70 0HN1&p=cd
http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN1&p=cd
http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%25200HN1&p=cd
www.nhs.uk
/localnhsservices/gp/return_gp_surgery.asp
pid=5MJ*M88035&sid=5MJ*B70 0HN1&pt=ca
http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70 0HN1&pt=ca
http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN1&pt=ca
http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%25200HN1&pt=ca

Have you tried this with the latest nightly build of HttpClient?

On a related note, I'm not sure that the HttpMethodBase(String) 
constructor is handling URIs correctly.  The Javadocs indicate that the 
given URI should already be escaped, but the constructor uses the URI 
constructor for unescaped URIs.
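
To illustrate the double escaping visible in the last line of each output block above (%20 turning into %2520), here is a small sketch. Note the assumption: it uses java.net.URI from the JDK, not HttpClient's own URI class, purely as an analogy. The JDK's multi-argument constructors always quote '%', which is what happens when an already escaped string is handed to an API that expects unescaped input. DoubleEscapeSketch and its method names are made up for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch using java.net.URI as a stand-in (assumption);
// it does not call any Commons HttpClient code.
final class DoubleEscapeSketch {
    static final String HOST = "www.nhs.uk";
    static final String PATH = "/localnhsservices/gp/return_gp_surgery.asp";

    // Parses the full string as already escaped: existing %xx sequences are kept.
    static String parseEscaped(String query) {
        return URI.create("http://" + HOST + PATH + "?" + query).toString();
    }

    // Builds from components treated as unescaped: '%' is quoted again as %25.
    static String buildFromUnescaped(String query) {
        try {
            return new URI("http", HOST, PATH, query, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String query = "pid=5MJ*M88035&sid=5MJ*B70%200HN1&p=cd";
        System.out.println(parseEscaped(query));       // keeps %20
        System.out.println(buildFromUnescaped(query)); // turns %20 into %2520
    }
}
```

If the HttpMethodBase(String) constructor behaves in a comparable way, that would explain the get.getURI() lines in the output, but that is a guess until someone checks the source.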

Mike

Rob Tice wrote:
> Hi Mike
> 
> Anything like this causes the exception as shown
> 
> http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN^1&p=cd
> 
> 
> Cheers
> 
> Rob
> 
> 
> -----Original Message-----
> From: Michael Becke [mailto:becke@u.washington.edu] 
> Sent: 01 May 2003 13:19
> To: Commons HttpClient Project
> Subject: Re: uri problems
> 
> Hi Rob,
> 
> Sorry for the slow response.  Could you give some examples of valid  
> URIs that do not work?
> 
> Mike
> 
> On Tuesday, April 29, 2003, at 07:00 PM, Rob Tice wrote:
> 
> 
>>Hi there
>>
>>I am using http client as the basis for analysis of a variety of web
>>pages (and a vast number).
>>
>>I have come across several patterns which cause http client problems.
>>
>>Many of the pages I am analysing have spaces or '^' in the query part of
>>the url. I have had to change the query bit set to reflect this as
>>http-client was blowing up with the following.
>>
>>org.apache.commons.httpclient.URIException: escaped query not valid
>>            at org.apache.commons.httpclient.URI.setRawQuery(URI.java:3201)
>>            at org.apache.commons.httpclient.URI.setEscapedQuery(URI.java:3221)
>>            at org.apache.commons.httpclient.HttpMethodBase.getURI(HttpMethodBase.java:337)
>>            at com.k_int.OpenHarvest.robot.JHarvestRobot.processHttp(JHarvestRobot.java:408)
>>            at com.k_int.OpenHarvest.robot.JHarvestRobot.processNext(JHarvestRobot.java:108)
>>            at com.k_int.OpenHarvest.robot.JHarvestRobot.run(JHarvestRobot.java:725)
>>
>>This is the change I made
>>
>>    //protected static final BitSet query = uric; this was the code
>>
>>    protected static final BitSet query = new BitSet(256); // changed rob
>>    static
>>    {
>>      query.or(uric);
>>      query.set('^');
>>      query.set(0x20);
>>    }
>>
>>Over to you guys :-) what do you want to do?
>>
>>Regards
>>
>>Rob Tice
>>Rob.tice@k-int.com
>>
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> commons-httpclient-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail:
> commons-httpclient-dev-help@jakarta.apache.org
> 

