hc-dev mailing list archives

From "Rob Tice" <rob.t...@k-int.com>
Subject RE: uri problems
Date Sat, 03 May 2003 12:20:20 GMT
I upgraded to the latest nightly build and the problem went away.

Cheers

Rob



-----Original Message-----
From: Michael Becke [mailto:becke@u.washington.edu] 
Sent: 01 May 2003 18:43
To: Commons HttpClient Project
Subject: Re: uri problems

Rob,

My example did what you are saying, minus the execute part.  I tried 
again, but executed the method before calling getURI().  Same results 
though.  It seems to be working just fine.

Mike

Rob Tice wrote:
> Hi Mike
> 
> The example that you have tried doesn't actually show my problem
> (probably because I didn't explain it properly :)).
> 
> So
> 
> I can create the method and execute it fine. But when I subsequently use
> a call to method.getURI() (after execution of the said method) it fails
> with an 'escaped query not valid' exception.
> 
> Regards
> 
> 
> Rob
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Michael Becke [mailto:becke@u.washington.edu] 
> Sent: 01 May 2003 14:48
> To: Commons HttpClient Project
> Subject: Re: uri problems
> 
> Rob,
> 
> I tried to reproduce this error but was not successful.  Here's what I
> tried:
> 
>      String[] uris = {
>          "http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN1&p=cd",
>          "http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN1&pt=ca",
>      };
> 
>      for (int i = 0; i < uris.length; i++) {
>          GetMethod get = new GetMethod(uris[i]);
>          URI uri = new URI(uris[i].toCharArray());
> 
>          System.out.println(uri.getHost());
>          System.out.println(uri.getPath());
>          System.out.println(uri.getQuery());
>          System.out.println(uri.getURI());
>          System.out.println(uri.getEscapedURI());
>          System.out.println(get.getURI());
>      }
> 
> And I received the following output:
> 
> www.nhs.uk
> /localnhsservices/gp/return_gp_surgery.asp
> pid=5MJ*M88035&sid=5MJ*B70 0HN1&p=cd
> http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70 0HN1&p=cd
> http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN1&p=cd
> http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%25200HN1&p=cd
> www.nhs.uk
> /localnhsservices/gp/return_gp_surgery.asp
> pid=5MJ*M88035&sid=5MJ*B70 0HN1&pt=ca
> http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70 0HN1&pt=ca
> http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN1&pt=ca
> http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%25200HN1&pt=ca
> 
> Have you tried this with the latest nightly build of HttpClient?
> 
> On a related note, I'm not sure that the HttpMethodBase(String)
> constructor is handling URIs correctly.  The Javadocs indicate that the
> given URI should already be escaped but the constructor uses the URI
> constructor for unescaped URIs.
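
A minimal sketch of the effect described above, written in plain Java with no HttpClient dependency. The class name, the `ALLOWED` set, and the `escape` helper are hypothetical and much simpler than the library's real tables; they only illustrate why escaping an already-escaped string turns `%20` into `%2520`, as in the last line of each output group.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

public class DoubleEscapeSketch {
    // Hypothetical, simplified set of characters left untouched by escape().
    // This is NOT HttpClient's real table; '%' is deliberately absent.
    public static final BitSet ALLOWED = new BitSet(256);
    static {
        for (char c = 'a'; c <= 'z'; c++) ALLOWED.set(c);
        for (char c = 'A'; c <= 'Z'; c++) ALLOWED.set(c);
        for (char c = '0'; c <= '9'; c++) ALLOWED.set(c);
        for (char c : "-_.!~*'()&=".toCharArray()) ALLOWED.set(c);
    }

    // Percent-encode every byte that is not in ALLOWED.
    public static String escape(String raw) {
        StringBuilder out = new StringBuilder();
        for (byte b : raw.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (ALLOWED.get(v)) {
                out.append((char) v);
            } else {
                out.append('%');
                out.append(Character.toUpperCase(Character.forDigit(v >> 4, 16)));
                out.append(Character.toUpperCase(Character.forDigit(v & 0xF, 16)));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input that is already escaped: the '%' itself gets escaped again.
        System.out.println(escape("sid=5MJ*B70%200HN1")); // prints sid=5MJ*B70%25200HN1
        // Input that is genuinely unescaped: the space becomes %20 once.
        System.out.println(escape("sid=5MJ*B70 0HN1"));   // prints sid=5MJ*B70%200HN1
    }
}
```

The point of the sketch is only that an "escape unescaped input" routine is not safe to apply to text that is already escaped.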
> 
> Mike
> 
> Rob Tice wrote:
> 
>>Hi Mike
>>
>>Anything like this causes the exception as shown
>>
>>http://www.nhs.uk/localnhsservices/gp/return_gp_surgery.asp?pid=5MJ*M88035&sid=5MJ*B70%200HN^1&p=cd
>>
>>
>>Cheers
>>
>>Rob
>>
>>
>>-----Original Message-----
>>From: Michael Becke [mailto:becke@u.washington.edu] 
>>Sent: 01 May 2003 13:19
>>To: Commons HttpClient Project
>>Subject: Re: uri problems
>>
>>Hi Rob,
>>
>>Sorry for the slow response.  Could you give some examples of valid  
>>URIs that do not work?
>>
>>Mike
>>
>>On Tuesday, April 29, 2003, at 07:00 PM, Rob Tice wrote:
>>
>>
>>
>>>Hi there
>>>
>>>
>>>
>>>I am using http client as the basis for analysis of a variety of web
>>>pages (and a vast number).
>>>
>>>
>>>
>>>I have come across several patterns which cause http client problems.
>>>
>>>
>>>
>>>Many of the pages I am analysing have spaces or '^' in the query part of
>>>the url. I have had to change the query bit set to reflect this as
>>>http-client was blowing up with the following.
>>>
>>>
>>>
>>>org.apache.commons.httpclient.URIException: escaped query not valid
>>>           at org.apache.commons.httpclient.URI.setRawQuery(URI.java:3201)
>>>           at org.apache.commons.httpclient.URI.setEscapedQuery(URI.java:3221)
>>>           at org.apache.commons.httpclient.HttpMethodBase.getURI(HttpMethodBase.java:337)
>>>           at com.k_int.OpenHarvest.robot.JHarvestRobot.processHttp(JHarvestRobot.java:408)
>>>           at com.k_int.OpenHarvest.robot.JHarvestRobot.processNext(JHarvestRobot.java:108)
>>>           at com.k_int.OpenHarvest.robot.JHarvestRobot.run(JHarvestRobot.java:725)
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>This is the change I made
>>>
>>>
>>>
>>>   //protected static final BitSet query = uric; this was the code
>>>
>>>   protected static final BitSet query = new BitSet(256); // changed rob
>>>   static
>>>   {
>>>     query.or(uric);
>>>     query.set('^');
>>>     query.set(0x20);
>>>   }
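
The change above can be illustrated with a standalone sketch using only `java.util.BitSet`. The `URIC` set here is a rough approximation based on RFC 2396 (alphanumerics, marks, reserved characters, and `%`), not the library's actual `uric` field, and `isValid` is a hypothetical helper standing in for whatever check the library performs before throwing "escaped query not valid".

```java
import java.util.BitSet;

public class QueryBitSetSketch {
    // Approximation of a "uric" set: alphanumerics, marks, reserved chars, '%'.
    // Assumption for illustration only; the library's own table may differ.
    public static final BitSet URIC = new BitSet(256);
    static {
        for (char c = 'a'; c <= 'z'; c++) URIC.set(c);
        for (char c = 'A'; c <= 'Z'; c++) URIC.set(c);
        for (char c = '0'; c <= '9'; c++) URIC.set(c);
        for (char c : "-_.!~*'();/?:@&=+$,%".toCharArray()) URIC.set(c);
    }

    // Same construction as the patch in the message: copy uric, then add '^' and space.
    public static final BitSet RELAXED_QUERY = new BitSet(256);
    static {
        RELAXED_QUERY.or(URIC);
        RELAXED_QUERY.set('^');
        RELAXED_QUERY.set(0x20);
    }

    // Hypothetical validity check: every character must be a member of the set.
    public static boolean isValid(String s, BitSet allowed) {
        for (int i = 0; i < s.length(); i++) {
            if (!allowed.get(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String query = "pid=5MJ*M88035&sid=5MJ*B70%200HN^1&p=cd";
        System.out.println(isValid(query, URIC));          // prints false ('^' is not in the set)
        System.out.println(isValid(query, RELAXED_QUERY)); // prints true
    }
}
```

Note that widening the set makes such queries pass validation but does not make them conform to RFC 2396; escaping '^' as `%5E` and space as `%20` before building the URI would be the stricter alternative.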
>>>
>>>
>>>
>>>Over to you guys :-) what do you want to do?
>>>
>>>
>>>
>>>
>>>
>>>Regards
>>>
>>>
>>>
>>>Rob Tice
>>>
>>>Rob.tice@k-int.com
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail:
>>commons-httpclient-dev-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail:
>>commons-httpclient-dev-help@jakarta.apache.org
> 






