hc-dev mailing list archives

From Michael Becke <be...@u.washington.edu>
Subject Re: uri problems
Date Thu, 01 May 2003 12:19:26 GMT
Hi Rob,

Sorry for the slow response.  Could you give some examples of valid  
URIs that do not work?

Mike

On Tuesday, April 29, 2003, at 07:00 PM, Rob Tice wrote:

> Hi there,
>
> I am using HttpClient as the basis for analysing a large number of
> varied web pages.
>
> I have come across several patterns that cause HttpClient problems.
>
> Many of the pages I am analysing have spaces or '^' in the query part
> of the URL. I have had to change the query bit set to allow for this, as
> HttpClient was blowing up with the following:
>
> org.apache.commons.httpclient.URIException: escaped query not valid
>     at org.apache.commons.httpclient.URI.setRawQuery(URI.java:3201)
>     at org.apache.commons.httpclient.URI.setEscapedQuery(URI.java:3221)
>     at org.apache.commons.httpclient.HttpMethodBase.getURI(HttpMethodBase.java:337)
>     at com.k_int.OpenHarvest.robot.JHarvestRobot.processHttp(JHarvestRobot.java:408)
>     at com.k_int.OpenHarvest.robot.JHarvestRobot.processNext(JHarvestRobot.java:108)
>     at com.k_int.OpenHarvest.robot.JHarvestRobot.run(JHarvestRobot.java:725)
>
> This is the change I made:
>
>     // protected static final BitSet query = uric; (this was the code)
>     protected static final BitSet query = new BitSet(256); // changed rob
>     static {
>         query.or(uric);
>         query.set('^');
>         query.set(0x20);
>     }
>
> Over to you guys :-) What do you want to do?
>
> Regards,
>
> Rob Tice
> Rob.tice@k-int.com
>
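
For reference, an alternative to widening the query BitSet inside URI.java is to percent-encode the offending characters before the URL ever reaches HttpClient. The following is only a minimal sketch under stated assumptions: it uses plain JDK classes, the names QueryEscaper and escapeQuery are hypothetical (they are not part of HttpClient), and it assumes any '%' already in the query starts a valid escape, so it passes '%' through untouched.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpClient: escapes a raw query string
// so that it contains only characters RFC 2396 allows in a query.
class QueryEscaper {

    // RFC 2396 unreserved and reserved characters, plus '%'.
    // Assumption: an existing '%' introduces a valid escape sequence.
    private static final String SAFE =
        "abcdefghijklmnopqrstuvwxyz"
        + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        + "0123456789"
        + "-_.!~*'()"      // mark
        + ";/?:@&=+$,"     // reserved
        + "%";

    static String escapeQuery(String rawQuery) {
        StringBuilder out = new StringBuilder();
        // Encode byte by byte so non-ASCII characters become UTF-8 escapes.
        for (byte b : rawQuery.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            if (c < 0x80 && SAFE.indexOf(c) >= 0) {
                out.append((char) c);
            } else {
                out.append('%');
                out.append(Character.toUpperCase(Character.forDigit(c >> 4, 16)));
                out.append(Character.toUpperCase(Character.forDigit(c & 0xF, 16)));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A space (0x20) and '^' (0x5E) are the two characters from the report.
        System.out.println(escapeQuery("q=a b^c")); // prints q=a%20b%5Ec
    }
}
```

Escaping on the caller's side keeps the library's character tables unchanged, at the cost of one extra pass over each query string. Whether that is preferable to accepting the raw characters in the BitSet is a judgment call for the list.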

