commons-user mailing list archives

From Adrian Sutton <>
Subject RE: [httpclient] URL encoding
Date Wed, 28 May 2003 06:03:14 GMT
The rationale is that we're almost guaranteed to get the encoding wrong.
Essentially the encoding process normalizes the URL so that it can be
interpreted, which is why you have to encode the URL at all.  To illustrate
the point, try to correctly encode something like:


there's no way to tell if the ? and = signs are part of the path or the
query.  The same problem applies when there's only one ? because you'd
normally assume it should mark the start of the query but that would then
make it impossible to send a request to a path that contains a ? character.
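
The ambiguity can be seen with the JDK's own java.net.URI, used here only
as a stand-in (it is not part of HttpClient, and the host and paths are made
up for illustration). In this sketch the parser treats the first ? as the
start of the query, so a literal ? in the path only survives if the caller
has already encoded it as %3F:

```java
import java.net.URI;

public class QuestionMarkAmbiguity {

    /** Path component as the parser sees it, after percent-decoding. */
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    /** Query component as the parser sees it, or null if there is none. */
    static String queryOf(String uri) {
        return URI.create(uri).getQuery();
    }

    public static void main(String[] args) {
        // Hypothetical URL: is the first '?' part of the path or the query?
        String ambiguous = "http://example.com/a?b?c=d";
        System.out.println(pathOf(ambiguous));  // /a
        System.out.println(queryOf(ambiguous)); // b?c=d

        // Same URL with the caller stating the intent: '?' belongs to the path.
        String encoded = "http://example.com/a%3Fb?c=d";
        System.out.println(pathOf(encoded));    // /a?b
        System.out.println(queryOf(encoded));   // c=d
    }
}
```

Only the caller knows which of the two readings was meant, which is why the
encoding has to happen before the string is handed over.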

There's probably a way to get around those problems and intelligently encode
but it was decided to keep it simple and obvious or we'd wind up spending
too much time trying to explain the odd behaviour.

We do, of course, provide the URLUtils class (I think the name is right)
which will encode the URL for you.
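
As a rough idea of what encoding the URL yourself looks like, here is a
sketch using java.net.URI from the JDK rather than HttpClient's own utility
class (whose exact name and methods are not confirmed in this thread). The
class and the helper name encode are made up for the example; the host and
path are taken from Tracy's test case further down. The multi-argument URI
constructor quotes characters that are illegal in a path, such as spaces:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class EncodeOnce {

    /**
     * Hypothetical helper: builds an http URI from unencoded parts and lets
     * the multi-argument java.net.URI constructor quote illegal characters.
     */
    static String encode(String host, int port, String unencodedPath) {
        try {
            return new URI("http", null, host, port, unencodedPath, null, null)
                    .toASCIIString();
        } catch (URISyntaxException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The space in the path becomes %20; the ':' is legal in a path and is kept.
        System.out.println(
                encode("mail2", 80, "/exchange/tboehrer/Inbox/RE: Test Message.EML"));
    }
}
```

The resulting string is what would then be passed to the GetMethod
constructor, exactly once and without further escaping.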

Glad to hear you're not seeing the encoding problem anymore.  Do let us know
if it resurfaces so we can fix it - I never have managed to reproduce it.


Adrian Sutton, Software Engineer
Ephox Corporation 

-----Original Message-----
From: Tracy Boehrer []
Sent: Wednesday, 28 May 2003 2:47 PM
To: Jakarta Commons Users List; Jakarta Commons Users List
Subject: RE: [httpclient] URL encoding

OK.  I just tried the source drop for 5/27 and it seems to work just fine.
Just to make sure I wasn't crazy, I retried the 5/22 drop and still had the
same problem.
Just out of curiosity, what is the rationale for requiring the caller to
encode a URL (as opposed to HttpClient encoding it for them)?
Thanks for your help...

	-----Original Message----- 
	From: Adrian Sutton [] 
	Sent: Tue 5/27/2003 6:18 PM 
	To: 'Jakarta Commons Users List' 
	Subject: RE: [httpclient] URL encoding

	Hi Tracy, 
	There is something very strange going on here, I don't think you're
	using the HttpClient you think you are.  When you first posted, I used
	code almost identical to what you're using to test it and had no
	problems.  I've tested again with what will become 2.0-beta1 when the
	release manager gets back and there is still no problem using the code
	you posted. 

	Where are you seeing the doubly escaped URL come out?  I'm looking at
	the wire trace to see exactly what is sent to the server and it's sent
	to the server exactly how I passed it into the GetMethod.  Can you
	provide the wire trace of your attempts?  Instructions for doing so can
	be found at 

	I must admit the getEscapedURI() and getEscapedQuery() calls look
	suspicious, but unless I can reproduce the problem I can't be sure
	that they should really be changed - there's some strange stuff
	in the URI classes at times and I don't understand it all. :) 

	Adrian Sutton, Software Engineer 
	Ephox Corporation 

	-----Original Message----- 
	From: Tracy Boehrer [] 
	Sent: Tuesday, 27 May 2003 11:54 PM 
	To: Jakarta Commons Users List 
	Subject: RE: [httpclient] URL encoding 


	        import java.util.*; 

	        import org.apache.commons.httpclient.*; 
	        import org.apache.commons.httpclient.methods.*; 

	        public class MyGetContents { 
	                public static void main(String args[]) throws Exception { 
	                        try { 
	                                HttpClient client = new HttpClient(); 
	                                // Unescaped URL: the path contains spaces 
	                                GetMethod gm = new GetMethod( 
	"http://mail2:80/exchange/tboehrer/Inbox/RE: Test Message.EML" ); 
	                                client.executeMethod( gm ); 
	                        } 
	                        catch( Exception e ) { 
	                                System.out.println( e ); 
	                        } 
	                } 
	        } 

	Since I don't have the URL escaped, I get the following exception: 

	java.lang.IllegalArgumentException: Invalid uri 
	'http://mail2:80/exchange/tboehrer/Inbox/RE: Test Message.EML':
	absolute path not valid 

	If I escape the URL to be: 
	then after "parsedURI.getEscapedPath()" (below), the resultant URL is 

	Just so that I could move on, I changed "parsedURI.getEscapedPath()" to
	"parsedURI.getPath()" and it seems to work fine. 

	-----Original Message----- 
	From: Adrian Sutton [] 
	Sent: Sunday, May 25, 2003 5:55 PM 
	To: 'Jakarta Commons Users List' 
	Subject: RE: [httpclient] URL encoding 

	Hi Tracy, 
	HttpClient should take only fully encoded URIs so the behaviour you
	describe would be a bug, however I can't reproduce the problem.  Could you
	send through a simple test case showing the problem or at least the
	code you're using that experiences the problem? 

	Thanks in advance, 

	Adrian Sutton, Software Engineer 
	Ephox Corporation 

	-----Original Message----- 
	From: Tracy Boehrer [] 
	Sent: Saturday, 24 May 2003 12:04 AM 
	Subject: [httpclient] URL encoding 

	Using the nightly source drop for 5/22, I am a bit confused as to what
	is required of a URL when constructing a HttpMethodBase.  If I supply
	an unescaped URL (that requires it), an exception is thrown.  If I
	escape the URL, then this class will escape it again at around line 277: 

	            // set the path, defaulting to root 
	                parsedURI.getPath() == null 
	                ? "/" 
	                : parsedURI.getEscapedPath() 
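
The double escaping described above can be reproduced in miniature with
java.net.URI from the JDK. This is only an analogy, not the HttpMethodBase
code: the multi-argument URI constructors always quote the % character, so
feeding one an already-escaped path escapes it a second time, which is the
effect an escaping step has when applied to input that was escaped by the
caller. The class name, the helper escapePath and the shortened path are
invented for the example:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DoubleEscape {

    /**
     * Hypothetical helper: escapes a path by passing it through the
     * multi-argument java.net.URI constructor, which quotes illegal
     * characters and always quotes '%'.
     */
    static String escapePath(String path) {
        try {
            return new URI("http", "mail2", path, null).getRawPath();
        } catch (URISyntaxException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String once = escapePath("/Inbox/RE: Test Message.EML");
        String twice = escapePath(once);
        System.out.println(once);  // /Inbox/RE:%20Test%20Message.EML
        System.out.println(twice); // /Inbox/RE:%2520Test%2520Message.EML
    }
}
```

A server that decodes the second form once sees a literal %20 in the file
name instead of a space, so the resource is not found.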
