cocoon-dev mailing list archives

From Joerg Heinicke <jheini...@virbus.de>
Subject Re: Double URLdecoding problem with request.getSitemapURI()
Date Sat, 22 Nov 2003 02:28:59 GMT
Please add this to Bugzilla. Thanks for your effort in finding the 
origin of the bug.

Joerg

On 04.11.2003 16:41, Gunnar Brand wrote:

> Hello!
> 
> While creating an application that maps a complete path to a 
> resourcereader (to retrieve documents from a storage transparently), I 
> noticed a bug(?). Whenever the name/path of the file contained a '+' (of 
> course properly encoded as %2b), the file was not found. The storage 
> server echoes the looked-up path/file, and instead of the '+' there was a 
> ' ' ('+' is the URL-encoded stand-in for a space).
> 
> Since the %2b does work if I get parameters directly from the request, I 
> deduced that there must be some double URL decoding going on. After some 
> investigation it was clear that the incorrect URL was already being fed 
> into the reader and the wildcard matcher, so the second decoding had to 
> happen earlier.
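
The suspected double decoding can be reproduced in isolation. The following is only a minimal sketch, not Cocoon code: the class and method names are made up for illustration, and the "UTF-8" charset is an assumption (the servlet code quoted further down uses the one-argument URLDecoder.decode).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of Cocoon.
public class DoubleDecodeDemo {

    // Decode a URL-encoded string once.
    static String decodeOnce(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8"); // charset is an assumption
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Decode an already decoded string a second time.
    static String decodeTwice(String s) {
        return decodeOnce(decodeOnce(s));
    }

    public static void main(String[] args) {
        String encoded = "end%2B";
        // First pass: %2B becomes a literal '+'.
        System.out.println("[" + decodeOnce(encoded) + "]");  // prints [end+]
        // Second pass: the literal '+' is read as an encoded space.
        System.out.println("[" + decodeTwice(encoded) + "]"); // prints [end ]
    }
}
```

The second pass is what turns a correctly decoded '+' into a space, which matches the path echoed by the storage server.
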
> 
> After a quick modification in the samples sitemap (adding ** in front of 
> the match) and the RequestGenerator, I could use any path I wanted. The 
> generator displayed not only the request.getRequestURI() but also the 
> request.getSitemapURI().
> 
> RequestGenerator.java:
> this.attribute(attr,"target", request.getRequestURI());
> this.attribute(attr,"sitemaptarget", request.getSitemapURI());   // <-- added
> 
> With a URL like
> http://rei:8080/samples/a%20test%20dir%20with%20a%20plus%20at%20the%20end%2B/request.html?test=%20%2bx+y
> 
> it prints (shortened a bit):
> 
> <h:request
>   target="/samples/a%20test%20dir%20with%20a%20plus%20at%20the%20end%2B/request.html"
>   sitemaptarget="a test dir with a plus at the end /request.html"
>   source="">
> <h:requestParameters>
>   <h:parameter name="test">
>     <h:value> +x y</h:value>
>   </h:parameter>
> </h:requestParameters>
> </h:request>
> 
> So obviously the sitemap URI was decoded twice. The culprit seems to 
> be CocoonServlet.java, so I added a small debug output (the code below 
> is from CVS HEAD):
> 
>   public void service(HttpServletRequest req, HttpServletResponse res)
>     throws ServletException, IOException {
> 
>         // We got it... Process the request
>         String uri = request.getServletPath();
>     System.out.println("request.getServletPath():" + uri);  // added
>         if (uri == null) {
>             uri = "";
>         }
>         String pathInfo = request.getPathInfo();
> 
>         .....
> 
>         Environment env;
>         try{
>             if (uri.charAt(0) == '/') {
>                 uri = uri.substring(1);
>             }
>  >>> line 1087:
>             env = getEnvironment(URLDecoder.decode(uri), request, res);
>         } catch (Exception e) {
>         ...
> 
> The debug output is:
> request.getServletPath():/samples/a test dir with a plus at the end+/request.html
> 
> So the request.getServletPath() method returns a "url" that is already 
> properly decoded and that is being decoded for the second time in line 
> 1087. This is true for both Jetty and Tomcat4.1.
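
If the container really does hand over a decoded path, one possible change is to pass it on without decoding it again. The following is a rough sketch under that assumption, not the actual Cocoon fix: the class and the helper toSitemapUri are hypothetical names, and the servlet API is left out so the snippet runs on its own.

```java
// Hypothetical sketch, not part of Cocoon.
public class SitemapUriSketch {

    // Turn the value of getServletPath() into a sitemap URI.
    // Assumption: the container has already URL-decoded servletPath,
    // so only the leading slash is stripped and nothing is decoded again.
    static String toSitemapUri(String servletPath) {
        if (servletPath == null) {
            return "";
        }
        // startsWith is also safe for an empty string, unlike charAt(0).
        return servletPath.startsWith("/") ? servletPath.substring(1) : servletPath;
    }

    public static void main(String[] args) {
        String servletPath = "/samples/a test dir with a plus at the end+/request.html";
        // The '+' survives because no second decoding takes place.
        System.out.println(toSitemapUri(servletPath));
    }
}
```

Whether this is safe for every container depends on the answer to the question below, i.e. whether getServletPath() can ever return a string that is still encoded.
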
> 
> Unfortunately a look into the Servlet API does not indicate if 
> getServletPath is supposed to return a decoded or still URLencoded string.
> 
> 
> getServletPath()
> public java.lang.String getServletPath()
> Returns the part of this request's URL that calls the servlet. This
> includes either the servlet name or a path to the servlet, but does not
> include any extra path information or a query string.
> Same as the value of the CGI variable SCRIPT_NAME.
> 
> Returns: a String containing the name or path of the servlet being called,
>          as specified in the request URL
> 
> 
> The big question now is: is this a bug, or are there cases where this 
> method returns encoded strings?
> (To me it does look like a bug, and I need to remove the second decoding 
> to get my application working ;)
> 
> Gunnar.

