cocoon-dev mailing list archives

From "Robin Green" <gree...@hotmail.com>
Subject Re: Cache issues
Date Wed, 10 Jan 2001 16:28:09 GMT
wbayever@closerlook.com wrote:
>I posted the message listed below to cocoon-dev and got no response.

I missed that. Probably I was on holiday. Still, it's better to post to 
cocoon-dev again so we can all look at it.

(This is actually a very important issue for caching in Cocoon 2 - it's been 
mentioned once before but maybe forgotten!)

>Please let me know if req.getRequestURI() and req.getQueryString() are used
>for a reason; or if you believe that my fix is legit.

The reason is we simply did not anticipate this use of Cocoon within JSP. 
Most users now use Cocoon with XSP for dynamic content.

>Also, I am concerned about caching in Cocoon 1.8.1.  Since the HTTP
>headers are now included in the "cache string", I noticed that (at least on
>my setup) there is a JSESSIONID cookie being sent by the browser.

That indicates you're using sessions.

>Here is
>the string:
>
>+++++++++
>Mozilla/4.0 (compatible; MSIE 5.5; Windows NT
>5.0):GET:http://localhost:8080/Cocoon/Events.xml?2001&headers:Accept=image/
>gif, image/x-xbitmap, image/jpeg, image/pjpeg,
>application/vnd.ms-powerpoint, application/vnd.ms-excel,
>application/msword, application/x-comet, application/pdf,
>*/*Accept-Language=en-usAccept-Encoding=gzip, deflateUser-Agent=Mozilla/4.0
>(compatible; MSIE 5.5; Windows NT
>5.0)Host=localhost:8080Connection=Keep-AliveCookie=JSESSIONID=d0AmVhThQMVw8
>CKYwB
>+++++++++
>
>If this is true, every user who hits a given page will require the page
>to be regenerated.  I know that the fix was made to enable cookies and HTTP
>headers as parameters in XSL.  But, doesn't it defeat the purpose of a
>cache if no user can reuse a page generated for another user?

Yes, which is why I have implemented a header exclusion policy on the copy 
of Cocoon on my machine. However, I'm not convinced that the COOKIE header 
should be excluded. Think about it - if you are using sessions, you should 
be using them for a reason, and you probably don't want Jim's personal info 
to be served up to Joe.

But that applies to dynamic pages. I'm not familiar with cookies, but could 
someone enlighten us - are cookies sent for all pages if set for that site, 
even static pages, or does the server have to explicitly request a cookie? 
If the former, we might have to exclude cookies from caching by default.

>  IMHO, this
>should be a settable property in cocoon.properties.

I think actually it may depend on the page, which makes it even worse. If it 
is disabled by default, a workaround for dynamic session-aware pages would 
be just to avoid caching them - which is what most people do anyway, since 
it's the default!

>
>Thanks for your time,
>Wayne.
>
>
>Wayne Bayever
>01/02/2001 06:56 PM
>
>To:  cocoon-dev@xml.apache.org
>cc:
>
>
>Subject:  Cache fix for SSI and jsp:include
>
>Hi,
>
>I am working on a project that needs to include the output of an XML file
>inside a JSP.  In the JSP, I get the current date and pass it in as a
>parameter to XSL.  Here is a sample of the code:
>-----
><%
>GregorianCalendar c = new GregorianCalendar();
>int y = c.get(Calendar.YEAR);
>%>
>Event listings:
><jsp:include page="Events.xml" flush="true">
>      <jsp:param name="YR" value='<%=y%>'/>
></jsp:include>
>-----
>
>The XSL will filter the events based on the year passed in as the parameter
>YR.
>
>In order to test the page I changed the code to the following:
>-----
><%
>GregorianCalendar c = new GregorianCalendar();
>int y = c.get(Calendar.MINUTE)%3 + 2000;
>out.print(y);
>%>
>Event listings:
><jsp:include page="Events.xml" flush="true">
>      <jsp:param name="YR" value='<%=y%>'/>
></jsp:include>
>-----
>
>The output from the XML should change every minute based on the "fake" year
>that is passed in.  However, when tested, the "out.print(y)" was printing
>the correct "fake" year, but the output from the XML file didn't change.
>If I "touched" the XML file and then refreshed the browser, I got the
>correct output.
>
>So, somehow the page was getting cached and not recognizing that the
>parameter that was passed in had changed.
>
>I traced this problem to the following code:
>-- org/apache/cocoon/Utils.java----------
>     public static final String encode(...) {
>         StringBuffer url = new StringBuffer();
>         if (agent) {
>             url.append(req.getHeader("user-Agent"));
>             url.append(':');
>         }
>           url.append(req.getMethod());
>           url.append(':');
>         url.append(req.getScheme());
>         url.append("://");
>         url.append(req.getServerName());
>         url.append(':');
>         url.append(req.getServerPort());
>----->  url.append(req.getRequestURI());
>----->  if (query) {
>----->      url.append('?');
>----->      url.append(req.getQueryString());
>----->  }
>         return url.toString();
>     }
>--------------------
>When the XML page is included inside a JSP or via Apache's SSI,
>req.getRequestURI() and req.getQueryString() return the URI and QueryString
>of the "including" JSP or SHTML page NOT the included XML page.
>Since the "including" page URI and QueryString do not change from one
>minute to the next, Cocoon was retrieving the old page from the cache.

Now, the servlet specification is not the world's best example of clarity 
and exactitude, I have to admit, but even so it seems to me that it is a 
rather large bug in those JSP and SSI implementations if they hand Cocoon a 
ServletRequest object that returns contradictory information depending on 
whether you ask for parameters or the query string. I think this needs 
clarifying before we go any further.
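
For reference, the Servlet 2.2 specification defines request attributes such as `javax.servlet.include.request_uri` and `javax.servlet.include.query_string` that describe the included resource during a `RequestDispatcher` include. The sketch below shows one way a cache key could prefer those attributes and fall back to the request's own values. It is a sketch under assumptions, not Cocoon code: `RequestView` is a hypothetical stand-in for the few `HttpServletRequest` methods used, `IncludeAwareKey` and `resourcePart` are invented names, and whether a given container (or Apache SSI, which is outside the servlet API) actually sets these attributes would need to be checked.

```java
// Hypothetical sketch, not Cocoon code. RequestView stands in for the few
// HttpServletRequest methods used here so the example compiles on its own.
interface RequestView {
    Object getAttribute(String name);
    String getRequestURI();
    String getQueryString();
}

final class IncludeAwareKey {

    // Attribute names defined by the servlet specification for included requests.
    static final String INCLUDE_URI = "javax.servlet.include.request_uri";
    static final String INCLUDE_QUERY = "javax.servlet.include.query_string";

    // Prefer the included resource's URI and query string; fall back to the
    // request's own values when no include attributes are present.
    static String resourcePart(RequestView req) {
        String uri = (String) req.getAttribute(INCLUDE_URI);
        String query;
        if (uri != null) {
            query = (String) req.getAttribute(INCLUDE_QUERY);
        } else {
            uri = req.getRequestURI();
            query = req.getQueryString();
        }
        return (query == null) ? uri : uri + '?' + query;
    }
}
```

If the attributes are absent, the result is the same as with the current `getRequestURI()` and `getQueryString()` calls, so the fallback should not change behaviour for ordinary, non-included requests.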

>
>(About a year ago someone else posted a similar problem, where he was
>including 2 XML pages in one SHTML page and Cocoon was returning the same
>page twice.  This can be traced to the same problem)
>
>Here is the solution that I implemented.  I stole the code from two places.
>1.  To correct the URI problem, the following code could be used (stolen
>from Utils.getBaseName()):
>-----
>         //  Get the path of the real file
>         String path = (String)
>req.getAttribute("javax.servlet.include.servlet_path");
>         // otherwise, we find it out ourselves
>         if (path == null)
>             path = req.getServletPath();

Now, see, here is another contradictory piece of information. Is it documented 
anywhere which parts of the API refer to the subrequest and which refer to 
the parent request? You might find, hypothetically, that this only works on 
one or two particular implementations and breaks lots of other servlet 
engines - and that is not acceptable.

>
>         url.append(path);
>-----
>2. To correct the QueryString problem, the following code could be
>used (stolen from XSLTProcessor.filterParameters()):
>-----
>         if (query) {
>             url.append('?');
>             url.append(buildQueryString(req));
>         }
>  .
>  .
>  .
>   public static String buildQueryString(HttpServletRequest request) {
>     StringBuffer queryString = new StringBuffer();
>     Enumeration parameters = request.getParameterNames();
>
>     if (parameters != null) {
>       while (parameters.hasMoreElements()) {
>         String name = (String) parameters.nextElement();
>         StringCharacterIterator iter = new StringCharacterIterator(name);
>         boolean valid_name = true;
>         char c = iter.first();
>
>         if (!(Character.isLetter(c) || c == '_')) {
>           valid_name = false;
>         } else {
>           c = iter.next();
>         }
>
>         while (valid_name && c != iter.DONE) {
>           if (!(Character.isLetterOrDigit(c) ||
>             c == '-' ||
>             c == '_' ||
>             c == '.')) {
>               valid_name = false;
>             } else {
>               c = iter.next();
>             }
>           }
>
>           if (valid_name) {
>             queryString.append(URLEncoder.encode(name)
>+ "=" + URLEncoder.encode(request.getParameter(name)) + "&");
>           }
>         }
>       }
>       if(queryString.length() > 0)
>       {
>         queryString.setLength(queryString.length()-1);
>       }
>       return queryString.toString();
>     }
>-----
>
>This fix solves the problem, since now the URI and QueryString of the XML
>file gets used in the key in the caching scheme.  Also, if the same XML
>file is included in multiple JSPs or SHTML files, the correct cached page
>will be served.
>
>Obviously, duplicating the code in XSLTProcessor.filterParameters() and
>Utils.buildQueryString() is not optimal, so there is more work to be done.

Yes - moreover there is no need to escape the parameters, unlike in 
XSLTProcessor.
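
As a rough illustration of that point, the query part of a cache key could be built from the request parameters with neither URL-encoding nor the XSLT name filtering. The sketch below is not Cocoon code: `ParameterKeySketch` and `parameterPart` are hypothetical names, the parameters are passed as a plain map of names to values, and sorting the names and including every value of a multi-valued parameter are additions assumed here to keep the key stable.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Cocoon code: derives the query part of a cache
// key from request parameters, without URL-encoding and without the
// XSLT parameter-name filtering.
final class ParameterKeySketch {

    static String parameterPart(Map<String, String[]> parameters) {
        // TreeMap sorts the names, so the same parameter set always yields
        // the same key regardless of enumeration order.
        Map<String, String[]> sorted = new TreeMap<>(parameters);
        StringBuilder key = new StringBuilder();
        for (Map.Entry<String, String[]> e : sorted.entrySet()) {
            for (String value : e.getValue()) {
                if (key.length() > 0) {
                    key.append('&');
                }
                key.append(e.getKey()).append('=').append(value);
            }
        }
        return key.toString();
    }
}
```

Because the key is only compared for equality and never parsed, unescaped values are usually fine. A value that itself contains `&` or `=` could in principle produce the same key as a different parameter set; whether that matters depends on the application.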

>
>Please let me know if you have any comments.  Is this the correct approach?
>If so, can this change be included in the next version of Cocoon 1?
>If so, should I become an active developer and make the changes myself?

You have already made the changes on your copy. To make the changes in CVS 
you have to either get someone else to commit the changes, or become a 
committer, which you can only become when you have submitted a number of 
good patches (or we think you are a good coder for some other good reason).

But as I said, the API ambiguities need addressing first of all.