From: Mark Thomas
Date: Fri, 13 Sep 2013 14:16:04 +0100
To: Tomcat Developers List
Subject: Problem with o.a.c.webresources and packed WARs

I started to look at Bug 52489 [1] which is requesting code signing support for WARs. I decided to work with an unpacked version of the examples application. During testing, I noticed that WEB-INF/classes and WEB-INF/lib were always being unpacked to the work directory. As well as looking to be unnecessary, always unpacking WEB-INF/classes and loading the classes from the work directory makes it possible to bypass any signature checks that are added to the WAR.
Therefore, I removed the code that extracted WEB-INF/classes and added a TODO to do the same for WEB-INF/lib. [2]

I wanted to be sure that there wasn't a good reason for extracting WEB-INF/classes, so I did some more testing. It was at this point that I discovered a much bigger issue. When packed, URLs to the JARs inside the WAR look like this:

jar:file:/D:/repos/asf-public/tomcat/trunk/output/build/webapps/examples.war!/WEB-INF/lib/standard.jar

If you obtain a JarFile from this URL, it is for examples.war rather than standard.jar. In short, jar URLs are not designed / intended / implemented to support access to JARs inside WARs (which are just JARs with a different name). This creates a number of issues internally within Tomcat, as URLs are often used to refer to resources.

I started to work around this by implementing a third version of o.a.t.util.scan.Jar - FileUrlNestedJar. This overcame the first few obstacles, but I quickly reached the point where it was going to be necessary to modify o.a.jasper.compiler.JarResource. It would be necessary to remove all the current methods on that interface and replace them with something that worked with these 'nested' JARs - probably something that provided access to the input stream.

Before starting on what looked to be a fair amount of work, I wanted to think about the problem more widely. At the back of my mind was the fact that I removed the Tomcat-specific jndi URL scheme that was previously used for accessing resources.

The Javadoc for ServletContext#getResource() [3] is clear. The container must provide a URL for every resource and that URL must be usable. There is an immediate issue with resources located inside the META-INF/resources directory of a JAR which is itself located inside a WAR when unpacking WARs is disabled. There is no support from the JRE for accessing those resources with a URL.
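To make the nested JAR behaviour concrete, here is a minimal, self-contained Java sketch. It is an illustration only, not Tomcat code: the class name NestedJarDemo, the temporary WAR, and the entries WEB-INF/lib/inner.jar and META-INF/resources/hello.txt are all made up for the example. It shows that a jar: URL pointing at a JAR inside a WAR hands back a JarFile for the WAR, and that the nested resource can still be reached by streaming the inner JAR out of the outer one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.JarURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

public class NestedJarDemo {

    /** Builds a throwaway WAR holding WEB-INF/lib/inner.jar (hypothetical names). */
    static Path buildWar() throws IOException {
        // Inner JAR with a single static resource.
        ByteArrayOutputStream innerBytes = new ByteArrayOutputStream();
        try (JarOutputStream inner = new JarOutputStream(innerBytes)) {
            inner.putNextEntry(new JarEntry("META-INF/resources/hello.txt"));
            inner.write("hello".getBytes(StandardCharsets.UTF_8));
            inner.closeEntry();
        }
        // Outer WAR that stores the inner JAR as an ordinary entry.
        Path war = Files.createTempFile("demo", ".war");
        try (OutputStream out = Files.newOutputStream(war);
             JarOutputStream outer = new JarOutputStream(out)) {
            outer.putNextEntry(new JarEntry("WEB-INF/lib/inner.jar"));
            outer.write(innerBytes.toByteArray());
            outer.closeEntry();
        }
        return war;
    }

    /** Returns the name of the file the JRE opens for a jar: URL to the nested JAR. */
    static String jarFileOpenedFor(Path war) throws IOException {
        URL url = new URL("jar:" + war.toUri() + "!/WEB-INF/lib/inner.jar");
        JarURLConnection conn = (JarURLConnection) url.openConnection();
        conn.setUseCaches(false); // uncached, so closing the JarFile below is safe
        try (JarFile jarFile = conn.getJarFile()) {
            return jarFile.getName(); // the WAR, not inner.jar
        }
    }

    /** Reads a resource from the nested JAR by streaming it out of the WAR. */
    static String readNested(Path war, String jarPath, String resource) throws IOException {
        try (JarFile outer = new JarFile(war.toFile())) {
            JarEntry jarEntry = outer.getJarEntry(jarPath);
            if (jarEntry == null) {
                return null;
            }
            try (InputStream raw = outer.getInputStream(jarEntry);
                 JarInputStream nested = new JarInputStream(raw)) {
                // A stream has no index, so walk the entries until the name matches.
                for (JarEntry e = nested.getNextJarEntry(); e != null; e = nested.getNextJarEntry()) {
                    if (e.getName().equals(resource)) {
                        return new String(nested.readAllBytes(), StandardCharsets.UTF_8);
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        Path war = buildWar();
        try {
            System.out.println("JarFile opened: " + jarFileOpenedFor(war));
            System.out.println("Nested resource: "
                    + readNested(war, "WEB-INF/lib/inner.jar", "META-INF/resources/hello.txt"));
        } finally {
            Files.deleteIfExists(war);
        }
    }
}
```

The stream-based reader works, but it yields an InputStream rather than a URL, which is why it does not by itself satisfy the ServletContext#getResource() contract.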
I see three possible solutions to this:

a) Unpack JAR resources to the work dir and return a URL to the unpacked location
b) Restore something like the jndi URL scheme (renamed to tomcat to reduce the likelihood of clashes)
c) Create a war URL scheme only for accessing resources inside WARs

a) Strikes me as a bit of a hack, and one that requires a writeable work directory, which might not always be available.

b) Would address a concern Konstantin raised with the new webresources: once a URL has been obtained, it bypasses the cache. On the down side, it is likely to be more work than c) to implement.

c) Would fix the issue at hand.

In trying to decide between b) and c), I came to the conclusion that it would be better to keep the war:// URL scheme and a generic tomcat:// URL scheme separate even if we went for option b). Therefore, I intend to go ahead with implementing c) and keep an eye on the users and dev lists to see if a case for implementing b) emerges.

Mark

[1] https://issues.apache.org/bugzilla/show_bug.cgi?id=52489
[2] http://svn.apache.org/r1522704
[3] http://docs.oracle.com/javaee/7/api/javax/servlet/ServletContext.html#getResource(java.lang.String)