tomcat-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Konstantin Kolinko <knst.koli...@gmail.com>
Subject Re: Possible optimization for class loading?
Date Sun, 17 Nov 2019 16:21:06 GMT
вс, 17 нояб. 2019 г. в 18:40, Rainer Jung <rainer.jung@kippdata.de>:
>
> I noticed a slowness during Webapp startup on the Winows platform.
> Investigation showed, that about three seconds of the webapp startup
> time - roughly 1/3 of the total time - was due to
> File.getCanonicalPath(). That call seems to be especially expensive on
> Windows, because it iterates over each component of the path in question
> and does at least three system calls on each of them. So if you app is
> installed in a somewhat deeply nested path, getCanonicalPath() gets
> especially expensive.
>

Just as a reminder:
getCanonicalPath() is necessary to deal with file system on Windows
being case-insensitive. (Removing those calls is likely to lead to
exploitable security issues.) On case-sensitive filesystems one can
configure <Resources allowLinking="true"/> in Tomcat 8.5/9.0 and some
of those checks will be disabled.

http://tomcat.apache.org/tomcat-8.5-doc/config/resources.html

In Tomcat 7.0 the same configuration flag exists on Context,

http://tomcat.apache.org/tomcat-7.0-doc/config/context.html

> One can check this with a simple test app and running the ProcessMonitpr
> from SysInternals, which can nicely trace file system operations.
>
> I noticed that by far the most occurrences happened during webapp class
> loading. StandardRoot initializes resources to contain a DirResourceSet
> first and then a JarResourceSet per WEB-INF/lib jar file.
>
> Now any class loading (of not yet loaded classes) goes through the
> following stack (this is for TC 8.5):
>
> ...
> java.base@13.0.1/java.io.File.getCanonicalPath(File.java:618)
> org.apache.catalina.webresources.AbstractFileResourceSet.file(AbstractFileResourceSet.java:90)
> org.apache.catalina.webresources.DirResourceSet.getResource(DirResourceSet.java:101)
> org.apache.catalina.webresources.StandardRoot.getResourceInternal(StandardRoot.java:281)
> org.apache.catalina.webresources.Cache.getResource(Cache.java:62)
> org.apache.catalina.webresources.StandardRoot.getResource(StandardRoot.java:216)
> org.apache.catalina.webresources.StandardRoot.getClassLoaderResource(StandardRoot.java:225)
> org.apache.catalina.loader.WebappClassLoaderBase.findClassInternal(WebappClassLoaderBase.java:2279)
> org.apache.catalina.loader.WebappClassLoaderBase.findClass(WebappClassLoaderBase.java:857)
> org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329)
> - locked (a java.lang.Object@1797c4d) index 11 frame
> org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1329)
> org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1182)
> ...
>
> In our case, the webapp does not have a WEB-INF/classes directory.
> Nevertheless TC always goes through the DirResourceSet running the
> expensive getCanonicalPath(), returning an EmptyResource and then
> looking for the class and finding it in one of the jar file resources.
> This happens about 5000 times (due to the number of classes to load).
>
> I know that we have to do this in general, because WEB-INF/classes have
> precedence over WEB-INF/lib, but maybe we could optimize in the case we
> are looking for a class but WEB-INF/classes does not even exist? Would
> this be a reasonable general optimization?

There are two scenarios:
(a) changes are expected: the web application classes can be updated
at runtime (with new classes being added),
(b) no changes: the web application classes are not updated at runtime.

(b) is certainly the case when the application runs in production mode
(when JSPs cannot be updated either)

(a) may be useful in some scenarios. Maybe in production if some
classes are added to a web application at run time as plugins.

I think that (b) is the case even when running in development mode, as
we are talking about classes, not JSPs here. IIRC Eclipse IDE restarts
a web application when its classes are updated and recompiled by the
IDE. I think that (a) is not used in development.
What do others think about it?
I remember some questions about hot-swapping of classes (for sake of
debugging), but Tomcat does not have such feature yet.

I think we can assume (b) by default, but have a flag to specify that
it is really (a).

If it is the (b) case, a single call to File.exists() /
File.isDirectory() is sufficient to remember "the classes directory
does not exist" and stop further lookups.

If if is the (a) case, a call to call File.exists() /
File.isDirectory() can be added as an optimization before calling
getCanonicalPath().

I think that we cannot remember "the classes directory does not exist"
fact without having a flag to disable such feature.


Looking into AbstractFileResourceSet.file() from the above stacktrace,
some check are already there, e.g. "mustExist" flag. I see that
DirResourceSet.getResource(DirResourceSet.java:101) calls file(...,
false), so "mustExist" is false. It is just a first glance, I have not
dug further yet.

Best regards,
Konstantin Kolinko

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Mime
View raw message