tomcat-dev mailing list archives

From Rémy Maucherat <r...@apache.org>
Subject Re: Mime types
Date Wed, 31 Oct 2018 19:50:17 GMT
On Wed, Oct 31, 2018 at 8:16 PM Konstantin Kolinko <knst.kolinko@gmail.com>
wrote:

> On Wed, 31 Oct 2018 at 19:38, Rémy Maucherat <remm@apache.org> wrote:
> >
> > Hi,
> >
> > There are two main contraptions in Tomcat that do (badly ...) extension to
> > mime type mapping: the shared web.xml and some hardcoded stuff in
> > startup.Tomcat.
> >
> > While we should obviously have support for user configured mime types in
> > web.xml, as it's the spec, there should be a possibility to use
> > Files.probeContentType as the fallback when a mime type isn't found (and
> > maybe also have an option to disable it ? - although I don't quite see why
> > it would bother anyone). After looking at its implementation, it looks into
> > all mime type locations we might want (the OS, a mime.types file, etc). The
> > only problem is that it uses a Path (that would be an issue since it's
> > super tied to a real filesystem), but thankfully it mostly uses toString and
> > thus can be worked around using a new fake Path implementation.
> >
> > The code calling Files.probeContentType could be inserted here in
> > DefaultServlet:
> >         // Find content type.
> >         String contentType = resource.getMimeType();
> >         if (contentType == null) {
> >             contentType = getServletContext().getMimeType(resource.getName());
> > --->
> >             resource.setMimeType(contentType);
> >         }
> >
> > And then all the badly maintained content from web.xml and the Tomcat class
> > can be deleted.
> >
> > Comments ?
>
> 1. "badly maintained content from web.xml"
>
> Do not call them "bad".
>

Ok, but they didn't look too good. My mime.types has a lot more types for
starters, and it's so big I don't feel like adding all that.

>
> AFAIK, those are synchronized with httpd. IIRC there was a Python script
> to check the sync. Technically, it should be possible to sync with
> the IANA registry.
>
> (I do not remember the details - those should be easy to find in the
> archives of this mailing list. I just remember that the last time that
> the sync was checked, there was some good job done to automate and
> perform the check.)
>
> Who maintains the mappings used by Files.probeContentType and why do you
> think that those are maintained any better?
>

I remember we had a number of BZs asking to add or fix mime mappings.
Annoying.

>
> If an OS is an LTS one, are mime-mapping configurations in the OS
> updated as time goes on?
>

Well, ultimately the user can still add a new mapping in web.xml. Just the
basic usual ones don't need to be there.

>
> It should be possible to write a JUnit test to keep the mappings in
> startup.Tomcat.DEFAULT_MIME_MAPPINGS in sync with the default web.xml
> file.
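
A minimal sketch of such a sync check, in plain Java rather than JUnit so it stays self-contained. The class and method names (MimeMappingSync, parseWebXml, differences) are hypothetical, and it assumes the hardcoded mappings have already been converted to an extension-to-type Map, however that constant is actually laid out:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of Tomcat.
final class MimeMappingSync {

    // Collects every <mime-mapping> entry of a web.xml document as extension -> mime type.
    static Map<String, String> parseWebXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new TreeMap<>();
            NodeList mappings = doc.getElementsByTagName("mime-mapping");
            for (int i = 0; i < mappings.getLength(); i++) {
                Element mapping = (Element) mappings.item(i);
                result.put(text(mapping, "extension"), text(mapping, "mime-type"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse web.xml", e);
        }
    }

    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    // Returns one entry per extension that is missing on one side or mapped differently.
    static Map<String, String> differences(Map<String, String> webXml,
                                           Map<String, String> hardcoded) {
        Map<String, String> diff = new TreeMap<>();
        for (Map.Entry<String, String> e : webXml.entrySet()) {
            String other = hardcoded.get(e.getKey());
            if (!e.getValue().equals(other)) {
                diff.put(e.getKey(), "web.xml=" + e.getValue() + ", hardcoded=" + other);
            }
        }
        for (Map.Entry<String, String> e : hardcoded.entrySet()) {
            if (!webXml.containsKey(e.getKey())) {
                diff.put(e.getKey(), "web.xml=null, hardcoded=" + e.getValue());
            }
        }
        return diff;
    }
}
```

A test would then simply assert that differences(...) is empty and print the returned map on failure.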
>
> 2. web.xml is portable between operating systems.
>
> I would expect surprises from Files.probeContentType()
>
> Looking at the javadocs, FileTypeDetector is pluggable,
> and the default behaviour is OS-dependent:
>
>
> https://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html#probeContentType-java.nio.file.Path-
>
> https://docs.oracle.com/javase/8/docs/api/java/nio/file/spi/FileTypeDetector.html
>
> 3. My own story: I had to remove the default mime-type mapping for
> "gz" from conf/web.xml in my configurations:
>
> For filenames like "filename.foo.gz" it is "foo" part that determines
> the mime-type for me. This cannot be configured in web.xml. I use a
> filter (urlrewrite) to set content-type for requests to those files.
>
> Generally, configuring a Filter should have been enough. But there is
> a bug in the DefaultServlet: it does not respect the content-type
> that has already been set on the response and blindly overwrites it.
> Unless I remove the default mapping for "gz", the content-type value
> set by a filter is overwritten.
>
> Unless DefaultServlet behaviour is fixed, enabling probeContentType is
> likely to break my configurations.
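
The two rules described in this point can be sketched as plain functions. This is only an illustration under assumptions: ContentTypeChoice and its methods are hypothetical names, not the urlrewrite filter configuration and not DefaultServlet code.

```java
import java.util.Locale;

// Hypothetical illustration; neither method exists in Tomcat or urlrewrite.
final class ContentTypeChoice {

    // For "filename.foo.gz" the "foo" part decides; a bare "archive.gz" stays "gz".
    static String effectiveExtension(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".gz")) {
            String stem = name.substring(0, name.length() - ".gz".length());
            int dot = stem.lastIndexOf('.');
            if (dot >= 0 && dot < stem.length() - 1) {
                return stem.substring(dot + 1);
            }
            return "gz";
        }
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    // A content type already set on the response (e.g. by a filter) wins over the mapping.
    static String choose(String alreadySet, String mapped) {
        return alreadySet != null ? alreadySet : mapped;
    }
}
```

With a rule like choose(...) in place, a new fallback lookup would only ever fill in a content type that nobody has set yet.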
>
> 4. I see a similarity to the mod_mime_magic module of HTTPD.
>
> http://httpd.apache.org/docs/current/mod/mod_mime.html
> http://httpd.apache.org/docs/current/mod/mod_mime_magic.html
>
> (For some reason I thought that mod_mime_magic uses the magic file from
> the Unix OS. Actually it uses its own magic file from the HTTPD
> configuration, set by the "MimeMagicFile" directive.
> So it is actually portable.)
>
> MimeMagicFile directive is disabled by default.
>
> http://svn.apache.org/viewvc/httpd/httpd/trunk/docs/conf/httpd.conf.in?view=markup#l330
>
> Performance =?
> The documentation of mod_mime_magic says that performance is a concern
> for this module.
>

No idea really, probeContentType checks a number of sources [on Fedora it
is: new GnomeFileTypeDetector(), new
MimeTypesFileTypeDetector(userMimeTypes), new
MimeTypesFileTypeDetector(etcMimeTypes), new MagicFileTypeDetector()], and
it is platform dependent as you say. The result is then cached into the
resource in Tomcat, so it should be "ok" performance wise. The question is:
besides the default servlet not respecting a possibly set content-type (I
have not looked at it, and IMO it is a separate issue, no problem if you'd
like to fix it), is the Files.probeContentType result good enough on
Windows ?
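
For illustration, a minimal sketch of the fallback order with caching. MimeTypeResolver is a hypothetical class, not Tomcat code: it assumes the configured web.xml mappings are available as an extension-to-type Map, it builds the Path from the resource name with Paths.get instead of a fake Path implementation, and it caches per name in a map where Tomcat would cache in the resource.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration only; not part of Tomcat.
final class MimeTypeResolver {

    private final Map<String, String> configured;
    // Probed results are cached, including "nothing found", so each name is probed once.
    private final Map<String, Optional<String>> probed = new ConcurrentHashMap<>();

    MimeTypeResolver(Map<String, String> configured) {
        this.configured = configured;
    }

    // Configured mappings win; Files.probeContentType is only the fallback.
    String resolve(String fileName) {
        String type = configured.get(extensionOf(fileName));
        if (type != null) {
            return type;
        }
        return probed.computeIfAbsent(fileName, n -> Optional.ofNullable(probe(n)))
                .orElse(null);
    }

    private static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // The result is platform dependent; any failure is treated as "no type known".
    private static String probe(String fileName) {
        try {
            return Files.probeContentType(Paths.get(fileName));
        } catch (IOException | InvalidPathException | SecurityException e) {
            return null;
        }
    }
}
```

What resolve(...) returns for an unconfigured extension depends on the detectors installed on the platform, which is exactly the open question for Windows.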

Given the JDK code, I think it is rather pointless to try to keep up, they
seem to do it better [on my Fedora].

Now if you really don't want it, I don't mind, just say it ;) The
refactorings I'm doing right now are absolutely not an employer request; it
should be obvious they're busy doing other stuff at the moment. So I'm just
looking at stuff, and at the moment I'm busy "improving" embedding.

Rémy
