httpd-users mailing list archives

From André Warnier ...@ice-sa.com>
Subject Re: [users@httpd] Re: How to configure Apache 2 to compress xml files on serving?
Date Sat, 14 Jun 2008 11:59:38 GMT


Bo Berglund wrote:
> On Sat, 14 Jun 2008 09:32:26 +0200, André Warnier <aw@ice-sa.com>
> wrote:
> 
>>> HTTP/1.x 200 OK
>>> Date: Sat, 14 Jun 2008 06:33:12 GMT
>>> Server: Apache/2.0.53 (Fedora)
>>> Last-Modified: Thu, 12 Jun 2008 14:10:29 GMT
>>> Etag: "14fc-b9387f40"
>>> Accept-Ranges: bytes
>>> Content-Length: 5372
>>> Cache-Control: no-transform
>>> Keep-Alive: timeout=15, max=100
>>> Connection: Keep-Alive
>>> Content-Type: application/xml
>>> Content-Encoding: gzip
>>>
>>> ------------------ my test server  ------------------------
>>>
>>> HTTP/1.x 200 OK
>>> Date: Sat, 14 Jun 2008 06:34:38 GMT
>>> Server: Apache/2.0.54 (Win32) PHP/4.4.7
>>> Last-Modified: Thu, 12 Jun 2008 22:20:20 GMT
>>> Etag: "55084-14fc-91160693"
>>> Accept-Ranges: bytes
>>> Content-Length: 5372
>>> Keep-Alive: timeout=15, max=100
>>> Connection: Keep-Alive
>>> Content-Type: application/x-gzip
>>> Content-Encoding: gzip
>>> ------------------------------------------------------------------------------------
>>>
>>>
>>> In the server responses I see these differences:
>>>
>>> Cache-Control: no-transform  (not existing in test server)
>>> Content-Type: application/xml
>>>
>>> (test server has this instead:)
>>> Content-Type: application/x-gzip
>>>
>>> How is the tag "Content-Type" set in Apache?
>> Exactly.  Because in the second case, the browser gets 
>> "application/x-gzip" as the content-type, it thinks that what it has 
>> received is ok as is, and does not unzip it.
>> While in the first case, because it gets "application/xml", it "knows" 
>> that the content is really xml, and that it must unzip it first.
>>
>> So now we must find what, in the first server, sets the content-type 
>> that way.
>> One more question : on the first server, is the original file on disk 
>> already gzipped, or is it in xml (unzipped) format on the disk ?
>>
>> Since I don't have the configuration of the first server, I'm trying to 
>> guess what it exactly does before it sends out the response.  It could 
>> be taking an xml file, and gzipping it on-the-fly, before it sends it in 
>> the response.
>> Or else, it could be "cheating", taking the already gzipped file from 
>> disk, and sending it as is, but "falsifying the headers" to tell the 
>> browser to unzip it.
>> It may be as simple as adding (or replacing) some line
>> AddType application/xml .xml.gz
>>
> 
> I changed httpd.conf like this:
> 
> <Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
>     Options Indexes MultiViews Includes
>     AllowOverride None
>     Order allow,deny
>     Allow from all
>     AddType application/xml .xml.gz
>     AddEncoding gzip .gz
>     AddType text/xml .xml
>     AddType text/html .shtml
> </Directory>
> 
> 
> But FireFox still offers to save the file rather than decompressing
> and showing the xml like it does from the original server:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 10:39:58 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:19:12 GMT
> Etag: "5b091-13b0-8d04e669"
> Accept-Ranges: bytes
> Content-Length: 5040
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: application/x-gzip
> Content-Encoding: gzip
> ----------------------------------------------------------
> 
> With this change:
> <Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
>     Options Indexes MultiViews Includes
>     AllowOverride None
>     Order allow,deny
>     Allow from all
>     AddType application/xml .xml.gz
>     AddType text/xml .xml
>     AddType text/html .shtml
> </Directory>
> 

Add the following directive to the above section :
  AddEncoding x-gzip .gz

and try again

> 
> I get this instead:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 10:41:43 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:19:30 GMT
> Etag: "5b225-1277-8e1f670e"
> Accept-Ranges: bytes
> Content-Length: 4727
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: application/x-gzip
> ----------------------------------------------------------
> 
> With this in place I started looking elsewhere in httpd.conf and found
> this line, which I commented out:
> 
> AddType application/x-gzip .gz .tgz
> 
> 
> What happened now is that FireFox displays an error message:
> 
> XML Parsing Error: not well-formed
> Location: http://polaris/xmltv/svt1.svt.se_2008-06-15.xml.gz
> Line Number 1, Column 1
> 
> and the headers now are:
> 
> HTTP/1.x 200 OK
> Date: Sat, 14 Jun 2008 10:48:07 GMT
> Server: Apache/2.0.54 (Win32) PHP/4.4.7
> Last-Modified: Thu, 12 Jun 2008 22:18:36 GMT
> Etag: "5ae5a-169d-8aea1e6f"
> Accept-Ranges: bytes
> Content-Length: 5789
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: text/xml
> ----------------------------------------------------------
> 
> Probably now FireFox does not realize that the data are gzipped
> anymore and tries to parse the binary compressed stream, which
> obviously fails...

Yes.  The server tells Firefox that the document is "text/xml", sends no 
Content-Encoding header, and Firefox believes it.  That is the right 
thing for Firefox to do, according to the relevant Internet RFCs.
(Unfortunately, that is not what IE does, but that is a whole separate 
story, which I hope we don't have to get into.)

> Have to re-enable this directive...

No.  Leave this one commented out :
 > AddType application/x-gzip .gz .tgz

But add what I indicated above to your Directory section :
  AddEncoding x-gzip .gz

Note : I am also "fishing" to find the right settings.
But you have to do this systematically, without losing track of what 
you add and remove, otherwise we will no longer know which change had 
which effect.
The important part is what the server sends as headers with the HTTP 
response.
We must get to a situation where it sends :
Content-Type: text/xml  (or application/xml ?)
Content-Encoding: gzip  (or x-gzip ?)

So that Firefox knows that it is XML, but that it is gzipped.
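
To make the target concrete, here is a minimal sketch of the Directory 
section that I would expect to produce those two headers.  It is a guess 
under stated assumptions, not a tested configuration: it assumes mod_mime 
is loaded, that the server-wide "AddType application/x-gzip .gz .tgz" line 
stays commented out, and that mod_mime looks at each file extension 
separately (so that ".xml" gives the type and ".gz" gives the encoding). 
On that last assumption I have left out "AddType application/xml .xml.gz", 
since a two-part extension would then never match anything.

```apache
# Sketch only, not verified on the test server.
# Assumes the global "AddType application/x-gzip .gz .tgz" is commented out.
<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    # The .xml part of "name.xml.gz" should set the Content-Type
    AddType text/xml .xml
    # The .gz part should only mark the body as gzip-encoded
    AddEncoding x-gzip .gz
    AddType text/html .shtml
</Directory>
```

After restarting Apache and requesting one of the .xml.gz files again, the 
thing to check is whether the response now carries both a Content-Type of 
text/xml and a Content-Encoding of gzip (or x-gzip).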

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org

