cocoon-dev mailing list archives

From Stefano Mazzocchi <>
Subject Re: [proposal] fixing the encoding problems
Date Mon, 17 Mar 2003 20:13:45 GMT
Pier Fumagalli wrote:
> On 16/3/03 20:04, "Stefano Mazzocchi" <> wrote:
>>>So, if you want to put encoding into sitemap... You will have to disable
>>>serializer configuration and request configuration and force sitemap
>>>encoding onto request / response. Is this what you are proposing?
>>please read my proposal again, I think it's pretty clear.
> Stefano, I believe your proposal got to the list chopped up big time,
> because what Vadim quoted is _ALL_ I've got as well, and really I don't
> understand what you want to do.

Uh, then sorry.

[big snip on well detailed encoding things]

> To rewrite what he said with the above mentioned three-layer encoding in
> mind:
> - the servlet container/mail engine/whatever will take care of the "Transfer
>   Encoding" (Cocoon as an application should not care nor interfere with
>   it).


> - ALL serializers should have the ability to deal with "Content Encoding",
>   unless (that would be my preferred option, as 90% of the times we think
>   about deploying things over servlets) we don't want to "recommend" the use
>   of "servlet filters" to do things such as GZIP encoding of the content.

In the past, I've been suggesting that people go down the servlet filter
path, but I'm more and more coming to think that servlet filters are
totally useless crap that can possibly work only for a few things and
are overdesigned for what they can do.

So, I'm all in favor of providing internal alternatives.

You suggest adding a property to the serializer, but I think this is
*NOT* a serializer's concern, but a higher level concern.

What about adding a 'content encoding' attribute to the 'pipeline' instead?

A pipeline provides a context of processing behavior. I think it fits 
perfectly with what we need and we don't even have to modify the 
serializers because all the stuff will be done by the pipeline engine 
that assembles the pipelines and creates the final response.
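
To make the idea concrete, here is a minimal sketch of how a pipeline engine could apply a content encoding without touching the serializers. Everything in it is an assumption for illustration: the class `ContentEncodingSketch`, the method `wrap`, and the attribute values "gzip" and "identity" are hypothetical names, not existing Cocoon API, and the code uses modern Java syntax.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, NOT part of Cocoon: shows the pipeline engine, rather
// than the serializer, deciding how the response bytes are content-encoded.
final class ContentEncodingSketch {

    // Wraps the response stream according to a (hypothetical) pipeline-level
    // 'content encoding' attribute. The serializer just writes to the result.
    static OutputStream wrap(OutputStream response, String contentEncoding) throws IOException {
        if (contentEncoding == null || contentEncoding.equalsIgnoreCase("identity")) {
            return response; // no content encoding requested
        }
        if (contentEncoding.equalsIgnoreCase("gzip")) {
            return new GZIPOutputStream(response);
        }
        throw new IllegalArgumentException("Unsupported content encoding: " + contentEncoding);
    }

    // Compresses text through wrap(...) and decompresses it again, to show
    // that the bytes written by a "serializer" survive the encoding step.
    static String gzipRoundTrip(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream out = wrap(sink, "gzip")) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            } // closing the stream finishes the GZIP trailer
            try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
                ByteArrayOutputStream plain = new ByteArrayOutputStream();
                byte[] buffer = new byte[1024];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    plain.write(buffer, 0, n);
                }
                return new String(plain.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the design is that the serializer only ever sees an OutputStream; whether that stream compresses its bytes is decided by whoever assembles the pipeline.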

> - TEXT-based serializers should think about "charset encoding" and are the
>   only ones which should do that.


> So, in my opinion, the "best" way to tackle the charset-encoding problem is
> to have the org.apache.cocoon.serialization.AbstractTextSerializer to
> receive an OutputStream from its implementation of the
> SitemapOutputComponent interface, but to expose to its solid implementations
> another couple of methods, instead of "getOutputStream":
> - String getCharsetEncoding() [or getCharacterEncoding]:
>     Returns the default character encoding configured for the specified
>     AbstractTextSerializer (or the default one for the sitemap if none
>     was specified).
>     This can be useful (for example) in the HtmlSerializer so that a new
>     <meta http-equiv="Content-Type" content="text/html; charset=???"/>
>     tag can be added automagically to the output, or to the "XMLSerializer"
>     so that the "<?xml version="1.0" encoding="???"?>" initial processing
>     instruction can be constructed appropriately.
> - Writer getWriter():
>     Returns a Writer that encodes character data to the response output
>     stream according to whatever is returned by getCharsetEncoding

Sounds good to me.
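
A rough sketch of what those two methods could look like, assuming the base class keeps the OutputStream to itself and hands subclasses only a Writer. The class names `TextSerializerSketch` and `XmlDeclarationSketch` are hypothetical stand-ins, not the real AbstractTextSerializer or XMLSerializer, and `setOutputStream` only loosely mirrors what SitemapOutputComponent provides.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for AbstractTextSerializer (not Cocoon's actual class).
abstract class TextSerializerSketch {
    private final String charset;
    private OutputStream out;
    private Writer writer;

    TextSerializerSketch(String charset) {
        this.charset = charset;
    }

    // The base class receives the raw stream; subclasses never see it.
    void setOutputStream(OutputStream out) {
        this.out = out;
        this.writer = null;
    }

    // The configured charset, e.g. for an XML declaration or an HTML meta tag.
    protected String getCharsetEncoding() {
        return charset;
    }

    // A Writer that encodes characters using getCharsetEncoding().
    protected Writer getWriter() throws IOException {
        if (writer == null) {
            writer = new OutputStreamWriter(out, charset);
        }
        return writer;
    }
}

// Hypothetical concrete serializer that only emits the XML declaration.
final class XmlDeclarationSketch extends TextSerializerSketch {
    XmlDeclarationSketch(String charset) {
        super(charset);
    }

    void writeDeclaration() throws IOException {
        Writer w = getWriter();
        w.write("<?xml version=\"1.0\" encoding=\"" + getCharsetEncoding() + "\"?>");
        w.flush();
    }

    // Convenience for trying the sketch: returns the encoded bytes.
    static byte[] render(String charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        XmlDeclarationSketch serializer = new XmlDeclarationSketch(charset);
        serializer.setOutputStream(sink);
        try {
            serializer.writeDeclaration();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

Because the declared encoding and the Writer come from the same field, the declaration written into the document cannot drift from the bytes actually produced.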

> Those two should be controlled from the sitemap by (as you, Stefano, said):
>>2) also, I want a way to override the sitemap-wide behavior of every
>>single serializer, locally, such as
>> <map:serialize encoding="UTF-8"/>
> The only "nitpick" I have is that since "encoding" means a lot of things,
> this should be called "charset" (which is way more specific)...

Very good point, I agree.

> This can be easily picked up by the AbstractTextSerializer.configure()
> method and returned by the two methods added above...
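
For what it's worth, the lookup a configure() method would need is small. The sketch below assumes a local `charset` attribute that overrides a sitemap-wide default; `CharsetResolutionSketch` and `resolve` are hypothetical names, and how the two values are actually read from the sitemap configuration is left out on purpose.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not Cocoon API: picks the charset a text serializer
// should use, preferring the local attribute over the sitemap-wide default.
final class CharsetResolutionSketch {

    static String resolve(String localCharset, String sitemapDefault) {
        String chosen = (localCharset != null && !localCharset.trim().isEmpty())
                ? localCharset.trim()
                : sitemapDefault;
        if (chosen == null) {
            throw new IllegalArgumentException("No charset configured");
        }
        // Charset.forName rejects illegal or unsupported names early and
        // name() returns the canonical spelling (e.g. "utf-8" -> "UTF-8").
        return Charset.forName(chosen).name();
    }
}
```

Failing at configuration time on an unknown charset name is a choice made here for the sketch; falling back to the default silently would be the other option.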


> I can work on a patch if you guys want... It's pretty trivial indeed...


