Mailing-List: contact users-help@cocoon.apache.org; run by ezmlm
Precedence: bulk
Reply-To: users@cocoon.apache.org
Received-SPF: pass (athena.apache.org: domain of Rainer.Pruy@acrys.com
 designates 212.222.64.34 as permitted sender)
Message-ID: <47BD53DC.3090609@acrys.com>
Date: Thu, 21 Feb 2008 11:35:08 +0100
From: Rainer Pruy <Rainer.Pruy@Acrys.COM>
Organization: Acrys Consult GmbH & Co. KG
User-Agent: Thunderbird 2.0.0.9 (X11/20071114)
MIME-Version: 1.0
To: users@cocoon.apache.org
Subject: Re: C2.2: Accessing form via servlet services?
References: <479901F8.9070308@acrys.com> <47990FDC.7060605@apache.org>
 <479A08EE.2080300@acrys.com> <479B81D3.4060407@apache.org>
 <479CA825.8020906@acrys.com> <479CD740.1080906@apache.org>
 <479DC4DA.5040604@acrys.com> <47BC81A3.5060804@tuffmail.com>
In-Reply-To: <47BC81A3.5060804@tuffmail.com>
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit

Hi Grzeg,
don't worry, my initial problem is long solved,
and it is important to prove from time to time that there is still life away from any keyboard....


Grzegorz Kossakowski wrote:
> Hi Rainer. I'm very sorry that I have not been responding for so much time but there were lots of
> issues lately. The last one I got injured while skiing in Austria and I had no motivation to touch
> computers for some time...
> 
> Before I start to address your specific points I would like to say that I have spoken with Reinhard
> and he told me that he liked ideas posted in this thread and he told me he was going to take
> proposed approach in his next work on top of 2.2. That gave me more conviction that I'm on right
> track giving you advice.
> 
> I hope that Reinhard could speak about his thoughts on the list himself, though... :)
> 
> Rainer Pruy wrote:
>> Agreed, that would better reflect its role in page composition.
>> However, a small but nevertheless important role is authentication and (global) authorization.
>> I have the firm conviction that it is bad design to rely on each subordinate block (the "A"s) to correctly implement the
>> authentication and authorization contract of the overall application, especially if we allow for such blocks to be contributed by
>> "third parties" not involved with putting up the application in the first place (e.g. customers).
> 
> I've been thinking about authentication and authorization as well. I agree it's bad design to rely
> on subordinate blocks implementing such an important piece of functionality. However, I think it's also
> bad design to mix content generation/aggregation with authentication and authorization. My idea is
> to create another sitemap (or maybe a whole block) just handling authorization. It would exploit all
> the power that comes from sitemap matching so you could create sophisticated patterns.
> 
> The idea is to make a request (having the same information as the request coming from the browser) to
> your special block and let it return only a status code:
>   a) 202 Accepted if the request should be dispatched to its target block
>   b) 401 Unauthorized if the user is not authorized
>   c) 403 Forbidden in case the user does not have the necessary karma
> 
> In cases b and c the original request would never reach its target block. This functionality could be
> easily implemented in a DispatcherServlet of the servlet service framework, for example by overriding its
> service() method. It would probably be a good idea to refactor the code in the DispatcherServlet class so it's
> easier for extending classes to plug in such special handlers.

I came to a similar conclusion, too.
The "main entry" block could implement authentication and authorization and delegate "valid" calls to any further blocks.
Those blocks could then be put up following the idea of delegating "common" aspects to a parent block they derive from and
thus spread out the whole application logic to different blocks.

Such a "main" block has the disadvantage of introducing an additional layer of blocks delegating calls. However, the mechanisms
are already available as of now. I have not considered modifying DispatcherServlet so far.
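
To make the status-code contract quoted above a bit more tangible, here is a minimal sketch in Python. It is not Cocoon code: `Request`, `authorize`, `dispatch` and the "/admin/" rule are all hypothetical names and assumptions for illustration only. It only shows the shape of the idea: ask an authorization step first, and call the target block only on 202.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Status codes taken from the proposal quoted above.
ACCEPTED, UNAUTHORIZED, FORBIDDEN = 202, 401, 403


@dataclass
class Request:
    """Hypothetical stand-in for the request coming from the browser."""
    path: str
    user: Optional[str] = None        # None means "not authenticated"
    roles: frozenset = frozenset()


def authorize(request: Request) -> int:
    """Hypothetical authorization block: it returns a status code only."""
    if request.user is None:
        return UNAUTHORIZED
    # Assumed example rule: everything below /admin/ needs the "admin" role.
    if request.path.startswith("/admin/") and "admin" not in request.roles:
        return FORBIDDEN
    return ACCEPTED


def dispatch(request: Request, blocks: Dict[str, Callable[[Request], str]]):
    """Ask the authorization block first; reach the target block only on 202."""
    status = authorize(request)
    if status != ACCEPTED:
        return status, None           # the target block is never called
    prefix = request.path.split("/")[1]   # "/shop/cart" -> "shop"
    handler = blocks.get(prefix)
    if handler is None:
        return 404, None
    return 200, handler(request)
```

With such a gate in front, a contributed ("foreign") block cannot skip authentication or authorization, because it is only ever called after the check has passed. It does not address the layout-compliance concern.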

What I still dislike about the approach of delegating control of the complete layout to each (functional) block is that there is
then no guarantee that later extensions will comply with the layout and structure of the page beyond their "own" part.
No one can ensure that a "foreign" crafted block will actually call all "common" parts from the parent.

In the end it would mean providing a "caller" block that ensures application integration compliance for any "foreign" block.
I don't think that really allows for "dynamic" extensibility.

Thus, the whole structure will only work for "trusted" environments.

> 
> I think it's quite obvious that this resembles the AOP functionality from Spring. I think such
> design would be very clear and would keep things like content authorization and content generation
> that are orthogonal separate.
> 
>> If there is no inversion of control, and passing (POST) requests resulting from submitting forms does not work transparently, then
>> there must be provisions in M to process forms from any subordinate block.
> 
> I don't get your point. Why do you want M to bother with a POST request that block A should handle directly?

This is under the given pre-conditions. Thus, if A is (for some reason) not eligible for handling its form directly and forwarding
form requests is not available either, then some other mechanism is needed.
> 
>> As it adds "single point of authentication/authorization" it is not redundant in the first place.
>> When just considering contribution to resulting page, you are right.
> 
> See above. M should not be involved in authentication/authorization because it's responsible for
> content generation only.
> 
>>> If you invert roles, then A block also uses delegation for generating common layout but this time
>>> there is only one procedure taking numerous parameters and the second, responsible for generation of
>>> common stuff, takes only a few of them or maybe even none. I see such situation as improvement and
>>> less complex code.
>> Yes, it will simplify code for M while adding a bit to each subordinate ("A").
> 
> Yep, but it's better to distribute the complexity among many blocks instead of creating one big
> beast that in the end must be tied to all of blocks because it must handle all edge cases.
> 
> Having one central point makes sense if the complexity can be reduced, but here we are talking
> only about complexity management and not reduction.

Yep, from a functional perspective I totally agree. However, with security (or "trust" in a broader sense, as above) having a central
point of control is important. And finally, it is a trade-off between added complexity and level of "trust".
And as such it is a decision validly left to the developer(s) of an application. (On the other hand, if there is a solution that
has less complexity while providing the same level of trust, then arguing against a solution and pointing to the "better" one is a must
from my point of view.)
> 
>>>> Responsibility of interactional behaviour for a certain region of the final page is delegated to block A (actually one of the forms
>>>> implemented there). Probably the use case is closer to portal cases than to normal "plain" interactions.
>>> Yes, this is about a portal case.
>> It will extend a bit beyond that if M had areas where more than one subordinate is to be displayed at a time within the
>> resulting page. In such a case there would be more changes required to both blocks involved, as only one of them can take over control.
>> (Imagine, both subordinates are going to show forms...)
> 
> Yep, but the data coming from the browser is interesting for only one block at a time. It's the block
> whose form has been submitted. Other blocks should redisplay their content without any extra
> processing because there was no data to process, right?

This is true, as long as the blocks are independent.
But just reconsider my example: one block decides the activation/use of others (menu / dispatcher). Here the menu block might
change (to reflect the "selected" entry) and the "called" block has to provide its content (to reflect its operation).
(This is not related to POSTing data, however. But I won't bet on there not being an equivalent case involving some POST in general.)
As long as the blocks that depend on input (browser call) are all reached by "super", such a solution will work. (My example does fit here.)
Nevertheless, we would then just require "blocks must be independent of each other" and "at most one block must respond to an
external call". (OK, granted, I did take it a bit to the extreme, for the sake of a cheap argument (;-))

> 
>>>> I'm looking at the whole structure like I would at an OO object instance. I do have an instance that has provisions of calling some
>>>> methods on local "data" instances (fields). The signature is quite clear. The actual implementation might cover a wider range of known
>>>> or not yet known behaviour. Nevertheless _control_ is delegated to such methods. Of course you could restructure the whole thing to
>>>> always call the instance data method and cause it to use some implementation parts of the (former) entry class making the former
>>>> instances derived classes in the new structure. This is what I did understand to be the equivalent of the solution you suggested. But
>>>> I doubt this will lead to a "natural" structure of responsibilities, especially if not all methods of the original instance data
>>>> classes should be exposed to (arbitrary) callers of M.
>>> I also try to think about whole problem in OO terms but apart from plain concepts of OO I try to
>>> take into account best practices like isolation.
>>>
>>> If you are programming in any OO language you probably consider global variables a bad practise as
>>> well as passing heavy amounts of data all over around. You strive for design where methods of
>>> objects are specialized, don't you?
>> I can not see where in the current structure we would have something resembling global variables.
> 
> The global variable here is the request coming from browser. Your original idea was to let every
> block access the whole request coming from browser. I think it's a bad idea and it resembles global
> variable to some extent.

Here, I think, is one of the basic differences in our perspectives. But, see below...
(In short: From a programming language perspective, you are right, but this is a different domain, here...)
> 
>> I'm just trying to apply some abstract concepts (from OO world in this case) to SSP.
>>
>> We are providing derivation and overriding semantics for "data".
>> A block currently exposes (some) pipelines to be "called" (used) from other blocks.
>> Any such pipeline will get called on the purpose of retrieving the "data" associated with said pipeline.
>> A contract will usually include basic semantics and format of such data.
>>
>> This will suffice for most cases. Nevertheless, as long as a pipeline is allowed to provide aspects of control (flow via map:call
>> (function or continuation)), the actual data is just a side effect (the activities carried out while retrieving the data result are a
>> side effect in the first place, but as the initial side effect is the main interest here, the view on things and attributions should be
>> changed).
>>
>> This is close to the difference on stateless and stateful classes with OO languages. (and please everyone, do not start a discussion
>> on whether stateless or stateful designs are superior).
>>
>> Current use of SSP supports "functional" calls, where the result just depends on parameters. The question effectively now is whether SSP
>> should/will/must support any stateful semantics that cocoon itself has provisions for.
> 
> Yup, you have got the point. My own opinion is that it would be best if SSF did not support stateful
> semantics, or at least it shouldn't be done in a way where you pass everything everywhere.

OK, this causes SSF not to provide the call semantics that are available when using a browser.
(It's different.)

> 
>>> The second one is the most important because it destroys the whole idea of your M block. You
>>> probably wanted to have one nice, universal pipeline delegating generation of some parts of your
>>> pages to subordinate blocks. However, if some of them require M block to pass some data of original
> request it becomes kind of troublesome because you need to dynamically choose if you need to pass
>>> data or not. I bet you (or one of your young developers) will end up with passing everything to
>>> every block. That would be a complete disaster and contradiction of ideas behind SSF.
>> I'm fully aware, that in practice implementors will choose to just pass the complete request (at least for avoiding to seriously
>> consider what really would be needed or really should be available with subordinate blocks). This was reasoning behind my remark for
>> probably short circuiting such special case to improve performance. But risk of misuse by programmers might not be sound ground for
>> rejecting a major concept.
> 
> Let me explain my hesitation on making SSF more sophisticated. We already have something that you
> ask for in Cocoon, it's cocoon: protocol. It provides access to the data coming from original
> request when you make internal pipeline calls (internal requests). I tell you, it gave people a lot
> of freedom but at the same time it allowed them to bake sitemap full of crap that I wouldn't call
> the best practice but it's my own opinion. A more objective part is the point of view of a Cocoon
> developer. I am one, and believe me, I become sick when I hear about cocoon: protocol because it (with
> its friend map:mount) is the main reason why we got to the point where almost nobody can maintain
> Cocoon's core because it is so complex. Yes, I'm speaking about implementation details but they are
> hell important to me and I think they should be important to the whole community because
> unmaintainable code will hurt everyone in the long term...
> 
> Personally, I don't believe that the functionality we are talking about can be implemented in a
> clean way leading to maintainable code. Moreover, I don't care that much when I can give you example
> of a much better architecture that Cocoon applications should take that doesn't require
> functionality of cocoon: protocol.
> 
> Anyway, I'm still open and willing to discuss. :-)
> 
>>> Imagine you have many, many roads crossing each other in one point. You can easily imagine that no
>>> matter how big a crossroads you build, with however many fancy spiral ramps, it's going
>>> to be a major bottleneck in the whole traffic. What I suggest is to build more, smaller crossroads
>>> that will distribute the overall hindrance among many different places.
>> You're absolutely right, M is a major bottleneck and that is on purpose. M in my case (the central crossing) is just the big toll
>> station (to keep the picture)
>>>> Thus, if the current SSP already is creating some kind of request, what does prevent it to allow for setting up a POST (or any other
>>>> request method besides GET). Ok, there is some need for a syntax to make clear what kind of request should be performed, while the
>>>> regular use case would prefer following the request method that triggered the subsequent call. This will also provide a clean
>>>> transitive semantics for internal resources: some are best exposed using GET, others require POST, or other methods. And it is up to
>>>> any caller to ensure correct usage.
>>> There is a subtle problem with this approach. You mentioned servlet service calls in your previous
>>> e-mails so you are aware of the fact that internally they use POST requests to pass XML data to be
>>> processed. What if someone stumbles upon the idea of creating a service that requires access to the
>>> original request? Sure, let's pass the POST data coming from the browser, but what the heck are we going to
>>> do with the XML data that needs to be passed as well?
>> Sorry, here I got lost.
>> Where does XML data come in here?
>> From my (probably simplistic) perspective, SSP is the method call runtime implementation for an object based structure based on cocoon
>> blocks. Calls may result from a "browser" or from other blocks (or probably from other sources, e.g. web services...).
>> Naively, I assume a block should be provided the very same "calling" possibilities as a browser would have (and currently does have,
>> obviously).
> 
> You should probably study this thread[1] and this issue[2]. Postable source is already implemented
> and used in 2.2's demos (for styling). Internally, postable source makes a POST request; you will
> find more details in the referenced thread. How would you handle a request coming from postable source
> when you would need to pass the original POST data coming from the browser as well?
> 
>>>> URLs being resolved to pipelines somehow resemble accessing fields in OO languages. It sounds quite unnatural to just invert
>>>> algorithmic control to cleanly get access to some (probably overloaded) fields when delegating control would be more appropriate?
>>>> (Leaving the question unanswered whether my initial use case would serve a good argument example here).
>>> It's more like making method calls than accessing fields. What about the argument of passing
>>> redundant parameters (data)?
>> Can you explain a bit more here on what you have in mind?
> 
> Still the same: passing the whole request to the callee is like passing parameters to methods that
> do not need them at all.
> 
> [1] http://thread.gmane.org/gmane.text.xml.cocoon.devel/67477/focus=67480
> [2] https://issues.apache.org/jira/browse/COCOON-2046
> 
> Best regards.
> 

I do understand your reluctance to add powerful but complex functionality that easily causes developers to get trapped in abuse
and misuse, or at least bad practice. I still do not have a good solution either. I am just trying to unify semantics, as I hope this might
lead to reduced complexity (at least with respect to usage) and easier understanding.

Let's start with the "global variables" aspect. This is, from my point of view, not so much a global-variable issue as a lack
of strict typing. This, however, comes quite naturally from the semantic environment: web applications (the ones using http and html
or the like, not talking about SOA or similar) are as of now not really typed. Users call entry points (URLs) with no or perhaps some
(request) parameters. We also have different "calling modes" (GET, POST, ...) that (at least for now) we will ignore (or consider
either as an implied parameter or a different method namespace, but the former will suffice...).

I still keep up my expectation that a block should not really depend on whether it is called "internally" or from the
"outside". Thus any (regular) call should look as if it had originated from a browser. In the end, this causes a *requirement* for
an untyped interface of a block.
On the other hand, it is quite helpful to programmers if it is known what (request) parameters can be provided and will be honored in
relation to a block or a specific URL. This will add some basic signature to block entry points (even helpful to "normal" users).
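
As a rough illustration of such a "basic signature", here is a small Python sketch (not Cocoon code). It assumes a hypothetical table `SIGNATURES` that declares which request parameters each entry point honors, and a hypothetical helper `servlet_url` that forwards only those. The `servlet:block:/path` URL shape, the block names and the paths are assumptions made up for the example.

```python
from urllib.parse import urlencode

# Hypothetical declaration: the request parameters each entry point honors.
SIGNATURES = {
    "/menu": ("selected",),
    "/search": ("q", "page"),
}


def servlet_url(block, path, request_params):
    """Build an internal URL, forwarding only the declared parameters."""
    declared = SIGNATURES.get(path, ())
    forwarded = [(name, request_params[name])
                 for name in declared if name in request_params]
    url = f"servlet:{block}:{path}"
    query = urlencode(forwarded)
    return f"{url}?{query}" if query else url


# Only "selected" is declared for /menu, so "secret" is not passed on.
print(servlet_url("nav", "/menu", {"selected": "home", "secret": "x"}))
```

The interface itself stays untyped (any parameter may arrive), but the declaration documents what will actually be looked at, and it could replace hand-written parameter lists on each internal call.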

"Calling" a URL usually does not depend on (or assume anything about) whether the block expects a GET or a POST call. Any checking and
validating of calls and parameters is up to the application. The only area that in current applications will depend on getting called
by POST is forms (at least for non-trivial field values; anything else could also result from a GET with request parameters?).

This is quite different from things around "postable source". If I got it correctly, this allows passing SAX events from a local
pipeline to be processed by a "remote" one. This is something a "normal" browser never will do. Thus postable source is a different
kind of interface a block may exhibit, and quite a different story here.

So we now have some different semantics for different call origins:

1. external (browser): GET and POST (POST mostly treated similarly to GET)
2. inter-block: POST (+ postable source); (is GET transformed to POST here?)
3. block local: same as 2. above

(This brings up an earlier topic of mine: SSF currently does not provide for distinguishing 1-3, preventing use of
"internal-only=true" pipelines with SSF)

Thus,
- I do not have any problem with passing any parameter to any block along SSF calls.
  It keeps the contract already available with calls from browser.
- I suspect there currently is a semantic difference between inter-block calls and calls from browsers
  that a developer has to cope with when implementing blocks (finally adding to the complexity of developing blocks).

Ideally, I wish for a clear interface of a block stating the data/methods/... available to external callers (public),
to other blocks (protected or default in Java terms, though not quite a match), and to local callers (private).
(And saving me as a developer the pain of adding endless lists of "&amp;PARAM={request-param:PARAM}" to any servlet: URL used within a
sitemap.)
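
The kind of interface I have in mind could be sketched like this, in Python and purely as an illustration (not Cocoon code). `Origin`, `VISIBILITY`, `may_call` and the three paths are hypothetical names; the only idea shown is that each pipeline declares the least trusted call origin it accepts, and the framework checks that before dispatching.

```python
from enum import IntEnum


class Origin(IntEnum):
    """The three call origins listed above, from least to most trusted."""
    EXTERNAL = 1       # call from a browser
    OTHER_BLOCK = 2    # call from another block
    LOCAL = 3          # call from within the same block


# Hypothetical visibility table: least trusted origin allowed per pipeline.
VISIBILITY = {
    "/page": Origin.EXTERNAL,          # "public"
    "/fragment": Origin.OTHER_BLOCK,   # roughly "protected"
    "/helper": Origin.LOCAL,           # "private", like an internal-only pipeline
}


def may_call(path, origin):
    """True if a caller of the given origin may reach the pipeline."""
    required = VISIBILITY.get(path)
    return required is not None and origin >= required
```

This presupposes that the framework can tell the three origins apart, which is exactly what seems to be missing at present.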

So far, not much of a solution, but lots of weird unfinished thoughts.

Regards,
Rainer

-- 
Rainer Pruy
Managing Director

Acrys Consult GmbH & Co. KG
Untermainkai 29-30, D-60329 Frankfurt, Germany
Phone: +49-69-244506-0 - Fax: +49-69-244506-50
Web: http://www.acrys.com -  Email: office@acrys.com
Registered: Frankfurt am Main, HRA 31151
