cocoon-dev mailing list archives

From Johan Stuyts <jo...@hippo.nl>
Subject Re: Experience with workflow at Hippo Webworks
Date Mon, 08 Mar 2004 11:03:14 GMT
On Mon, 08 Mar 2004 11:41:12 +0100, Unico Hommes <unico@hippo.nl> wrote:

> Johan Stuyts wrote:
>
>> On Fri, 05 Mar 2004 17:18:53 -0500, Stefano Mazzocchi 
>> <stefano@apache.org> wrote:
>>
>>> Johan Stuyts wrote:
>>>
>>>> Experience with workflow at Hippo Webworks
>>>> ==========================================
>>>>
>>>> At Hippo we used OSWorkflow to implement a workflow solution in a
>>>> demo. Below are our experiences.
>>>>
>>>> As people with different levels of experience are interested in
>>>> workflow, I will start with a (very) brief introduction to workflow.
>>>>
>>>> Workflow introduction
>>>> ---------------------
>>>> Very simply put, workflow serves two purposes:
>>>> - to determine who can do what at which time with an object;
>>>> - to generate a list of pending tasks for users.
>>>>
>>>> An example of the first is that an editor (who) can only publish (do
>>>> what) a document (an object) after a writer has asked for a review (at
>>>> which time).
>>>>
>>>> The list of documents to be reviewed is an example of a pending task
>>>> list for an editor.
>>>>
>>>> Each object type can have its own specific workflow.
>>>>
>>>> The demo workflow
>>>> -----------------
>>>> The demo we created has a workflow for a basic document type, a web
>>>> page. I have attached a diagram of it.
>>>>
>>>> A document gets created by a writer. The writer is not allowed to
>>>> publish his document directly; he has to ask the editor for review.
>>>>
>>>> The editor can easily review documents because we generate a list of
>>>> documents waiting for review. The editor can click on the document and
>>>> can either approve or disapprove. If the document gets approved it is
>>>> published on the public server.
>>>>
>>>> If the document gets disapproved the writer cannot ask for a review
>>>> without editing it first. Editing the document when it has been
>>>> approved will bring the document back to the editing state too. After
>>>> making his changes the user can ask for a review of the new version.
>>>>
>>>> Implementation
>>>> --------------
>>>> For the document repository we use Slide. For the workflow engine we
>>>> used OSWorkflow. We connected these two using Slide interceptors.
>>>
>>>
>>> wow, supercool!! I want it :-)
>>>
>>>> When a document is created the interceptor checks to see whether a
>>>> workflow already exists. It does this by retrieving the workflow ID
>>>> from a WebDAV property of the document. If it doesn't exist a new
>>>> workflow is created in the workflow store.
>>>
>>>
>>> Interesting terminology you use here: let me ask you this before we get
>>> confused: "workflow" is for you an instance of the model or the model
>>> itself?
>>
>>
>> I use the same term for both the model and the instance :">
>>
>>>
>>>> When our frontend retrieves the tree of documents, the interceptor
>>>> will retrieve the workflow for each document.
>>>
>>>
>>> Seems to be the instance. Ok, careful though, because normally people
>>> refer to workflow as the "model", not the instance.
>>
>>
>> I will be more explicit in further messages.
>>
>>>
>>>> Looking at the role of the user
>>>> the interceptor will determine which actions are enabled. The enabled
>>>> actions (including their display text and activation URLs) are set in
>>>> a WebDAV property of the document.
>>>>
>>>> For the generation of the pending task list we used the OSWorkflow
>>>> query API to find the documents which are in the waiting-for-review
>>>> state. The approve and disapprove actions are passed to the frontend
>>>> in the same way as the commands for a writer.
>>>>
>>>> Not all actions are directly shown in the menu, because the user
>>>> invokes them implicitly. The edit action for example is invoked by
>>>> the interceptor each time the user saves the document.
>>>>
>>>> Issues
>>>> ------
>>>> We encountered issues with both Slide and OSWorkflow during the
>>>> implementation.
>>>>
>>>> Before we used Slide, we used the Cocoon repository. The semantics of
>>>> the repository interceptors and the Slide interceptors are not the
>>>> same.
>>>> With the repository interceptor we were able to add a property to the
>>>> document in postStoreContent(...). In Slide we had to do this in
>>>> preStoreContent(...).
>>>
>>>
>>> IMHO, makes more sense to add metadata in pre-saving than in
>>> post-saving. It's way more efficient for clustered environments.
>>
>>
>> I don't care what's better. I just thought that two technologies used 
>> heavily in Cocoon having different semantics for the same concept was 
>> confusing.
>>
> Well I care. The interception mechanism in the repository was crafted 
> after Slide's. If there are diverging semantics or you need extra 
> functionality it can be fixed quite easily.
>
>>>
>>>> Apart from that the Slide interceptors work very well, but (in the
>>>> version of Slide we used) they get called a lot. A single store of a
>>>> document invoked preStoreContent(...) and postStoreContent(...)
>>>> multiple times.
>>>
>>>
>>> well, this is a bug then. there should be a way to connect to an atomic
>>> event for a content store... you might want to bring this up on 
>>> slide-dev
>>
>>
>> OK. I will look into this (making sure we don't add the same 
>> interceptor multiple times).
>>
> Interception in Slide is quite low level. For instance, when versioning 
> is turned on, one WebDAV PUT will typically result in 2n pre-store and 2n 
> post-store operation calls, where n is the number of versions the resource 
> will have after transaction commit. (I even suspect the factor 2 in that 
> is actually the number of stores configured for the scope the resource 
> is in, but I'd have to check that.) Anyway, I think that the interception 
> model has been superseded by Daniel Florey's recent work on events in 
> Slide, but I still have to look into that too.
>
>>>
>>>> OSWorkflow performed great too. The only disadvantage was the limited
>>>> complexity of the state machines that can be expressed. As you can
>>>> see in the attached diagram nested states are used. OSWorkflow does
>>>> not support these.
>>>
>>>
>>> The more I hear about workflows, the more I think that writing them
>>> with flow and continuations makes more sense than writing a finite
>>> state machine.
>>
>>
>> I don't like procedural code to handle complex state. You wind up with 
>> a lot of if-statements and it is difficult to determine what happens 
>> when a particular action gets invoked. A state machine has a lot of 
>> context: I am in state X, so all operations on this state and its 
>> parent states are valid. A state machine also hides a lot of 
>> implementation details. No need to check what the value of the 
>> current-state variable is.
>>
> Continuations do that even better. There isn't even an explicit notion 
> of state at all. I don't want to rule out procedural code for handling 
> conditional logic in workflow yet. I think the State pattern Guido 
> described very effectively solves the nested case complexity. I am very 
> interested in pursuing this more.
>

I am more interested in how easy it is to read an implemented state 
machine. Of course it is possible to implement a workflow using 
continuations, but how easy can I visualize this and use this to discuss 
it with customers? Keeping a diagram up-to-date each time some JavaScript 
and/or Java changes is cumbersome.

If you put a state machine in an XML document using a well-defined schema, 
it should be easy to understand, and you keep a complete overview while 
defining the state machine (I see a state machine as a single concern, 
as I posted before). And it is easy, if you are a diagram-layout expert 
;), to generate an up-to-date diagram.
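
As a rough illustration of that last point, here is a minimal Java sketch
that keeps the transitions as plain data and prints a Graphviz DOT graph
from them. The state and action names are assumptions loosely based on the
demo workflow described in this thread (the attached diagram is not
reproduced here), and the WorkflowDiagram class and its toDot method are
hypothetical; none of this is OSWorkflow API. In practice the table would
be loaded from the XML definition instead of being hard-coded.

```java
import java.util.Arrays;
import java.util.List;

public class WorkflowDiagram {
    /** One labelled edge of the state machine: from --action--> to. */
    static final class Transition {
        final String from;
        final String action;
        final String to;

        Transition(String from, String action, String to) {
            this.from = from;
            this.action = action;
            this.to = to;
        }
    }

    // Hypothetical transition table; names are illustrative assumptions.
    static final List<Transition> DOCUMENT_WORKFLOW = Arrays.asList(
        new Transition("editing", "requestReview", "waitingForReview"),
        new Transition("waitingForReview", "approve", "published"),
        new Transition("waitingForReview", "disapprove", "disapproved"),
        new Transition("disapproved", "edit", "editing"),
        new Transition("published", "edit", "editing"));

    /** Renders the transitions as a DOT digraph, one edge per transition. */
    static String toDot(String name, List<Transition> transitions) {
        StringBuilder dot = new StringBuilder("digraph " + name + " {\n");
        for (Transition t : transitions) {
            dot.append("  ").append(t.from).append(" -> ").append(t.to)
               .append(" [label=\"").append(t.action).append("\"];\n");
        }
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toDot("document", DOCUMENT_WORKFLOW));
    }
}
```

Feeding the output to a layout tool such as Graphviz gives a diagram that
is regenerated from the same definition the engine would run on, so there
is no separate drawing to keep in sync.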

>>>
>>>> Although the attached workflow does not contain parallel states, we
>>>> think it might be needed for some document types. A newsletter for
>>>> example follows the same workflow as the attached one. But parallel to
>>>> this is a mailing workflow for sending it to the newsletter 
>>>> subscribers.
>>>>
>>>> In the mailing workflow the user can send a test email of the current
>>>> version to himself. When he is satisfied he can send the final version
>>>> to the newsletter subscribers. After this, he can neither send a test
>>>> email nor send it to the subscribers.
>>>>
>>>> But what to do if a mistake in the newsletter is found after sending
>>>> it to the subscribers? The subscribers won't be happy to receive
>>>> another copy, so the mailing actions should stay blocked. But not
>>>> correcting the newsletter on the website looks sloppy. Therefore the
>>>> editing/reviewing/publishing workflow has to remain active.
>>>
>>>
>>> this screams for long-lasting continuations!
>>
>>
>> How would you handle parallel states using continuations? If you want a 
>> unique continuation point for each possible combination of states, the 
>> number of continuation points will explode.
>
>
> Won't there always be one continuation per workflow instance? That's 
> constant, no?

If you do this you will have to keep a state variable (a reference to a 
Java object) because you always return to the same point. Using the GoF 
State pattern works 
great for simple state machines and I use it a lot. But the pattern does 
not talk about nested and/or parallel states, which become 
incomprehensible when coded in Java; the state machine logic gets 
intermixed with the document logic.
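
To make the nested and parallel cases concrete, below is a minimal Java
sketch; it is not taken from OSWorkflow or from the demo. It assumes each
state has an optional parent and that the enabled actions are those of the
current state plus those of all enclosing states. Parallel workflows are
modelled as one current state per region, so the bookkeeping grows with the
number of regions and not with the number of state combinations. All class,
state, and action names (NestedStates, "active", "delete", and so on) are
hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class NestedStates {
    static final class State {
        final String name;
        final State parent;        // null for a top-level state
        final Set<String> actions; // actions declared directly on this state

        State(String name, State parent, String... actions) {
            this.name = name;
            this.parent = parent;
            this.actions = new LinkedHashSet<>(Arrays.asList(actions));
        }

        /** Actions valid here: this state's own plus every enclosing state's. */
        Set<String> enabledActions() {
            Set<String> enabled = new LinkedHashSet<>();
            for (State s = this; s != null; s = s.parent) {
                enabled.addAll(s.actions);
            }
            return enabled;
        }
    }

    /** Parallel regions: the union of what each region's current state allows. */
    static Set<String> enabledAcross(State... currentPerRegion) {
        Set<String> enabled = new LinkedHashSet<>();
        for (State current : currentPerRegion) {
            enabled.addAll(current.enabledActions());
        }
        return enabled;
    }

    public static void main(String[] args) {
        // Region 1: the editing/reviewing/publishing workflow.
        State active = new State("active", null, "delete");
        State editing = new State("editing", active, "requestReview");
        // Region 2: the mailing workflow, after the newsletter has gone out.
        State sent = new State("sent", null); // no mailing actions remain

        System.out.println(enabledAcross(editing, sent));
    }
}
```

In the newsletter case this keeps the mailing actions blocked once the
mailing region is in its final state, while the document can still be
edited and reviewed in the other region.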

>
> --
> Unico
>

-- 
Johan Stuyts
