beehive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rich Feit <richf...@gmail.com>
Subject Re: page flow destroy, controls, and deadlock
Date Fri, 07 Apr 2006 15:00:55 GMT
OK, thanks for the info.  It sounds to me like the deadlock is caused by 
ordering issues when locking the control bean and BC.gHL, and that the 
page flow synchronization is simply masking this root problem.  Is that 
what's going on, or am I misunderstanding?

Rich

Eddie O'Neil wrote:
>   Yeah -- that's right the deadlock is caused by lack of
> synchronization during "destroy" on the second valueUnbound event.  I
> think (but haven't confirmed yet) that the same thing could happen
> with different implementations of the
> ServletContainerAdapter.ensureFailover method if it calls
> "session.setAttribute(...)".  Just a hunch, though, and I need to
> confirm this.
>
>   Back to the deadlock, absence of synchronization on the "destroy"
> lifecycle call allows a second thread to enter into the
> ControlBeanContext which can cause the thread to deadlock with a
> second thread that's already in the ControlBeanContext.
>
>   
>> I don't think that's an option here, since it voids the whole purpose of
>> DeferredSessionStorageHandler
>> (http://mail-archives.apache.org/mod_mbox/beehive-dev/200601.mbox/%3C43D7AC11.8060207@gmail.com%3E
>> ) -- am I understanding you here?
>>     
>
> Yeah, that makes sense -- just wanted to rule this out as an option.  :)
>
>   
>> Otherwise, I wonder if there's a way to track this without adding state
>> to the page flow instance.  Possibly in a request attribute?  Just a
>> thought...
>>     
>
> Sure -- this would certainly work.  It could take the form of a set of
> attribute names that should be persisted / failed over, but that
> should ignore lifecycle methods.  Will look at this and get back to
> you.
>
> Thanks for the comments...
>
> Eddie
>
>
>   
>> Rich
>>
>> Eddie O'Neil wrote:
>>     
>>>   Gotcha -- thanks for the info; glad to know I wasn't entirely out in
>>> left field.  :)
>>>
>>>   Given this and putting the issue of the session expiration aside for
>>> a second, I think that root cause of the problem is that it is
>>> possible for the DeferredSessionStorageHandler to run the
>>> "valueUnbound" method code on a PageFlowController twice in these two
>>> circumstances:
>>>
>>>   - DeferredSessionStorageHandler::setAttribute / remove Attribute
>>>   - DeferredSessionStorageHandler::applyChanges @ line 221
>>>
>>> The first runs in "real time" with the removal or replacement of a
>>> Page Flow in the session and runs inside of the lock, but the second
>>> is the deferred part that runs at the end of a request to persist
>>> changed attributes in the session.  The "applyChanges" then makes
>>> calls to set / remove the Page Flow attribute from the real
>>> HttpSession, and this triggers the "valueUnbound" event and ultimately
>>> an "onDestroy".  This latter step is unsynchronized and thus causes
>>> the deadlock.
>>>
>>>   If the Page Flow is stored in the session and is an
>>> HttpSessionBindingListener, any deferred calls are going to trigger
>>> this event.  In order to keep consecutive JPF invocations in the same
>>> order, the onDestroy of the previous JPF needs to run before onCreate
>>> of the next JPF.  The cheap way to fix this would be to track the
>>> state of "valueUnbound" as a private boolean inside of the
>>> PageFlowController.  Then, when the second valueUnbound call is made,
>>> the event will be a no-op on the JPF.
>>>
>>>   Another solution would be to tunnel down into the real HttpSession
>>> and make JPF changes "live" against the session rather than deferring
>>> them.
>>>
>>>   Have an opinion either way?
>>>
>>> Eddie
>>>
>>>
>>>
>>> On 4/6/06, Rich Feit <richfeit@gmail.com> wrote:
>>>
>>>       
>>>> Funny, I'd drafted a longish email earlier asking about the BeanContext
>>>> lock, because it sounded like the *fundamental* issue (ordering problems
>>>> in locking that and the control bean instance itself).  But it's not, or
>>>> at least it's unrelated.  In a way, that's good.  :)
>>>>
>>>> Regarding this:
>>>>
>>>>         
>>>>> Thread A is execution an action on JPF X
>>>>> Thread B is destroying JPF X en route to running JPF Y
>>>>>
>>>>> From the code in PageFlowController.persistInSession (transitively
>>>>> down to StorageHandler.removeAttribute), it seems like a lock is held
>>>>> on the "current" JPF that would prevent this.  Right?
>>>>>
>>>>>           
>>>> That's correct.  The only time onDestroy() should ever run without a
>>>> lock on the page flow instance is when the session is expiring.  That's
>>>> the situation where the session itself is already locked, so it's
>>>> dangerous to subsequently lock the page flow instance (while another
>>>> thread may be accessing the session through synchronized get() after
>>>> already locking the page flow instance).  The user could also simulate
>>>> this situation by explicitly removing the page flow session attribute,
>>>> but of course that's not something we should try to support.
>>>>
>>>> Let me know if you see/find any holes in that, though...
>>>>
>>>> Rich
>>>>
>>>> Eddie O'Neil wrote:
>>>>
>>>>         
>>>>> Rich--
>>>>>
>>>>>   Having looked into this more now, I've got a little more detail
>>>>> about what's happening:
>>>>>
>>>>>   There are two issues intertwined here.  The first issue is related
>>>>> to the ControlBeanContext implementation class and its usage of a
>>>>> global, VM-wide lock.  At issue is the fact that all
>>>>> ControlContainerContext implementations inherit from a
>>>>> BeanContextServicesSupport base class.  This class (and another in
>>>>> that hierarchy) use a the VM-wide object
>>>>> BeanContext.globalHierarchyLock for synchronization in two situations:
>>>>>
>>>>>   - adding / removing a control to / from a context
>>>>>   - adding / removing / getting a "service" from a context
>>>>>
>>>>> This acts as a choke point for all Control references and is what
>>>>> shows up in the locking traces in a previous mail.  This is
>>>>> problematic, but it's also orthogonal (I believe) to the issue with
>>>>> the "destroy" lifecycle method on a JPF.  Ultimately, I think that the
>>>>> JDK assumed that the classes in java.beans.beancontext.* would be used
>>>>> to run a single hierarchy of controls on a VM.  Beehive's usage
>>>>> pattern is a little different in that in enterprise applications, many
>>>>> hierarchies of beans run on a VM (one per JPF, in NetUI's case).  As
>>>>> such, a global lock doesn't fit well.
>>>>>
>>>>>   So, hopefully that provides some background on the
>>>>> BeanContext.globalHierarchyLock.
>>>>>
>>>>>   That being said, I think that the deadlock is orthogonal to this
>>>>> problem as the "destroy" lifecycle method on a JPF can be invoked
>>>>> without waiting for a lock on the JPF instance itself.  The result is
>>>>> that two threads can actually run through a JPF simultaneously.
>>>>>
>>>>>   With Daryl's changes in February, the ControlContainerContext object
>>>>> for a Page Flow is a class-level field, and locking is performed on
>>>>> this object at:
>>>>>
>>>>>   - create time
>>>>>   - action execution time
>>>>>   - page rendering time (JSP and Faces)
>>>>>
>>>>> Guess that the question is whether this situation is protected:
>>>>>
>>>>> Thread A is execution an action on JPF X
>>>>> Thread B is destroying JPF X en route to running JPF Y
>>>>>
>>>>> >From the code in PageFlowController.persistInSession (transitively
>>>>> down to StorageHandler.removeAttribute), it seems like a lock is held
>>>>> on the "current" JPF that would prevent this.  Right?
>>>>>
>>>>>   If that's true, then I've got a little more info to go on in
>>>>> investigating this problem.
>>>>>
>>>>>   Thanks for any info...  :)
>>>>>
>>>>> Eddie
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 3/31/06, Eddie O'Neil <ekoneil@gmail.com> wrote:
>>>>>
>>>>>
>>>>>           
>>>>>>   Good questions -- let's see if I can explain the threads more
>>>>>> clearly in ASCII.  :)
>>>>>>
>>>>>>   What's happening is this (note, both threads operate on the *same*
>>>>>> instance of a JPF):
>>>>>>
>>>>>> Thread 1 (executing a page flow)              Thread2 (destroying
a page flow)
>>>>>> acquire lock on FooPage Flow
>>>>>>                                                              acquire
BC.gHL
>>>>>> acquire lock on BarControlBean
>>>>>>                                                              wait
for
>>>>>> lock on BarControlBean
>>>>>> wait for lock on BC.gHL
>>>>>>
>>>>>> Because the Controls infrastructure is based on the BeanContext
>>>>>> (Services|Support|etc) code in the java.beans package of the JDK,
it
>>>>>> uses the BeanContextServicesSupport class as a base class for the
>>>>>> ControlBeanContext object in a Control.  This JDK class (BCSS) uses
>>>>>> the BeanContext.globalHierarchyLock (also in the JDK) to serialize
>>>>>> access to shared data structures like the list of "service" objects
in
>>>>>> a BeanContext, etc.  This is a lock that is static throughout the
JDK
>>>>>> (!).
>>>>>>
>>>>>> To answer your questions:
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> What grabs the lock on BeanContextServicesSupport.globalHierarchyLock?
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>> The base classes for the ControlBeanContext object which delegates
up
>>>>>> to it's "super" in order to serialize changes and service requests
to
>>>>>> the classes that implement event listener support, etc.
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> What prevents deadlock in general between locking on
>>>>>>> BeanContextServicesSupport.globalHierarchyLock and BarControlBean?
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>> Good question -- this seems like the crux of the issue.  In general,
>>>>>> this is protected by a Control container serializing access to the
>>>>>> container's ControlContainerContext object.  In this case, the problem
>>>>>> occurs because two threads are trying to setup and destroy the same
>>>>>> context object simultaneously.  If those setup and destroy calls
were
>>>>>> serialized against the ControlContainerContext object stored in the
>>>>>> HttpSession, presumably, the semantics would be maintained.
>>>>>>
>>>>>> My guess is that this deadlock occurred with a double page submit
or
>>>>>> quick refresh that caught the tail end of the previous thread and
the
>>>>>> beginning of the next request such that the threads interleaved in
>>>>>> this way.
>>>>>>
>>>>>> So, hopefully that helps explain what's happening.
>>>>>>
>>>>>> Ultimately, I'm not sure how to fix this yet and would appreciate
>>>>>> thoughts on the topic.  Would it be safe to acquire a lock on the
JPF
>>>>>> instance during onDestroy?  That would serialize the access to the
>>>>>> session scoped CCC object.
>>>>>>
>>>>>> Another way to go (and it's a lot of work) is to build a NetUI /
>>>>>> web-tier specific control container that doesn't leverage the *Support
>>>>>> classes in the JDK.  This would allow the container's implementation
>>>>>> to be tailored to the single-threaded nature of the web tier and
would
>>>>>> remove a lot of the performance implications around locking and
>>>>>> synchronized data structures.  But, I've not thought through this
yet,
>>>>>> either.  :)
>>>>>>
>>>>>> Eddie
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/30/06, Rich Feit <richfeit@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> Looks like a tough one.  First of all, the problem with synchronizing
in
>>>>>>> onDestroy is also a deadlock issue: since onDestroy is called
from an
>>>>>>> HttpSessionBindingListener, the Servlet container may have a
lock on the
>>>>>>> session itself when onDestroy is called.  I don't think it's
mandated by
>>>>>>> the Servlet spec, but I don't think it's forbidden either.
>>>>>>> Unfortunately, this means that a deadlock can occur if code in
another
>>>>>>> thread synchronizes on the page flow instance first (as in an
action
>>>>>>> method invocation), then calls a method on HttpSession that synchronizes
>>>>>>> on the session object.  Again, I don't think this is mandated
by the
>>>>>>> spec, but it happens.
>>>>>>>
>>>>>>> Nothing strikes me off the bat, but I actually don't understand
the
>>>>>>> deadlock sequence below (I don't disagree -- just don't have
enough info
>>>>>>> to see it).  What grabs the lock on
>>>>>>> BeanContextServicesSupport.globalHierarchyLock?  What prevents
deadlock
>>>>>>> in general between locking on
>>>>>>> BeanContextServicesSupport.globalHierarchyLock and BarControlBean?
>>>>>>>
>>>>>>> Also, how is the same JPF being *created* by two separate threads?
>>>>>>>
>>>>>>> Rich
>>>>>>>
>>>>>>> Eddie O'Neil wrote:
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>>>> Rich/Daryl--
>>>>>>>>
>>>>>>>>   Hey, I've run into a thread deadlock problem in the interaction
>>>>>>>> between Controls and Page Flow that happens in the following
>>>>>>>> circumstance:
>>>>>>>>
>>>>>>>> thread1: acquire lock on FooPageFlow (during FlowController.execute)
>>>>>>>> thread2: acquire lock on
>>>>>>>> BeanContextServicesSupport.globalHierarchyLock (JDK class)
>>>>>>>> thread1: acquire lock on BarControlBean (ControlBean.ensureControl())
>>>>>>>> thread2: wait for lock on BarControlBean (BeanContextSupport.remove())
>>>>>>>> thread1: wait for lock on
>>>>>>>> BeanContextServicesSupport.globalHierarchyLock (JDK class)
>>>>>>>>
>>>>>>>> So, the problem is that the same JPF is being both created
and
>>>>>>>> destroyed by two different threads.  It appears that the
"destroy"
>>>>>>>> phase of the JPF lifecycle is entirely unsynchronized and
can freely
>>>>>>>> acquire Control locks without having serialized access to
the Page
>>>>>>>> Flow itself.
>>>>>>>>
>>>>>>>>   My thought is to add a synchronization point in
>>>>>>>> JavaControlUtils.uninitJavaControls that locks on the
>>>>>>>> PageFlowManagedObject as this would serialize access to the
Page Flow.
>>>>>>>>  But, I seem to recall some threading issues with locking
on a JPF
>>>>>>>> during the "destroy" part of the lifecycle and don't want
to resurrect
>>>>>>>> those.
>>>>>>>>
>>>>>>>>   Any suggestions about how best to make this fix?
>>>>>>>>
>>>>>>>> Eddie
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                 
>>>       
>
>   

Mime
View raw message