harmony-dev mailing list archives

From Gregory Shimansky <gshiman...@gmail.com>
Subject Re: [drlvm][threading] H3289 -- some JVMTI questions
Date Fri, 16 Mar 2007 19:54:36 GMT
Weldon Washburn wrote:
> On 3/16/07, Salikh Zakirov <Salikh.Zakirov@intel.com> wrote:
>>
>> Weldon Washburn wrote:
>> > On another topic, do we really have to use Thread.stop() in the
>> > implementation of vm_shutdown_stop_java_threads()?  This seems way
>> > too hard.  All we really want to do is give back all the OS resources.
>> > We could simply call OS kill on each specific thread, return all memory
>> > via "free()", return all file handles, etc.  Note if there was no need
>> > to allow multiple JVMs in a single address space to come and go, we
>> > could simplify the above and basically just call process exit().
>>
>> HARMONY-3381 (already committed) replaces the Thread.stop-like shutdown
>> algorithm with the following one:
>>
>> 1) set a shutdown callback, which executes hythread_exit() (which
>> delegates to either pthread_exit() or ExitThread())
>>
>> 2) wait for a little while -- this is a little bit "unscientific" and
>> assumes that all actively running threads will reach a callback
>> safepoint within that "little while". If the assumption holds true, the
>> remaining threads are highly likely to be inactive threads, i.e.
>> sleeping, waiting or blocked.
>> The key about the little while is that if all threads have terminated
>> promptly, we don't need to wait any longer at all. It is currently
>> implemented using hythread_join() with a timeout.
>>
>> 3) on the third step the remaining threads are terminated using
>> hythread_cancel() (which delegates to either pthread_cancel() or
>> TerminateThread()) -- This step is inherently dangerous on Windows, as
>> a terminated thread may be holding the system-wide malloc lock, and a
>> subsequent free() on the main thread will deadlock.  Fortunately, the
>> unscientific assumption of step (2) makes this rather improbable.
> 
> 
> 
> Yes, agreed this is dangerous and rather improbable.  I reluctantly agree
> with this approach as we have other, more pressing bugs to fix.  Besides,
> it is fairly easy to know if we hit problems in this area since the bug
> will occur only during shutdown.
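
For reference, the three-step sequence quoted above can be sketched roughly
as follows. This is only an illustration under stated assumptions: it uses
plain pthreads instead of the hythread API (whose signatures are not shown
in this thread), it replaces the join-with-timeout by a simple polling loop,
and every name in it (worker_t, shutdown_threads, run_shutdown_demo, ...) is
hypothetical. It does not cover the Windows TerminateThread() case.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>
#include <unistd.h>

/* All names below are hypothetical; plain pthreads stand in for hythread. */
typedef struct {
    pthread_t  tid;
    atomic_int done;   /* set by the thread just before it exits on its own */
} worker_t;

static atomic_int shutdown_requested;

/* An "active" thread: reaches a safepoint regularly and honours the request. */
static void *active_worker(void *arg) {
    worker_t *w = arg;
    const struct timespec tick = {0, 1000000L};   /* 1 ms of pretend work */
    while (!atomic_load(&shutdown_requested))
        nanosleep(&tick, NULL);
    atomic_store(&w->done, 1);
    pthread_exit(NULL);            /* step 1: what the callback would do */
}

/* An "inactive" thread: blocked in a long sleep, never sees the request. */
static void *sleeping_worker(void *arg) {
    (void)arg;
    for (;;)
        sleep(60);                 /* sleep() is a cancellation point */
}

/* Returns how many threads had to be cancelled in step 3. */
static int shutdown_threads(worker_t *w, int n, int timeout_ms) {
    const struct timespec tick = {0, 1000000L};
    int cancelled = 0;

    atomic_store(&shutdown_requested, 1);                  /* step 1 */

    for (int waited = 0; waited < timeout_ms; waited++) {  /* step 2 */
        int alive = 0;
        for (int i = 0; i < n; i++)
            alive += !atomic_load(&w[i].done);
        if (alive == 0)
            break;                 /* everybody left promptly: stop waiting */
        nanosleep(&tick, NULL);
    }

    for (int i = 0; i < n; i++) {                          /* step 3 */
        if (!atomic_load(&w[i].done)) {
            pthread_cancel(w[i].tid);
            cancelled++;
        }
        pthread_join(w[i].tid, NULL);
    }
    return cancelled;
}

/* One active and one sleeping thread; only the sleeper needs cancelling. */
int run_shutdown_demo(void) {
    worker_t w[2];
    int rc;
    atomic_store(&shutdown_requested, 0);
    atomic_init(&w[0].done, 0);
    atomic_init(&w[1].done, 0);
    rc = pthread_create(&w[0].tid, NULL, active_worker, &w[0]);
    assert(rc == 0);
    rc = pthread_create(&w[1].tid, NULL, sleeping_worker, &w[1]);
    assert(rc == 0);
    (void)rc;
    return shutdown_threads(w, 2, 200);
}
```

The early break in step 2 is the point made above: if every thread exits
promptly, no further waiting is needed.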

I think that in the future we can free ourselves from the MSVC runtime,
which implements malloc with a global lock, and then this problem will
disappear. This will require implementing a VM-wide memory manager, which
isn't an easy task, especially since we have the JIT and GC as separate
components.

>> Using pthread_cancel() on Linux has its own set of advantages and
>> disadvantages. By default, threads are created with cancellability
>> enabled and in the "deferred cancellability" state. It means that
>> pthread_cancel() will only terminate a thread when it reaches an
>> explicit cancellation point, such as pthread_cond_wait(). In this way,
>> pthread_cancel() does not suffer from the malloc-deadlock issue.
>>
>> However, deferred cancellability brings its own problems. For example,
>> as pthread_mutex_lock() is not a cancellation point, a pair of
>> deadlocked daemon threads will still be alive during VM shutdown, and
>> could probably result in an EINVAL return and a subsequent assertion
>> failure if the monitors are destroyed in the main shutdown thread.
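
The difference between the two cases can be seen in a small pthreads
experiment. This is a sketch, not DRLVM code: all function names are
hypothetical, and the second case unblocks the mutex on purpose so the
example terminates (a truly deadlocked thread would simply stay alive).

```c
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static atomic_int      got_lock;

/* Cleanup handler: the mutex is re-acquired when a cond wait is cancelled. */
static void unlock_on_cancel(void *m) {
    pthread_mutex_unlock(m);
}

/* Blocks in pthread_cond_wait(), which IS a cancellation point. */
static void *cond_waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    pthread_cleanup_push(unlock_on_cancel, &lock);
    for (;;)
        pthread_cond_wait(&cond, &lock);
    pthread_cleanup_pop(1);        /* never reached; must pair with push */
    return NULL;
}

/* Blocks in pthread_mutex_lock(), which is NOT a cancellation point. */
static void *mutex_waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);     /* a pending cancel does not interrupt this */
    atomic_store(&got_lock, 1);
    pthread_mutex_unlock(&lock);
    pthread_testcancel();          /* the pending request is acted on here */
    return NULL;
}

/* Returns 1 if the thread waiting on the condition variable was cancelled. */
int cond_waiter_is_cancelled(void) {
    pthread_t t;
    void *ret = NULL;
    if (pthread_create(&t, NULL, cond_waiter, NULL) != 0)
        return -1;
    pthread_cancel(t);
    pthread_join(t, &ret);
    return ret == PTHREAD_CANCELED;
}

/* Returns 1 if the thread got past pthread_mutex_lock() despite the cancel. */
int mutex_waiter_survives_cancel(void) {
    pthread_t t;
    void *ret = NULL;
    atomic_store(&got_lock, 0);
    pthread_mutex_lock(&lock);     /* hold the mutex so the thread must block */
    if (pthread_create(&t, NULL, mutex_waiter, NULL) != 0) {
        pthread_mutex_unlock(&lock);
        return -1;
    }
    pthread_cancel(t);             /* request stays pending during the lock */
    pthread_mutex_unlock(&lock);   /* let the thread through */
    pthread_join(t, &ret);
    return atomic_load(&got_lock) == 1 && ret == PTHREAD_CANCELED;
}
```

In the second function the cancel request is issued while the main thread
still holds the mutex, yet the worker only terminates at the explicit
pthread_testcancel() after it has acquired and released the lock.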
>>
>> So, the shutdown algorithm has been improved compared to the former
>> exception-based approach, like Thread.stop(), but it is still not
>> perfect. Eugene and I discussed this and agreed to keep an eye open for
>> shutdown bugs.
> 
> 
> 
> The shutdown code has definitely been improved.  If it works for the
> apps we care about, then it's good enough for the time being.
> 
>> There is still room for experiments and possible improvements, e.g.
>> using asynchronous cancellability or even using pthread_kill() instead
>> of pthread_cancel() to implement immediate thread termination.
> 
> 
> 
> Hmm... resource reclamation is always the worst part of building systems
> software.  It will probably take a few iterations to get it right.  Somehow
> Linux can core dump a bad process and successfully recover all dangling
> file handles, network sockets, page directories, thread info blocks, etc.
> Maybe we should look at how OSes do resource recovery for some ideas.
> 
>> -- 
>> Salikh Zakirov
>>
>>
> 
> 


-- 
Gregory

