harmony-dev mailing list archives

From Gregory Shimansky <gshiman...@apache.org>
Subject [drlvm][explanation] Crashes due to unmapped memory on the stack on Linux x86_64
Date Sun, 13 Jan 2008 00:27:41 GMT
Hello

I've realized that my comments in bugs HARMONY-5019 (original bug report is
HARMONY-3269) and HARMONY-3581 may be confusing, and none of them is
complete. The cause of the bug is quite complex, so I decided to write this
post as a complete explanation of the bug and for future reference (I hope
there won't be any more bugs like it). I also hope that someone who
understands the GPLed code discussed later may reply to this email.

The crash symptom is a stack with the _Unwind_ForcedUnwind function in it, as
shown in the description of HARMONY-3581. The stack usually ends with some
weird address, often 0xdeadbeefdeadbeef. The instruction that crashes is the
one that tries to access this address (usually it moves the value at this
address to some register; I always saw RDX), but the address is not mapped
and therefore not accessible.

There are two causes that lead to calling _Unwind_ForcedUnwind: either a C++
exception is thrown, or pthread_cancel cancels the thread. For a C++
exception, gcc calls the libgcc_s function _Unwind_ForcedUnwind. For
pthread_cancel, a signal handler that handles SIGCANCEL from the pthread
library tries to throw an uncatchable exception and unwinds the stack using
_Unwind_ForcedUnwind in a way identical to C++ exception unwinding. Why it
throws an uncatchable exception I don't know; I didn't read the glibc code to
understand the pthreads logic, since it is under GPL. Probably it tries to
determine the location where SIGCANCEL was received by the thread.
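
The cancellation path described above can be reproduced in a small,
harmless form. The following C sketch is an illustration only, not code from
DRLVM or glibc; the names waiter, on_cancel and run_cancel_demo are
hypothetical. It cancels a thread that is blocked in pthread_cond_timedwait
and checks that a cleanup handler registered on that thread's stack runs as
part of the cancellation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready = PTHREAD_COND_INITIALIZER; /* waiter -> caller */
static pthread_cond_t never = PTHREAD_COND_INITIALIZER; /* never signaled */
static int waiting = 0;
static int cleanup_ran = 0;

/* Called when the canceled thread is being torn down. */
static void on_cancel(void *arg) {
    (void)arg;
    cleanup_ran = 1;
    /* POSIX re-acquires the mutex before cleanup handlers run. */
    pthread_mutex_unlock(&lock);
}

static void *waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    pthread_cleanup_push(on_cancel, NULL);
    waiting = 1;
    pthread_cond_signal(&ready);
    for (;;) {
        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += 60;
        /* Cancellation point: pthread_cancel takes effect here. */
        pthread_cond_timedwait(&never, &lock, &deadline);
    }
    pthread_cleanup_pop(0); /* never reached; must pair with the push */
    return NULL;
}

/* Returns 1 if the thread was canceled and its cleanup handler ran. */
int run_cancel_demo(void) {
    pthread_t t;
    void *result = NULL;
    int rc;

    waiting = 0;
    cleanup_ran = 0;
    rc = pthread_create(&t, NULL, waiter, NULL);
    assert(rc == 0);
    (void)rc;

    /* Wait until the waiter has released the mutex inside timedwait. */
    pthread_mutex_lock(&lock);
    while (!waiting)
        pthread_cond_wait(&ready, &lock);
    pthread_mutex_unlock(&lock);

    pthread_cancel(t);
    pthread_join(t, &result);
    return result == PTHREAD_CANCELED && cleanup_ran == 1;
}
```

Every frame on the stack here belongs to code that stays loaded, so the
cancellation completes normally. The crashes discussed in this post need one
more ingredient: a frame whose code has already been unmapped.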

In any case, stack unwinding is started for some thread. On x86_64 stack
unwinding is a tricky business because there are no stack frames as on x86,
so the libgcc_s code relies mostly on DWARF2 information. For some reason
unknown to me, the unwinding code scans the whole stack even if there is a
C++ exception handler on it. The unwinding code walks from callee to caller
quite well on all of the code that I've seen, but it fails when the caller is
no longer mapped code, because it doesn't only analyze the thread stack; it
also tries to read the code instructions pointed to by the return address.
There are some heuristics for the x86_64 architecture that require checking
the code itself, not only the return address on the stack.

So, if there is any unmapped code on the thread stack, a crash is inevitable.
The crash handler usually doesn't help, because it doesn't show any frames
further down the stack once it encounters memory with no read permission. So
usually the cause is not evident.
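
When diagnosing a crash like this, one way to check whether a suspicious
return address still points into mapped code is to look it up in
/proc/self/maps. The following C sketch is only an illustration of that
check, assuming Linux; it is not part of DRLVM's crash handler, and the
function name is_mapped_executable is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if addr lies inside an executable mapping of this process,
 * 0 if it does not, and -1 if /proc/self/maps cannot be opened.
 * Linux-specific: relies on the "start-end perms ..." line format. */
int is_mapped_executable(uintptr_t addr) {
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[4096];
    int found = 0;

    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long long start, end;
        char perms[5];

        /* Skip anything that does not look like a mapping line. */
        if (sscanf(line, "%llx-%llx %4s", &start, &end, perms) != 3)
            continue;

        /* perms looks like "r-xp"; index 2 is the execute bit. */
        if (addr >= start && addr < end && perms[2] == 'x') {
            found = 1;
            break;
        }
    }
    fclose(maps);
    return found;
}
```

A return address taken from a stack slot can be passed in as
(uintptr_t)address. An address inside a library that has since been unloaded
will not be found in any executable mapping.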

Why unmapped code ended up on a thread's stack when executing applications on
DRLVM is a separate question. In two cases there were bugs. In the first
case a JVMTI agent was unloaded, and then its thread was canceled with
pthread_cancel (HARMONY-5019). In the second case the interpreter library was
unloaded too early, and a thread was also canceled with pthread_cancel
(HARMONY-3581). In both cases the interrupted threads were executing
pthread_cond_timedwait, and the other functions down the stack were valid
code. But the unloaded libraries' code that called this wait was somewhere on
the stack, and therefore canceling such threads caused a crash.

These two bugs are fixed now, but something similar may happen in the future,
which is why I wrote this text. Under no conditions should there be unmapped
code on any thread's stack, even when running on Linux x86 or on Windows of
any architecture.

-- 
Gregory
