harmony-dev mailing list archives

From "Weldon Washburn" <weldon...@gmail.com>
Subject [drlvm][threading] H3010 (Stack Overflow Exception) -- when does this bug really have to be fixed?
Date Mon, 12 Mar 2007 16:05:13 GMT
I assigned H3010 to myself.  This test definitely demonstrates a bug that
needs fixing.  But it's not clear when this bug must be fixed.  This really
brings forward a higher-level question: what priority should this bug be
coded at right now, and when would it be moved to "blocker" status?  I
provide some observations to start the discussion:

The bug is a Stack Overflow Exception (SOE) that happens inside fast native
helper functions.  Fast native helpers do not set up the M2N stack frame,
which is required to throw exceptions such as SOE.  Adding M2N setup to fast
native helpers would unacceptably slow down the system.

When running a useful workload, a stack overflow that hits precisely inside
a fast native helper has a very low probability.  Note that the test in
H3010 specifically forces this event to happen with a very high probability.
In other words, while the test is a good one, it reflects an event that is
very rare in practice.

Given the above, how about we address fixing the problem in two stages:

First stage: add an "assert(zero);" to the exception handler at the point
where it determines that an SOE has happened inside a fast native.  This
way, we will find out quickly when an important workload hits this bug.
Once the assert(zero) is added, we code H3010 as "later".

Second stage: When an application we care about hits the assert(zero), we
recode H3010 as "major/blocker".
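
To make the first stage concrete, here is a minimal sketch of the check,
assuming the handler can compare the faulting instruction pointer against
the address ranges of the fast native helpers.  The names helper_range,
ip_in_fast_native and check_soe_location are hypothetical and are not taken
from the DRLVM sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the code address range occupied by one fast native helper. */
typedef struct {
    uintptr_t start;  /* first byte of the helper's code */
    uintptr_t end;    /* one past the last byte of the helper's code */
} helper_range;

/* Returns true if ip lies inside any of the given helper ranges. */
static bool ip_in_fast_native(uintptr_t ip,
                              const helper_range *ranges, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (ip >= ranges[i].start && ip < ranges[i].end) {
            return true;
        }
    }
    return false;
}

/* Stage one (assumed shape): called by the stack overflow handler with the
 * faulting instruction pointer.  Plays the role of "assert(zero)" when the
 * overflow hit inside a fast native helper, where no M2N frame exists. */
static void check_soe_location(uintptr_t fault_ip,
                               const helper_range *ranges, size_t count)
{
    assert(!ip_in_fast_native(fault_ip, ranges, count));
}
```

With a check like this in place, a workload that overflows the stack inside
a fast native helper stops at the assert in a debug build instead of failing
in some less obvious way.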

While waiting for the second stage to happen, we discuss on harmony-dev ways
of designing the right fix.  For starters, I think we should investigate a
design where the exception handler rewrites the entire register context so
that returning from the exception handler revectors the instruction pointer
to recovery code that will somehow push the M2N frame on the stack and call
the proper SOE-throwing code.  I have not looked closely at how to do this.
I am not convinced this approach will work.  However, I do think it's worth
a try.  Thoughts?
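
As a starting point for that discussion, here is a rough sketch of the
context rewrite, using a simplified stand-in for the register context
rather than a real platform structure (on POSIX systems the handler would
receive a ucontext_t).  Everything here is hypothetical: reg_context,
soe_recovery_plan and redirect_to_recovery are illustrative names, and
resetting the stack pointer to a safe value is my assumption about what the
recovery code would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified register context.  A real handler would edit
 * the platform's own saved-register structure instead. */
typedef struct {
    uintptr_t ip;  /* saved instruction pointer */
    uintptr_t sp;  /* saved stack pointer */
} reg_context;

/* Hypothetical description of where to resume after an SOE in a helper. */
typedef struct {
    uintptr_t helper_start;  /* first byte of the fast native helper */
    uintptr_t helper_end;    /* one past the last byte of the helper */
    uintptr_t recovery_ip;   /* recovery code: push M2N frame, throw SOE */
    uintptr_t safe_sp;       /* assumed: a stack pointer with room to run */
} soe_recovery_plan;

/* If the fault happened inside the helper, rewrite the saved context so
 * that returning from the handler resumes in the recovery code.
 * Returns true if the context was rewritten, false if left untouched. */
static bool redirect_to_recovery(reg_context *ctx,
                                 const soe_recovery_plan *plan)
{
    if (ctx->ip < plan->helper_start || ctx->ip >= plan->helper_end) {
        return false;  /* not in a fast native helper: use the normal path */
    }
    ctx->ip = plan->recovery_ip;
    ctx->sp = plan->safe_sp;
    return true;
}
```

The open question is what the recovery code itself has to do to build a
valid M2N frame from that state; the sketch only covers the revectoring.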

Weldon Washburn
Intel Enterprise Solutions Software Division
