incubator-kato-spec mailing list archives

From "Bobrovsky, Konstantin S" <konstantin.s.bobrov...@intel.com>
Subject RE: JavaStackFrame/JavaLocation local variable support
Date Wed, 01 Jul 2009 09:20:04 GMT
Hi Nicholas,

> Even at HotSpot safepoints, the code being executed often looks nothing 
> like the source.  In particular, compiled code may be heavily inlined.  
> Many instructions are just gone, and those that remain are moved up and 
> down, smearing methods together so that, variables aside, in general we 
> couldn't possibly tell you what method -- or even what class -- you're 
> in because you are in several at once.

NOTE: all my speculations below are for the Hotspot server compiler (C2) only. I don't know
how safepoints are supported by the client (C1) Hotspot compiler.

Inlining does not harm safepoints. The C2 compiler annotates each safepoint with so-called DebugInfo
(serialized together with the method's executable image), which records the entire inlining hierarchy
for that particular safepoint, as well as the mapping of the JVM state of each inlined method at
that safepoint to memory locations/registers. (Prior to that, each JVM state element of every
method in the inlining hierarchy is kept as an IR node by the JIT, with all these nodes being
inputs to the SafepointNode.) Thus, at every safepoint the runtime knows how to reconstruct
the actual JVM state for the method the safepoint is in as well as for all the methods up the
inlining hierarchy. This is used in particular by de-optimization, when even
a heavily optimized method may be replaced by its interpreted version (with the necessary chain
of callers in the case of inlining) on the fly at a safepoint. De-optimization, being a critical
and outstanding feature of Hotspot, is actually the only reason why the JVM state mapping is maintained
and saved together with compiled code.
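
To make the idea concrete, here is a minimal Java sketch of how per-safepoint debug info could be
used to rebuild one interpreter-style frame per inlined method from a single compiled frame. All
names in it (SafepointRecord, Scope, Location, MachineFrame, VirtualFrame) are made up for
illustration and are not HotSpot's real data structures; values are simplified to longs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: none of these types are HotSpot's real structures.
final class SafepointSketch {

    /** Where one local of the JVM state lives in the compiled frame. */
    static final class Location {
        final boolean inRegister; // true: register, false: stack slot
        final int index;          // register number or stack slot index
        Location(boolean inRegister, int index) {
            this.inRegister = inRegister;
            this.index = index;
        }
    }

    /** One method of the inlining hierarchy as seen at a safepoint. */
    static final class Scope {
        final String method;
        final int bci;            // bytecode index within that method
        final Location[] locals;
        Scope(String method, int bci, Location... locals) {
            this.method = method;
            this.bci = bci;
            this.locals = locals;
        }
    }

    /** Debug info for one safepoint: scopes ordered innermost (callee) first. */
    static final class SafepointRecord {
        final List<Scope> scopes;
        SafepointRecord(Scope... scopes) {
            this.scopes = Arrays.asList(scopes);
        }
    }

    /** Raw contents of the single physical frame of the compiled method. */
    static final class MachineFrame {
        final long[] registers;
        final long[] stackSlots;
        MachineFrame(long[] registers, long[] stackSlots) {
            this.registers = registers;
            this.stackSlots = stackSlots;
        }
    }

    /** A reconstructed interpreter-style frame. */
    static final class VirtualFrame {
        final String method;
        final int bci;
        final long[] locals;
        VirtualFrame(String method, int bci, long[] locals) {
            this.method = method;
            this.bci = bci;
            this.locals = locals;
        }
    }

    /** Rebuilds one virtual frame per inlined method from the physical frame. */
    static List<VirtualFrame> reconstruct(SafepointRecord record, MachineFrame frame) {
        List<VirtualFrame> result = new ArrayList<>();
        for (Scope scope : record.scopes) {
            long[] values = new long[scope.locals.length];
            for (int i = 0; i < values.length; i++) {
                Location loc = scope.locals[i];
                values[i] = loc.inRegister ? frame.registers[loc.index]
                                           : frame.stackSlots[loc.index];
            }
            result.add(new VirtualFrame(scope.method, scope.bci, values));
        }
        return result;
    }
}
```

The point of the sketch is only that the compiled code itself does not need to resemble the
source: the side table is what lets the runtime answer "which methods am I in, and what are
their locals" at that one pc.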

> Not sure how stack backtraces for exceptions work -- perhaps that
> suppresses some optimizations?

For each 'athrow', the C2 server compiler creates a SafepointNode during the parse stage. So, as
the comment above implies, at each 'athrow' site the runtime has full information about what was
inlined there. For exceptions not triggered by an 'athrow' (e.g. an implicit null pointer exception),
things are slightly more complicated, but there is always a 'serialized' safepoint at the
bottom, which can provide the inlining details.
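
As a rough illustration of why a backtrace does not require suppressing inlining, the following
Java sketch expands each compiled (physical) frame into one java.lang.StackTraceElement per
inlined method. InlineScope and CompiledFrame are hypothetical types invented for this example;
only StackTraceElement is a real JDK class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: InlineScope and CompiledFrame are illustrative, not HotSpot internals.
final class InlinedBacktraceSketch {

    /** One inlined method recorded for a safepoint. */
    static final class InlineScope {
        final String className;
        final String methodName;
        final String fileName;
        final int line;
        InlineScope(String className, String methodName, String fileName, int line) {
            this.className = className;
            this.methodName = methodName;
            this.fileName = fileName;
            this.line = line;
        }
    }

    /** One physical frame of compiled code; scopes are listed innermost first. */
    static final class CompiledFrame {
        final List<InlineScope> scopes;
        CompiledFrame(List<InlineScope> scopes) {
            this.scopes = scopes;
        }
    }

    /** Flattens physical frames (top of stack first) into a source-level backtrace. */
    static StackTraceElement[] backtrace(List<CompiledFrame> physicalFrames) {
        List<StackTraceElement> out = new ArrayList<>();
        for (CompiledFrame frame : physicalFrames) {
            for (InlineScope s : frame.scopes) {
                out.add(new StackTraceElement(s.className, s.methodName, s.fileName, s.line));
            }
        }
        return out.toArray(new StackTraceElement[0]);
    }
}
```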

Thanks,
Konst

Intel Novosibirsk
Closed Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park, 
17 Krylatskaya Str., Bldg 4, Moscow 121614, 
Russian Federation
 

-----Original Message-----
From: Nicholas.Sterling@Sun.COM [mailto:Nicholas.Sterling@Sun.COM] 
Sent: Wednesday, July 01, 2009 3:31 PM
To: kato-spec@incubator.apache.org
Subject: Re: JavaStackFrame/JavaLocation local variable support

Joining this conversation late, so forgive me if this isn't as relevant 
as I think it should be.  :^)

Even at HotSpot safepoints, the code being executed often looks nothing 
like the source.  In particular, compiled code may be heavily inlined.  
Many instructions are just gone, and those that remain are moved up and 
down, smearing methods together so that, variables aside, in general we 
couldn't possibly tell you what method -- or even what class -- you're 
in because you are in several at once.  Of course the debugger doesn't 
have this problem because as soon as you point the debugger at a method 
HotSpot abandons the compiled version and uses the interpreted version, 
or at least something optimized less aggressively.

At least that's my understanding; I'm happy to be corrected.  Not sure 
how stack backtraces for exceptions work -- perhaps that suppresses some 
optimizations?

Nicholas



Stuart Monteith wrote:
>
>
> Steve Poole wrote:
>> On Fri, Jun 26, 2009 at 11:41 AM, Stuart Monteith 
>> <stukato@stoo.me.uk>wrote:
>>
>>  
>>> Hi,
>>>   I was wondering what people's thoughts were regarding program 
>>> counters,
>>> line number table and variable tables.
>>> There is a tension between most users of the Kato API and the JDI 
>>> connector
>>> and its obligations towards supplying the information JDWP requires.
>>> JDWP, for the most part, would like to know the location of a stack 
>>> frame,
>>> i.e. a program counter normally, and using that to look up the variable
>>> tables and line number tables.
>>>
>>>     
>>
>> I do see JDWP as a major use case  for us so we must make sure that 
>> our JDI
>> connector is first class.
>>
>>   
> Agreed. I know that the katoView tomcat commands would benefit too - 
> the FFDC scenario.
>>> Some changes have been made to the API to supply the local variables
>>> (through JavaStackFrame.getVariable(int)) and their locations/types 
>>> through
>>> JavaMethod.getVariables().
>>> However, we haven't resolved the issue of the program counter or the 
>>> line
>>> numbers.
>>>
>>> Just now the line numbers are available through
>>> JavaLocation.getLineNumber(), where available. However, JDWP never 
>>> asks for
>>> a stack frame's line number, it maps from a stack frame's
>>> location to a line number using the line number table from a stack 
>>> frame's
>>> method.
>>>
>>>     
>>
>>  
>>> So, should we forgo having the simple JavaLocation.getLineNumber() 
>>> and only
>>> supply the line number table (where appropriate)?
>>>     
>>
>>
>> I was thinking "So what's the use of the getLineNumber method?" but
>> outside the JDWP scenario it does enable simple access to the 
>> line numbers
>> (i.e. via the xpath approach). The question is how much use that is 
>> and what
>> we'd be encumbering the implementations with. Since what the 
>> JDI
>> does is "standard" in its mapping, the RI could provide that code 
>> for
>> implementors to use.
>>
>>   
> I think the concern I have is that not all implementations would be 
> able to supply a program counter or a line number table.
> For instance, the hprof file stores only the line numbers. However, we 
> shouldn't get too hung up on this as our implementation
> for hprof was always going to be limited.
> It is important that we supply all of the necessary information, and 
> supply either helper methods on top or within API to make
> it more digestible for the majority of implementations.
>
>>  
>>> Of course, having a "getProgramCounter()" method would be useful, 
>>> but what
>>> should we do for compiled methods? There is a strong requirement for 
>>> us to
>>> return the contents
>>> of local variables in compiled methods as well as interpreted methods.
>>> However, that requires synthesizing a bytecode program counter to 
>>> retrieve
>>> the correct variables, which implies
>>> that line numbers could be generated too. However, as with C, etc, the
>>> debugging information derived from optimized code is usually 
>>> inaccurate.
>>>     
>>
>>
>>  
>>> For line numbers, I imagine we'd either have the line numbers or not if
>>> they are inaccurate. But for local variables, it would be sensible 
>>> to alter
>>> the variable table information to suit the
>>> optimized code, to give a consistent picture.
>>>
>>>     
>>
>> I think we need to examine this in more detail -  got an example?
>>
>>
>>   
> My experience of the JIT is somewhat limited, but certainly when 
> debugging C programs with optimization,
> it is usual that variables are optimized out, loops unrolled, code 
> reordered, such that  the variable contents and
> line numbers don't match the source. Having said that, I'm sure there 
> are others who could make more authoritative
> comments on this area.
>
> Take:
>    for(int a=0, b=0; a<10; a++) {
>       b = a*2;
>       array[a][b] = array2[a];
>    }
>
> if the compiler did this:
>
>    for(int a=0; a<10;a++) {
>       array[a][a*2] = array2[a];
>    }
>
> Then the local variable "b" would no longer exist in any meaningful 
> sense. My suggestion would be to remove "b" from the
> variable table. Of course, we could have two stack frames in the same 
> method with different levels of optimization, but I believe that's
> probably still an issue anyhow.
>
>
>>  
>>> Regards,
>>>   Stuart
>>>
>>>
>>> Stuart Monteith wrote:
>>>
>>>    
>>>> Hi,
>>>>   I've been looking at local variables in relation to the JDI 
>>>> connector.
>>>> For the BOF at JavaOne we'd like for there to be a prototype of  local
>>>> variable support in the API. I've been looking at what JDWP 
>>>> requires as we
>>>> would have to be able to satisfy its queries using the Kato API. 
>>>> This has
>>>> made me lean towards exposing the variable table and have us 
>>>> retrieve the
>>>> local variables from the stack frames by slot number.
>>>>
>>>> So my suggestion for the API is this:
>>>>
>>>> ---------------------------------
>>>>
>>>> JavaMethod
>>>> -------------
>>>>
>>>> // returns all local variables
>>>> // empty if there are no variables.
>>>> Iterator<JavaVariable> getVariables() throws DataUnavailable;
>>>>
>>>> JavaVariable
>>>> -------------
>>>>
>>>> // Local variable's name
>>>> // throws DataUnavailable if the variable was derived from bytecode 
>>>> and so
>>>> the name is unknown. Caller is free to make a name up.
>>>> String getName() throws DataUnavailable;
>>>>
>>>> // The local variable's signature in JNI format.
>>>> String getSignature();
>>>>
>>>> // The start of the local variable's scope within the bytecode.
>>>> int getStart();
>>>>
>>>> // The number of bytes this variable's scope covers in the bytecode.
>>>> int getLength();
>>>>
>>>> // The slot this variable occupies. Passed to 
>>>> JavaStackFrame.getVariable()
>>>> to retrieve the contents.
>>>> int getSlot();
>>>>
>>>>
>>>> JavaStackFrame
>>>> ------------------
>>>>
>>>> // Gets the value of a variable from a stack frame.
>>>> // Returns a JavaObject for an object reference, null for a null 
>>>> object
>>>> reference. Primitives are returned as boxed primitives.
>>>> // throws CorruptDataException if object reference is incorrect, or 
>>>> if the
>>>> float or double are set to invalid values.
>>>> // throws DataUnavailable if this method is not supported or if 
>>>> stack not
>>>> in correct state to return variables.
>>>> // throws IndexOutOfBoundsException if an invalid slot number is 
>>>> passed.
>>>> Object getVariable(int slot) throws CorruptDataException, 
>>>> DataUnavailable,
>>>> IndexOutOfBoundsException;
>>>>
>>>>
>>>> ---------------------------------
>>>>
>>>> The bytecode offset can be calculated with:
>>>>   JavaLocation.getAddress() - (
>>>> JavaMethod.getBytecodeSections().next().getBase().getAddress())
>>>> but I think that might be a little too tedious, and doesn't allow
>>>> cleverness with JITted frames. So we will probably have to add:
>>>>
>>>> // Return program counter in bytecode.
>>>> int JavaLocation.getBytecodePC();
>>>>
>>>> alternatively the JavaVariable.getStart() would use absolute 
>>>> addresses,
>>>> which could conceivably work with JITed frames, if the tables are 
>>>> maintained
>>>> during compilation.
>>>>
>>>> We should also expose the line number table too as that will aid class
>>>> file reproduction and queries for line numbers based on bytecode 
>>>> program
>>>> counters.
>>>>
>>>> A slightly different scheme would have the 
>>>> JavaStackFrame.getVariable(int
>>>> slot) method look like:
>>>>   Object getVariable(JavaVariable var);
>>>> but I don't think it gains us much.
>>>>
>>>> Retrieving all of the variables would therefore look something like 
>>>> this:
>>>>
>>>> void dumpVariables(JavaThread thread) throws Exception {
>>>>   Iterator frames = thread.getStackFrames();
>>>>   while (frames.hasNext()) {
>>>>      JavaStackFrame frame = (JavaStackFrame) frames.next();
>>>>      JavaLocation location = frame.getLocation();
>>>>      JavaMethod method = location.getMethod();
>>>>      int pc = location.getBytecodePC();
>>>>      System.out.println(location.toString()+":");
>>>>
>>>>      Iterator variables = method.getVariables();
>>>>      while (variables.hasNext()) {
>>>>         JavaVariable variable = (JavaVariable) variables.next();
>>>>
>>>>         if (pc >= variable.getStart() && pc <
>>>> variable.getStart()+variable.getLength()) {
>>>>            Object value = frame.getVariable( variable.getSlot());
>>>>            System.out.println("\t"+ variable.getSignature()+" "
>>>>               +variable.getName()+" = "+ value);
>>>>         }
>>>>      }
>>>>   }
>>>> }
>>>>
>>>> Let me know what you think,
>>>>   Stuart
>>>>
>>>> Stuart Monteith wrote:
>>>>
>>>>      
>>>>> Hello,
>>>>>   With Steve's work on JVMTI/python coming along, the issue of 
>>>>> what to do
>>>>> about local variables is coming up. Currently there is no means to 
>>>>> determine
>>>>> the names and values of local variables through the current API.
>>>>>
>>>>> The most obvious way of implementing this is to have the API do 
>>>>> all of
>>>>> the processing by exposing the variables as name and value pairs.
>>>>>
>>>>> For example:
>>>>> interface JavaStackFrame {
>>>>>   List<LocalVariable> getLocalVariables();
>>>>> }
>>>>>
>>>>> where:
>>>>>
>>>>> interface LocalVariable {
>>>>>   String getName();
>>>>>   Object getValue();
>>>>> }
>>>>>
>>>>> Where the value is a JavaObject, or a boxed primitive.
>>>>>
>>>>> The other extreme is for the necessary information to be made 
>>>>> available
>>>>> for the callers of the API to generate this information themselves.
>>>>> This would mean properly exposing:
>>>>>   Program Counter - currently we have JavaLocation.getAddress(), 
>>>>> which is
>>>>> an address in memory, rather than a bytecode program counter. For 
>>>>> JITted
>>>>> frames we'd still need the bytecode program counter.
>>>>>   Local variable table - this is to determine which variables 
>>>>> there are,
>>>>> their types and their indexes into the local variable array
>>>>>   Local variable array - the contents of the local variables need 
>>>>> to be
>>>>> exposed, and their proper types should be returnable (JavaObject, 
>>>>> int, etc).
>>>>>
>>>>> Doing it that way might be beneficial for more user stories, there is
>>>>> more information available to reconstruct the class file, for 
>>>>> instance.
>>>>> There is also the small matter of what to do when the local variable
>>>>> table is not available. When the API exposes all that it knows the 
>>>>> values
>>>>> might still be retrievable, although I have my doubts as to how 
>>>>> useful that
>>>>> would be if you don't know the types.
>>>>>
>>>>> Thoughts?
>>>>>   Stuart
>>>>>
>>>>>
>>>>>         
>>
>>   
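
The quoted thread notes that JDWP never asks for a frame's line number directly but maps the
frame's location to a line through the method's line number table. A minimal Java sketch of that
lookup follows. It assumes the table is available as {startPc, line} pairs, as in the class file
LineNumberTable attribute; the class and method names (LineNumberTableSketch, lineForPc) are
hypothetical and are not part of the Kato API.

```java
// Hypothetical helper; not part of the Kato API.
final class LineNumberTableSketch {

    /**
     * Returns the source line for a bytecode pc, or -1 if no entry covers it.
     * Each table entry is {startPc, line}. Entries need not be sorted, so the
     * entry with the greatest startPc that is still <= pc wins.
     */
    static int lineForPc(int[][] table, int pc) {
        int bestStart = -1;
        int line = -1;
        for (int[] entry : table) {
            int startPc = entry[0];
            if (startPc <= pc && startPc > bestStart) {
                bestStart = startPc;
                line = entry[1];
            }
        }
        return line;
    }
}
```

A helper of this kind is the sort of "standard" mapping code that could live in the RI, so that
implementations which can only supply a table (or only line numbers) are not each burdened with it.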

