incubator-kato-spec mailing list archives

From Nicholas Sterling <Nicholas.Sterl...@Sun.COM>
Subject Re: JSR 326: Post mortem JVM Diagnostics API - Developing the Specification
Date Thu, 01 Jan 2009 22:44:58 GMT

Hi, Steve.  Thought I'd use the free time over the holidays to finally 
respond to you!  :^)

First of all, this is great -- I like the idea of working from such 
ideas, and certainly appreciate all the work that you are putting into 
it.  Comments inline.

Nicholas



Steve Poole wrote:
> Greetings
>
> I had intended to post this document as a wiki page but we don't have a wiki
> yet!
>
> The following document is a work-in-progress.  It is ultimately intended
> to capture the approach, process and scope for the development of this
> specification.  This is very much an initial brain dump, so please feel
> free to point out any inaccuracies, omissions, trademark violations etc.!
>
> In particular I would like to receive feedback on the list of proposed
> sample tools.  The tools are proposed as examples that could be developed
> to demonstrate the validity of the specification.  It's likely that not
> all of these tools are necessary or doable in the timescales of the first
> release.
>
> EG members - please let me know one way or another whether you consider
> the list acceptable and descriptive enough for us to start expanding it
> into more detailed user stories.  Note that we may not end up actually
> producing all of these tools - but that should not stop us as
> specification designers from defining the necessary user stories.
>
> Thanks
>
> Steve
>
>
> ---------------------------------------
>
> JSR 326: Post mortem JVM Diagnostics API -  Developing the Specification
>
> Version Info
>
> Initial : Steve Poole  12 Dec 2008
>
> 1.0: JSR Objectives
>
> Define a standard Java API to support the generation and consumption of
> post mortem or snapshot Java diagnostic artefacts.  The specification
> will be inclusive of a range of existing "in the field" diagnostic
> artefacts, including common operating system dump formats.  The
> specification will balance the need to provide maximum value from
> existing artefacts against the problem space expected to be encountered
> in the near and longer term, with multiple language environments and very
> large heaps.
>
>
> 2.0: Approach
>
> The design of the API will be driven directly by user stories.  To
> ensure coherence between user stories, the stories will themselves be
> developed as requirements on several sample tools.  The project does not
> seek to create state-of-the-art tools, but recognises that having useful
> and usable sample tools is crucial in demonstrating the validity of the
> API and will encourage others to build alternative "better mousetraps".
> These example tools will also help define the non-functional
> characteristics that are not easily translated into user stories -
> characteristics such as scalability, performance, tracing etc.
>
> These tools and the embodiment of the JSR specification - i.e. the
> reference implementation (RI) and technology compatibility kit (TCK) -
> are being developed as an Apache Software Foundation Incubator project.
> The JSR Expert Group (EG) and the Reference Implementation developers
> will work together to define, develop and refine the specification.  The
> specification is intended to be developed incrementally and will always
> be available via the RI API Javadoc.  As the JSR moves through its
> various stages, the specification at each point will be declared by
> referring to a publicly visible form of the Javadoc and the associated
> repository revision.
>
>
> 3.0: Initial starting point.
>
> IBM is contributing non-proprietary portions of its Diagnostic Tool
> Framework for Java (DTFJ) and associated tools, samples, documentation
> and testcases.  This contribution is only a seed: the JSR EG must review
> and amend this API as necessary to meet its requirements.  EG members
> can also contribute directly to the specification by providing
> testcases, code samples etc.
>
>
> 4.0: API Structure
>
> Analysis of the types of dump suitable for inclusion in the scope of
> this JSR shows that there are three basic categories: 1) dumps that
> contain process information, 2) dumps that contain information about a
> Java runtime, and 3) dumps that are limited to the contents of a Java
> heap.  Generally these dumps are inclusive in the sense that, for
> instance, a process dump normally contains a Java runtime, which in turn
> contains information about the contents of the Java heap.  The inverse
> is not true.  This categorisation is used in this document and will be
> used to help structure the development of the JSR.  The categorisation
> should not be assumed to be set in stone.
>   
Another possibility, I suppose: groveling around in an OS crash dump, at 
least to iterate through the Java processes.  Perhaps a future addition?
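
By the way, here is how I picture the three categories and their 
containment as Java types -- purely a sketch with invented names, not 
anything taken from the DTFJ contribution:

    // Hypothetical sketch of the three artifact categories in 4.0 and
    // how they nest.  All of these names are invented, not the seed API.

    import java.util.List;

    /** Category 1: a process dump; may contain zero or more Java runtimes. */
    interface ProcessImage {
        List<JavaRuntime> javaRuntimes();
    }

    /** Category 2: a Java runtime dump; always contains a heap. */
    interface JavaRuntime {
        JavaHeap heap();
        List<JavaThread> threads();
    }

    /** Category 3: a heap-only dump, e.g. an HPROF binary dump. */
    interface JavaHeap {
        Iterable<JavaObject> objects();
    }

    interface JavaThread { /* stack frames, monitors, ... */ }
    interface JavaObject { /* class, fields, references, ... */ }

A heap-only artifact would hand you just a JavaHeap, while a process dump 
would surface all three layers -- which is the inclusiveness you describe.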
>
> 5.0: Sample Tools – the primary drivers for developing JSR user stories
>
>
>
> 5.1: Process Explorer
>
> An Eclipse plugin which allows presentation, navigation and simple
> querying of the elements of a process dump.  This tool will demonstrate
> how to explore a dump in an efficient and high-performing manner.  Key
> characteristics will include fast startup time, handling of large
> quantities of data (including summarization), and effective navigation
> to areas of interest.
>   
Don't you think it would be better to do this for NetBeans?  :^)  :^)
> 5.2: Native Memory Analyser Tool
>
> A program that can retrieve native memory allocations made by (or on
> behalf of) a Java runtime and trace them back to the Java objects that
> hold the allocation reference.  The tool will be able to display what
> memory allocations exist and the contents of each allocation, and
> conservatively identify which entities hold references to that
> allocation.  Ideally this tool will be able to point to the specific
> fields within Java objects that hold the references.  This tool will
> demonstrate the capabilities of the API to find and display native
> memory from a memory manager.  Key characteristics will include the
> performance of the API in exhaustively scanning a dump (for memory
> allocation handles) and the ability to resolve an arbitrary location
> within the dump into a Java object or similar entity.
>   
Let's take the 10,000-foot view for a second.  When you said "user 
stories," I thought you meant something more along the lines of "use 
cases."  A use case would back up from the solution a bit to a person 
with a problem, e.g.

    A developer is trying to find out why her Java program occasionally
    holds on to several hundred megabytes of virtual memory for a
    while.  So far she has learned that a large part of this is not in
    the heap.  Information that would help her solve the problem
    includes:

        * How many native memory allocations there are of each size.
        * Where/when the allocations were done.
        * What objects in the heap hold references to these allocations
          (and recursively which objects hold references to those).

    The developer has managed to reproduce the problem while using
    dmalloc <http://sourceforge.net/projects/dmalloc>, so the
    information is available...

That is, begin with the problem and work forward to the solution.  In 
the case of the process explorer, what is going on when somebody is 
using it?  Is it a sustaining engineer looking at a customer's Java core 
dump?  What is it they need to know/do in order to solve their problem, 
and *then* voila -- introduce the process explorer, explain what it 
allows them to do that they couldn't do with other tools, and how that 
helps them with their problem.

Certainly the NMAT is an excellent use of the API...
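
Just to make that concrete, here is the sort of loop I imagine the NMAT 
driving -- again with invented names (NativeAllocation, findReferrer) 
layered on the sketch interfaces above:

    // Hypothetical sketch: walk the native allocations recorded for a
    // runtime and conservatively map each one back to the Java object
    // that holds the allocation handle.  Names invented for illustration.
    static void reportNativeAllocations(JavaRuntime runtime) {
        for (NativeAllocation alloc : runtime.nativeAllocations()) {
            // Resolve an arbitrary dump address back into a heap object.
            JavaObject holder = runtime.heap().findReferrer(alloc.address());
            System.out.printf("%d bytes at 0x%x held by %s%n",
                alloc.size(), alloc.address(),
                holder == null ? "<unknown>" : holder.className());
        }
    }

That findReferrer() step is exactly the "resolve an arbitrary location 
into a Java object" characteristic called out in 5.2, and it is where the 
exhaustive-scan performance will show up.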
> 5.3: Java Runtime Explorer
>
> Similar to the Process Explorer above, this Eclipse plugin will allow
> the presentation, navigation and simple querying of the elements of a
> Java runtime dump.  This tool will demonstrate how to explore a Java
> runtime dump in an efficient and high-performing manner.  Ideally the
> plugin will also demonstrate the API's ability to provide some support
> for virtualisation of Java runtime objects, so that implementation
> specifics concerning objects within the java.lang and java.util packages
> can be hidden.  Key characteristics will include fast startup time,
> handling of large quantities of data (including summarization),
> effective navigation to areas of interest, and useful abstraction of key
> Java object implementation specifics.
>   
Do you envision these as separate tools, or as separate functions of a 
single tool?
> 5.4: Runtime Investigator
>
> A program that can examine a dump and provide guidance on common
> ailments.  This tool will provide analysis modules that can report on
> such items as deadlock analysis, heap occupancy etc.  The tool will
> provide extension points that will allow others to contribute new
> analysis modules.  Key characteristics of this tool will include
> handling large quantities of data efficiently (probably via a query
> language of some type), ensuring the API is generally consumable by
> programmers, and ensuring the API provides the data that is actually
> required to analyze real problems.
>   
I wonder whether Derby could be used to provide a SQL front end to 
relational representations of what is in the heap.  Perhaps the output 
of an analysis module could be a set of relations, and then you can use 
SQL on that; something like this:

     > run BasicAnalysis
    Creating tables:
      HEAP_OBJECTS
      CLASSES
      CLASSLOADERS
      ...

     > SELECT  sum(obj.obj_size) / 1024 AS kbytes,
               count(*) AS num_objs,
               class.name
         FROM  heap_objects obj
         JOIN  classes class
           ON  obj.class_id = class.id
        WHERE  class.classloader_id IS NOT NULL
        GROUP  BY class.name
        ORDER  BY kbytes DESC
    ;

      KBYTES NUM_OBJS CLASS
    -------- -------- ------------------------------
       83544     9215 Foo
       17219    21484 Bar
        7482     3872 Zot
         836     8639 Woz
         ...

Hopefully that isn't too ridiculous an example.
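
If the Derby idea has legs, the analysis-module side of it is 
mechanically small.  Here is a sketch of a module publishing its results 
as a table -- the JDBC and Derby pieces are standard, but JavaObject and 
the heap iteration are stand-ins for whatever the JSR 326 API ends up 
providing:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // Sketch: publish heap contents as a Derby relation for ad hoc SQL.
    // Assumes derby.jar is on the classpath (JDBC 4 auto-loads the driver).
    static void publish(Iterable<JavaObject> heapObjects) throws SQLException {
        Connection conn =
            DriverManager.getConnection("jdbc:derby:dumpdb;create=true");
        conn.createStatement().execute(
            "CREATE TABLE heap_objects (class_id INT, obj_size BIGINT)");
        PreparedStatement ins = conn.prepareStatement(
            "INSERT INTO heap_objects VALUES (?, ?)");
        for (JavaObject obj : heapObjects) {   // hypothetical API iteration
            ins.setInt(1, obj.classId());
            ins.setLong(2, obj.size());
            ins.executeUpdate();
        }
        ins.close();
        conn.close();
    }

For a multi-gigabyte heap you would want batched inserts and an index or 
two, but that is the whole trick.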
> 5.5: Java Runtime Trend Analyzer
>
> A tool that can compare multiple dumps and provide trend analysis.  This
> tool will provide analysis modules that can report on such items as heap
> growth etc.  The tool will provide extension points that will allow
> others to contribute new analysis modules.  Key characteristics of this
> tool will include exercising the creation of well-formed dumps, fast
> startup time, correlation between dump objects, handling large
> quantities of data efficiently (probably via a query language of some
> type), ensuring the API is generally consumable by programmers, and
> ensuring the API provides the data that is actually required to analyze
> real problems.
>   
If you continue in the same Derby theme as above, focusing on one dump 
at a time and running some analysis and producing some tables each time, 
then you could use SQL to compare the results, e.g.

    SELECT  ...
    FROM (
        SELECT <stuff from dump 2 tables>
      EXCEPT ALL
        SELECT <stuff from dump 1 tables>
    ) new_stuff

Am I going off the deep end here?  :^)  Maybe I've been doing too much 
database work recently; everything is starting to look like a relation 
to me...

Notice that with this approach you could actually compare the results of 
running the same app on two different VMs.  There would be lots of 
differences between the two runtimes, but who cares -- the analysis 
modules would extract only the parts that are relevant to the comparison.
>
> 5.6: Java Debug Interface (JDI) Connector
>
> An adapter that allows a Java debugger to interrogate the contents of a
> Java runtime diagnostic artifact.  This connector will enable
> capabilities similar to those that exist today with other debuggers that
> can debug corefiles or similar process diagnostic artifacts.  This tool
> will demonstrate key characteristics such as effective navigation to
> areas of interest, useful abstraction of key Java object implementation
> specifics, and provision of the data that is required to analyze real
> problems.
>   
Effectively a vendor-independent version of what you can do with the 
Serviceability Agent (and probably IBM's equivalent) now.
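
And the consuming side is already pinned down by JDI, which is much of 
the appeal.  Something like this is all a debugger would need -- the 
connector name and its "core" argument are invented here, but the 
surrounding JDI calls are the standard ones:

    import com.sun.jdi.Bootstrap;
    import com.sun.jdi.VirtualMachine;
    import com.sun.jdi.connect.AttachingConnector;
    import com.sun.jdi.connect.Connector;
    import java.util.Map;

    // Attach a JDI client to a dump via a (hypothetical) JSR 326 connector.
    static VirtualMachine attachToDump(String corePath) throws Exception {
        for (AttachingConnector ac :
                Bootstrap.virtualMachineManager().attachingConnectors()) {
            if (ac.name().equals("DumpAttachingConnector")) { // invented name
                Map<String, Connector.Argument> args = ac.defaultArguments();
                args.get("core").setValue(corePath);
                return ac.attach(args);   // a read-only VirtualMachine
            }
        }
        throw new IllegalStateException("dump connector not registered");
    }

The read-only nature is the only wrinkle: resume(), setValue() and 
friends would have to throw, presumably the way the SA connectors handle 
it today.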
>
> 5.7: Memory Analyser Tool (MAT) Adapter
>
> MAT (http://www.eclipse.org/mat/) is an open source project that
> consumes HPROF and DTFJ-supported dumps.  MAT is designed to help find
> memory leaks and reduce memory consumption.  An adapter for MAT will be
> developed that allows MAT to consume HPROF and other dumps via the JSR
> 326 API.  Key characteristics of this adapter will include handling
> large quantities of data efficiently, useful abstraction of key Java
> object implementation specifics, and dump type identification.
>
>
>
> 6.0: Reference Implementation Scope
>
> The Reference Implementation will not provide implementations for all
> JVMs or diagnostic artifacts.  The scope of the project is to encompass
> only the open, public and most-used combinations.  The initial proposal
> for the API defines three separate categories of diagnostic artifact.
> The Reference Implementation will be developed to consume the following
> diagnostic artifacts from those categories.
>
> 6.1: Process Level Diagnostic Artifacts
>
> Operating System       : Diagnostic Artifact
> Linux Ubuntu 8.10 x86  : ELF format process dump (1)
> Microsoft Windows XP   : Microsoft userdump (2)
> IBM AIX 6.1            : AIX corefile (3)
>
> (1) The ELF format is a publicly available format described in many
> places – usually starting with elf.h!
>
> (2) Microsoft userdumps are in minidump format. Description starts here
> http://msdn.microsoft.com/en-us/library/ms680378(VS.85).aspx
>
> (3) The IBM AIX corefile format is publicly documented:
> http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.files/doc/aixfiles/core.htm
>
>
> 6.2: Java Runtime Diagnostic Artifacts
>
>
> JVM                                  : Diagnostic Artifact
> Sun Linux/Windows OpenJDK 6.0 JRE    : HPROF binary format
> Sun Windows OpenJDK 6.0 JRE          : Microsoft userdump
> Sun Linux x86 OpenJDK 6.0 JRE        : ELF format process dump
> Sun Linux/Windows OpenJDK 6.0 JRE    : Serviceability Agent API (1)
> Sun Linux/Windows Java 1.4.2_19 JRE  : HPROF binary format (2)
>
>
>
> (1) Assumes a classpath exception for the API can be granted by Sun
> Microsystems.
> (2) Sun Java 1.4.2_19 JRE support will be on a best-effort basis, since
> information about the internal structures of the JRE is not publicly
> available.  In the event that critical information is required, we will
> ask Sun Microsystems for help and request that they publish the
> information.
>
>
> 6.3: Java Heap Diagnostic Artifacts
>
> The Java Heap category of the API is effectively a subset of the Java
> Runtime API, and thus the list below is the same as above.
>
> JVM                                  : Diagnostic Artifact
> Sun Linux/Windows OpenJDK 6.0 JRE    : HPROF binary format
> Sun Windows OpenJDK 6.0 JRE          : Microsoft userdump
> Sun Linux x86 OpenJDK 6.0 JRE        : ELF format process dump
> Sun Linux/Windows OpenJDK 6.0 JRE    : Serviceability Agent API
> Sun Linux/Windows Java 1.4.2_19 JRE  : HPROF binary format
>
>
> 6.4: Other dump formats and implementations
>
> During this project IBM will be producing prototype implementations of
> the API for some subset of IBM JREs and dump formats.  This will provide
> welcome feedback on the API and allow early adopters a broader set of
> environments to work with.
>
>
> 7.0: Closing the seed contribution shortfall
>
> 8.0: Timescales, schedule, milestones
>
>   
