incubator-kato-spec mailing list archives

From "Steve Poole" <spoole...@googlemail.com>
Subject Re: JSR 326: Post mortem JVM Diagnostics API - Developing the Specification
Date Mon, 05 Jan 2009 23:09:14 GMT
On Thu, Jan 1, 2009 at 10:44 PM, Nicholas Sterling <
Nicholas.Sterling@sun.com> wrote:

>
> Hi, Steve.  Thought I'd use the free time over the holidays to finally
> respond to you!  :^)
>
> First of all, this is great -- I like the idea of working from such ideas,
> and certainly appreciate all the work that you are putting into it.
>  Comments inline.
>
> Nicholas
>
>
>
>
> Steve Poole wrote:
>
>> Greetings
>>
>> I had intended to post this document as a wiki page but we don't have a
>> wiki
>> yet!
>>
>> The following document is a work-in-progress.  This document is ultimately
>> intended to capture the approach, process and scope for the development of
>> this specification.  This is very much an initial brain dump so please
>> feel free to point out any inaccuracies, omissions, trade mark violations
>> etc!
>>
>> In particular I would like to receive feedback on the list of proposed
>> sample tools.  The tools are proposed as examples that could be developed to
>> demonstrate the validity of the specification.  It's likely that not all of
>> these tools are necessary or doable in the timescales of the first
>> release.
>>
>> EG members - please let me know one way or another if you consider the
>> list
>> to be acceptable  and descriptive enough for us to start expanding into
>> more
>> detailed user stories.  Note we may not end up actually producing  all of
>> these tools - but that should not stop us as Specification designers from
>> defining the necessary user stories.
>>
>> Thanks
>>
>> Steve
>>
>>
>> ---------------------------------------
>>
>> JSR 326: Post mortem JVM Diagnostics API -  Developing the Specification
>>
>> Version Info
>>
>> Initial : Steve Poole  12 Dec 2008
>>
>> 1.0: JSR Objectives
>>
>> Define a standard Java API  to support the generation and consumption of
>> post mortem or snapshot Java diagnostic artefacts. The specification will
>> be
>> inclusive of a range of existing "in field" diagnostic artefacts:
>> including
>> common operating system dump formats. The specification will balance the
>> need to provide maximum value from existing artefacts, while considering
>> the
>> problem space expected to be encountered in the near and longer term
>> future,
>> with multiple language environments and very large heaps
>>
>>
>> 2.0: Approach
>>
>> The design of the API will be driven directly by user stories.    To
>> ensure
>> coherence between user stories the stories will themselves be developed as
>> requirements on several sample tools.  The project does not seek to create
>> state of the art tools but recognises that having useful and useable
>> sample
>> tools is crucial in  demonstrating the validity of the API and will
>> encourage others to build alternative "better mouse-traps". These example
>> tools will also help define the non-functional characteristics that are
>> not
>> easily translated into user stories -  characteristics such as
>> scalability,
>> performance, tracing etc.
>>
>> These tools and the embodiment of the JSR specification, i.e. the
>> reference implementation (RI) and Technology Compatibility Kit (TCK), are
>> being developed as an Apache Software Foundation Incubator project. The JSR
>> Expert Group (EG) and the Reference Implementation developers will work
>> together to define, develop and refine the specification etc.  The
>> specification is intended to be incrementally developed and will always be
>> available via the RI API Javadoc.  As the JSR moves through its various
>> stages the specification at that point will be declared by referring to a
>> publically visible form of the Javadoc and the associated repository
>> revision.
>>
>>
>> 3.0: Initial starting point.
>>
>> IBM is contributing  non proprietary portions of its  Diagnostic Tool
>> Framework for Java (DTFJ) and associated  tools, samples , documentation
>> and
>> testcases.  This contribution is  only a seed. The JSR EG must review and
>> amend this API as necessary to meet its requirements.   EG members can
>> also
>> contribute directly to the specification by providing testcases or  code
>> samples etc.
>>
>>
>> 4.0: API Structure
>>
>> Analysis of the types of dump  suitable for including in the scope of this
>> JSR shows that there are three basic categories.  These categories are  1)
>> dumps that contain process information, 2) Dumps that contain information
>> about a Java runtime and 3) dumps that are limited to the contents of a
>> Java  Heap.  Generally these dumps are inclusive in the sense that, for
>> instance, a process dump normally contains a Java Runtime and it in turn
>> contains information about the contents of the Java heap.   The inverse of
>> this is not true.  This categorisation is used in this document and will
>> be
>> used to  help structure the development of the JSR. The categorisation
>> should  not be assumed to be set in stone.
>>
>>
> Another possibility, I suppose: groveling around in an OS crash dump, at
> least to iterate through the Java processes.  Perhaps a future addition?


Actually, it's been pointed out to me that, at least on IBM z/OS, it's quite
common to have multiple address spaces and multiple Java runtimes present in
a dump today.
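
To make that nesting concrete, here is a minimal Java sketch of a dump that
holds several address spaces, each with zero or more Java runtimes.  Every
type and method name in it (DumpImage, AddressSpace, JavaRuntime, JavaHeap,
countRuntimes, sample) is a made-up placeholder for discussion and is not
taken from DTFJ or from any agreed JSR 326 API.

```java
import java.util.List;

// Hypothetical types for illustration only; not the DTFJ or JSR 326 API.
final class DumpNestingSketch {
    record JavaHeap(String name, long objectCount) {}
    record JavaRuntime(String label, List<JavaHeap> heaps) {}
    record AddressSpace(String id, List<JavaRuntime> runtimes) {}
    record DumpImage(String source, List<AddressSpace> addressSpaces) {}

    // Counts every Java runtime found in any address space of the dump.
    static long countRuntimes(DumpImage image) {
        return image.addressSpaces().stream()
                .mapToLong(space -> space.runtimes().size())
                .sum();
    }

    // Builds an invented dump: two address spaces holding three runtimes in total.
    static DumpImage sample() {
        JavaRuntime r1 = new JavaRuntime("runtime-A", List.of(new JavaHeap("default", 1000)));
        JavaRuntime r2 = new JavaRuntime("runtime-B", List.of(new JavaHeap("default", 2000)));
        JavaRuntime r3 = new JavaRuntime("runtime-C", List.of());
        return new DumpImage("example.dmp", List.of(
                new AddressSpace("AS1", List.of(r1, r2)),
                new AddressSpace("AS2", List.of(r3))));
    }

    public static void main(String[] args) {
        System.out.println("Java runtimes in dump: " + countRuntimes(sample()));
    }
}
```

The only point of the sketch is that a caller has to iterate at each level
rather than assume one process and one runtime per dump.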

>
>
>> 5.0: Sample Tools – the primary drivers for developing  JSR User Stories
>>
>>
>>
>> 5.1: Process Explorer
>>
>> An Eclipse plugin which allows presentation, navigation and simple
>> querying
>> of the elements of a Process Dump.   This tool will demonstrate how to
>> explore a dump in an efficient and high performing manner.   Key
>> characteristics will include  fast startup time, handling of large
>> quantities of data (including summarization), effective navigation to
>> areas
>> of interest.
>>
>>
> Don't you think it would be better to do this for NetBeans?  :^)  :^)


I don't have an axe to grind here :-)  I personally like Eclipse more than
NetBeans and have written quite a few Eclipse plugins.  However, it would be
good to have a plugin for another IDE to help validate that our assumptions
about what a UI program looks like are valid.  There is ample opportunity to
include a NetBeans plugin, but my team doesn't have the expertise to write it.

Any volunteers?


>  5.2: Native Memory Analyser Tool
>>
>> A program that can retrieve native memory allocations by (or on behalf of)
>> a
>> Java Runtime and provide trace back to the  Java objects that hold the
>> allocation reference.   The tool will be able to display what memory
>> allocations exist, the contents of the allocation, and conservatively
>> identify which entities hold references to that allocation.  Ideally this
>> tool will be able to point to specific fields within Java objects that
>> hold
>> the references.   This tool will demonstrate the capabilities   of the API
>> to find and display native memory  from a memory manager.    Key
>> characteristics will include the performance of the API in  exhaustively
>> scanning a dump (for memory allocation handles) and the ability to resolve
>> an arbitrary location within the dump into a Java object or similar entity
>>
>>
> Let's take the 10,000-foot view for a second.  When you said "user
> stories," I thought you meant something more along the lines of "use cases."
>  A use case would back up from the solution a bit to a person with a
> problem, e.g.
>
>   /A developer is trying to find out why her Java program occasionally
>   holds on to several hundred megabytes of virtual memory for a
>   while.  So far she has learned that a large part of this is not in
>   the heap.  Information that would help her solve the problem includes:
>   /
>
>       * /How many native memory allocations there are of each size./
>       * /Where/when the allocations were done./
>       * /What objects in the heap hold references to these allocations
>         (and recursively which objects hold references to those).
>         /
>
>   /The developer has managed to reproduce the problem while using
>   dmalloc <http://sourceforge.net/projects/dmalloc>, so the
>   information is available.../
>
> That is, begin with the problem and work forward to the solution.  In the
> case of the process explorer, what is going on when somebody is using it?
>  Is it a sustaining engineer looking at a customer's Java core dump?  What
> is it they need to know/do in order to solve their problem, and *then* voila
> -- introduce the process explorer, explain what it allows them to do that
> they couldn't do with other tools, and how that helps them with their
> problem.
>
> Certainly the NMAT is an excellent use of the API...
>
>> 5.3: Java Runtime Explorer
>>
>> Similar to the Process Explorer above this Eclipse plugin will allow the
>> presentation, navigation and  simple querying of the elements of a Java
>> Runtime dump. This tool will demonstrate how to explore a Java runtime
>> dump
>> in an efficient and high performing manner.   Ideally the plugin will also
>> demonstrate the API's ability to  provide some support for  virtualisation
>> of
>> Java runtime objects so that implementation specifics concerning objects
>> within  the java.lang and java.util packages can be hidden.   Key
>> characteristics will include  fast startup time, handling of large
>> quantities of data (including summarization), effective navigation to
>> areas
>> of interest, useful abstraction of key Java object implementation
>> specifics
>>
>>
> Do you envision these as separate tools, or as separate functions of a
> single tool?
>

I had looked at them as being parts of the same tool, but to be honest I'm
not yet sure of the scope and mechanisms required for the "virtualisation"
piece, so it's possible that they end up not being related.

>
>  5.4:  Runtime Investigator
>>
>> A program that can examine a dump and provide guidance on common ailments.
>> This tool will provide analysis modules that can report on such items as
>> deadlock analysis,  heap occupancy etc.  The tool will provide extension
>> points that will allow others to contribute new analysis modules.  Key
>> characteristics of this tool will include handling large quantities of
>> data
>> efficiently  (probably via a query language of some type) , ensuring the
>> API
>> is generally consumable by programmers and ensuring the API provides the
>> data that is actually required to analyze real problems.
>>
>>
> I wonder whether Derby could be used to provide a SQL front end to
> relational representations of what is in the heap.  Perhaps the output of an
> analysis module could be a set of relations, and then you can use SQL on
> that; something like this:
>
>    > run BasicAnalysis
>   Creating tables:
>     HEAP_OBJECTS
>     CLASSES
>     CLASSLOADERS
>     ...
>
>    > SELECT  round( sum(class.obj_size)/1024 ) as kbytes,
>             sum(1) as num_objs,
>             class.name
>       FROM  heap_objects obj
>       JOIN  classes class
>         ON  obj.class_id = class.id
>      WHERE  class.classloader_id IS NOT NULL
>      GROUP  BY class.name
>      ORDER  BY kbytes DESC
>   ;
>
>     KBYTES NUM_OBJ CLASS
>   -------- ------- ------------------------------
>      83544    9215 Foo
>      17219   21484 Bar
>       7482    3872 Zot
>        836    8639 Woz
>        ...
>
> Hopefully that isn't too ridiculous an example.
>

It's something I had considered before, but when you look into it the types of
query that are likely to be used don't map well to SQL.  You get things
like needing to visit array contents, or asking for an object's parents, or
whether it implements a particular interface etc.  I was leaning towards
something more like OQL or even JXPath.  The Eclipse Memory Analyser Tool folks
use OQL, and jhat, the HPROF heap dump browser in Java 6, does too.  I don't
want to rush into a selection since it could easily bite us later, though
there's no reason why we can't explore some of the more obvious choices.
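
As a rough illustration of the kind of question I mean, here is a small Java
sketch over an invented in-memory heap model.  The HeapObject record and the
referrersOf / implementsInterface helpers are hypothetical names for
discussion only; they are not part of DTFJ, MAT or any proposed JSR 326 API.
It shows a "who are this object's parents" lookup and an interface check,
both of which are graph-style questions rather than simple row filters.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified heap model; not the DTFJ, MAT or JSR 326 API.
final class ObjectGraphQuerySketch {
    record HeapObject(long id, String className, Set<String> interfaces, List<Long> references) {}

    // "Who are this object's parents?": every object holding a reference to targetId.
    static List<HeapObject> referrersOf(long targetId, Collection<HeapObject> heap) {
        List<HeapObject> parents = new ArrayList<>();
        for (HeapObject candidate : heap) {
            if (candidate.references().contains(targetId)) {
                parents.add(candidate);
            }
        }
        return parents;
    }

    // "Does it implement a particular interface?"
    static boolean implementsInterface(HeapObject object, String interfaceName) {
        return object.interfaces().contains(interfaceName);
    }

    // Invented three-object heap: a list (1) whose backing array (2) holds a string (3).
    static List<HeapObject> sampleHeap() {
        return List.of(
                new HeapObject(1, "java.util.ArrayList", Set.of("java.util.List"), List.of(2L)),
                new HeapObject(2, "java.lang.Object[]", Set.of(), List.of(3L)),
                new HeapObject(3, "java.lang.String", Set.of("java.lang.CharSequence"), List.of()));
    }

    public static void main(String[] args) {
        List<HeapObject> heap = sampleHeap();
        for (HeapObject parent : referrersOf(3, heap)) {
            System.out.println("referrer of 3: " + parent.className());
        }
        System.out.println("object 1 is a List: "
                + implementsInterface(heap.get(0), "java.util.List"));
    }
}
```

Following parents recursively, or walking array contents, means repeating
that lookup step by step, which is where a path- or object-oriented query
language may be a more natural fit than joins.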



>
>  5.5: Java Runtime Trend Analyzer
>>
>> A tool that can compare multiple dumps and provide trend analysis.  This
>> tool will provide analysis modules that can report on such items as  heap
>> growth etc The tool will provide extension points that will allow others
>> to
>> contribute new analysis modules.  Key characteristics of this tool will
>> include exercising  the creation of well formed dumps,  fast startup time,
>> correlation between dump objects and handling large quantities of data
>> efficiently  (probably via a query language of some type) , ensuring the
>> API
>> is generally consumable by programmers and ensuring the API provides the
>> data that is actually required to analyze real problems.
>>
>>
> If you continue in the same Derby theme as above, focusing on one dump at a
> time and running some analysis and producing some tables each time, then you
> could use SQL to compare the results, e.g.
>
>   SELECT  ...
>   FROM (
>       SELECT <stuff from dump 2 tables>
>     MINUS ALL
>       SELECT <stuff from dump 1 tables>
>   ) new_stuff
>
> Am I going off the deep end here?  :^)  Maybe I've been doing too much
> database work recently; everything is starting to look like a relation to
> me...
>

Ah - ok now I see why you are thinking about databases.
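
For comparison, the dump-to-dump difference can also be sketched directly in
Java, assuming each dump has already been reduced to a per-class object
count.  The class name HistogramDiffSketch, the growth method and the sample
numbers are all invented for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: compares two per-class object counts taken from two dumps.
final class HistogramDiffSketch {
    // Returns newer minus older for each class, leaving out classes that did not change.
    static Map<String, Long> growth(Map<String, Long> older, Map<String, Long> newer) {
        Map<String, Long> delta = new TreeMap<>();
        for (Map.Entry<String, Long> entry : newer.entrySet()) {
            long change = entry.getValue() - older.getOrDefault(entry.getKey(), 0L);
            if (change != 0) {
                delta.put(entry.getKey(), change);
            }
        }
        // Classes present only in the older dump have disappeared entirely.
        for (Map.Entry<String, Long> entry : older.entrySet()) {
            if (!newer.containsKey(entry.getKey())) {
                delta.put(entry.getKey(), -entry.getValue());
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        Map<String, Long> dump1 = Map.of("Foo", 10L, "Bar", 5L, "Zot", 2L);
        Map<String, Long> dump2 = Map.of("Foo", 25L, "Bar", 5L, "Woz", 1L);
        System.out.println(growth(dump1, dump2));
    }
}
```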

>
> Notice that with this approach you could actually compare the results of
> running the same app on two different VMs.  There would be lots of
> differences between the two runtimes, but who cares -- the analysis modules
> would extract only the parts that are relevant to the comparison.
>

Yes, understood.  I wonder if we could do the same with the other query
languages.  My thought was that it would be too expensive to dump a dump
into a database and then query it.  I expected that the query support
would need to be built "under the covers" into the API so that we could
have it as close to the data as possible and allow implementors the
opportunity for parallelisation and other optimisations.
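
A minimal sketch of what "under the covers" might look like, assuming a
hypothetical QueryableHeap interface (none of these names come from DTFJ or
an agreed API): the caller hands over only a filter, and the implementation
is free to decide how to evaluate it, here with a parallel stream.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical API shape for illustration; not the DTFJ or JSR 326 API.
final class PushDownQuerySketch {
    record HeapObject(long id, String className, long sizeBytes) {}

    // The caller supplies only the filter; how it is evaluated is left to the implementor.
    interface QueryableHeap {
        List<HeapObject> select(Predicate<HeapObject> filter);
    }

    // One possible implementation: evaluate the filter in parallel, next to the data.
    static final class InMemoryHeap implements QueryableHeap {
        private final List<HeapObject> objects;

        InMemoryHeap(List<HeapObject> objects) {
            this.objects = List.copyOf(objects);
        }

        @Override
        public List<HeapObject> select(Predicate<HeapObject> filter) {
            return objects.parallelStream().filter(filter).collect(Collectors.toList());
        }
    }

    // Invented sample data.
    static QueryableHeap sampleHeap() {
        return new InMemoryHeap(List.of(
                new HeapObject(1, "Foo", 4096),
                new HeapObject(2, "Bar", 64),
                new HeapObject(3, "Zot", 2048)));
    }

    public static void main(String[] args) {
        for (HeapObject big : sampleHeap().select(o -> o.sizeBytes() > 1024)) {
            System.out.println(big.className() + " " + big.sizeBytes());
        }
    }
}
```

A different implementor could answer the same select call from an index or
by streaming straight off the dump file, without the caller changing.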

Let's discuss!



>
>> 5.6: Java Debug Interface (JDI) Connector
>>
>> An adapter that allows a Java debugger to interrogate the contents of a
>> Java
>> Runtime diagnostic artifact.  This connector will enable capabilities
>> similar to those that exist today with other debuggers that can debug
>> corefiles or similar process diagnostic artifacts.  This tool will
>> demonstrate key characteristics such as effective navigation to areas of
>> interest, useful abstraction of key Java object implementation specifics
>> and that the API provides the data that is required to analyze real
>> problems.
>>
>>
> Effectively a vendor-independent version of what you can do with SA (and
> probably IBM's equivalent) now.


Yep


>
>> 5.7: Memory Analyser Tool (MAT) Adapter
>>
>> MAT (http://www.eclipse.org/mat/) is an open source project that consumes
>> HPROF and DTFJ supported dumps.  MAT is designed to help find memory leaks
>> and reduce memory consumption.  An adapter for MAT will be developed that
>> allows MAT to consume HPROF and other dumps via the JSR 326 API.   Key
>> characteristics of this adapter will include  handling large quantities of
>> data efficiently,  useful abstraction of key Java object implementation
>> specifics and  dump type identification.
>>
>>
>>
>> 6.0: Reference Implementation Scope
>>
>> The Reference Implementation will not create implementations for all  JVMs
>> or diagnostic artifacts.   The scope of the project  is to only encompass
>> the open , public and most used combinations.     The initial proposal for
>> the API defines three separate categories of diagnostic artifact.  The
>> Reference Implementation will be developed to consume the following
>> diagnostic artifacts from those categories
>>
>> 6.1: Process Level Diagnostic Artifacts
>>
>> Operating System  / Diagnostic Artifact
>> Linux Ubuntu 8.10 x86   : ELF format Process Dump (1)
>> Microsoft Windows XP   : Microsoft userdump (2)
>> IBM AIX 6.1                  : AIX corefile (3)
>>
>> (1) ELF Format is a publically available format described in many places –
>> usually starting with elf.h!
>>
>> (2) Microsoft userdumps are in minidump format. Description starts here
>> http://msdn.microsoft.com/en-us/library/ms680378(VS.85).aspx
>>
>> (3) IBM AIX corefile format is publically available
>>
>> http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.files/doc/aixfiles/core.htm
>>
>>
>> 6.2: Java Runtime Diagnostic Artifacts
>>
>>
>> JVM / Diagnostic Artifact
>> Sun Linux/Windows OpenJDK 6.0 JRE    : HPROF Binary format
>> Sun Windows OpenJDK 6.0 JRE             : Microsoft userdump
>> Sun Linux x86 OpenJDK 6.0 JRE            : ELF format Process Dump
>> Sun Linux/Windows OpenJDK 6.0 JRE    : Serviceability Agent API (1)
>> Sun  Linux/Windows Java 1.4.2_19 JRE  : HPROF Binary format (2)
>>
>>
>>
>> (1) Assumes classpath exception for API can be granted by Sun Microsystems
>> (2) Sun java 1.4.2_19 JRE support will be on a best can do basis since
>> information about the internal structures of the JRE is not publically
>> available.  In the event that critical information is required then we
>> will
>> ask Sun Microsystems for help and request they publish the information.
>>
>>
>> 6.3: Java Heap Diagnostic Artifacts
>>
>> The Java Heap category of API is effectively a subset of the JavaRuntime
>> API
>> and thus the list below is the same as above.
>>
>> JVM / Diagnostic Artifact
>> Sun Linux/Windows OpenJDK 6.0 JRE    : HPROF Binary format
>> Sun Windows OpenJDK 6.0 JRE             : Microsoft userdump
>> Sun Linux x86 OpenJDK 6.0 JRE            : ELF format Process Dump
>> Sun Linux/Windows OpenJDK 6.0 JRE    : Serviceability Agent API
>> Sun  Linux/Windows Java 1.4.2_19 JRE  : HPROF Binary format
>>
>>
>> 6.4: Other dump formats and implementations
>>
>> During this project IBM  will  be producing prototype implementations of
>> the
>> API  for some subset of  IBM JREs and dump formats.  This will provide
>> welcome feedback on the API and allow early adopters a broader set of
>> environments to work with.
>>
>>
>> 7.0  Closing the seed contribution short fall.
>>
>> 8.0  Timescales,  schedule, milestones
>>
>>
>>
>
