incubator-kato-spec mailing list archives

From "Steve Poole" <spoole...@googlemail.com>
Subject Re: JSR 326: Post mortem JVM Diagnostics API - Developing the Specification
Date Mon, 05 Jan 2009 22:36:40 GMT
On Mon, Dec 29, 2008 at 1:27 PM, Bobrovsky, Konstantin S <
konstantin.s.bobrovsky@intel.com> wrote:

> Hi Steve, all
>
> The document you sent looks like a great starting point to me, the list of
> possible tools is reasonable as well. Here are a couple more suggestions in
> this area:
>
> 1) "Retrospector"
> One of the rationales behind JSR-326 was that the complexity of failure and
> performance analysis increases as CPU core counts grow. I believe it would
> be good to have a tool showcasing this rationale among the first cohort of
> the sample tools.
>
> Here is a description of one such tool - let's call it "retrospector" for
> now. I admit it might already exist somewhere in the Java universe, but it
> seems to fit well into JSR-326 regardless.
>        The idea is to define a number of landmarks or checkpoints in the
> code and see how their mutual temporal ordering and latency change (a) over
> time (b) depending on the number of threads (c) depending on the number of
> available CPU cores (d) ... Checkpoints can be of different kinds:
> - well-known events defined by JVMTI (monitor enter/exits, method
> entry/exits, class loads, code generation, thread lifecycle events,...)
> - new kinds of events helping solve particular problems. For example,
> interpreted -> compiled code execution mode change for a method, reaching a
> safepoint (in Hotspot terms), native memory allocation from JNI code,
> arbitrary checkpoint specified by a user as <method, bytecode position> pair
> (with optional additional refinements or filtering), etc.
>        When a checkpoint is reached by a thread, the JRE internally quickly
> logs this fact by adding a record <checkpoint ID, thread ID, timestamp, pc,
> ...> into a buffer, which, at appropriate time, is processed and made
> available via the JSR-326 API. [Note: The logging should be extremely
> efficient to minimize the "observer effect", which seems quite achievable
> with a thread-local checkpoint event buffer.]
>        The retrospector tool will be able to load two or more checkpoint
> logs and
> - split up each of them into per-thread events
> - visualize the checkpoint events mapped onto the timescale
> - visualize the difference in temporal behavior between two threads within
> the same run of the analyzed app, or between two threads in different runs
> of the analyzed app
> - ...
>        This will greatly assist in analyzing the temporal behavior of an
> application and its scalability.
>
> Another tool which can easily be implemented based on the checkpoint traces
> is
> a deadlock analyzer (most likely, already existing, but just to show
> applicability of this kind of data) provided that monitor state change
> checkpoints have been recorded. The tool can analyze whether any two
> threads acquire the same monitors in different orders.
>
> An important question is how checkpoints can be implemented - can this be
> done without modifying the JVM? For some kinds of checkpoints it can be
> done via bytecode instrumentation; for others - like internal VM lock
> acquisition - it cannot.
>
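The thread-local logging scheme described above can be sketched in plain Java. This is a minimal illustration only - the class and method names below are hypothetical, not part of any proposed JSR 326 API, and a real implementation would live inside the JRE:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the thread-local checkpoint buffer idea (all names hypothetical):
// each thread appends <checkpoint ID, thread ID, timestamp> records to its
// own buffer, so the hot logging path needs no cross-thread synchronization,
// minimizing the "observer effect".
public class CheckpointLog {
    public record Event(int checkpointId, long threadId, long nanos) {}

    // One event buffer per thread.
    private static final ThreadLocal<List<Event>> BUFFER =
            ThreadLocal.withInitial(ArrayList::new);

    // Called whenever a thread reaches a checkpoint.
    public static void hit(int checkpointId) {
        BUFFER.get().add(new Event(checkpointId,
                Thread.currentThread().getId(), System.nanoTime()));
    }

    // Called "at an appropriate time" to hand the records to a consumer,
    // e.g. whatever surfaces them through the JSR 326 API.
    public static List<Event> drain() {
        List<Event> events = BUFFER.get();
        BUFFER.set(new ArrayList<>());
        return events;
    }

    public static void main(String[] args) {
        hit(1); hit(2); hit(1);
        List<Event> events = drain();
        System.out.println(events.size());                // 3
        System.out.println(events.get(0).checkpointId()); // 1
        System.out.println(drain().size());               // 0 after draining
    }
}
```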

Hi Konst, I can see what you want - it's quite possible that we could have a
portion of the API that reports this sort of data, but the data needs to
exist. We would need to create a reference example of the code that would
gather the data.

I think this is a cool idea and we need to have this tool on the list :-)
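The deadlock analyzer mentioned in the quoted proposal - checking whether two threads ever acquire the same monitors in opposite orders - is simple enough to sketch over recorded acquisition sequences. All names here are hypothetical; real input would come from the monitor-enter checkpoints:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Sketch of lock-order analysis over monitor-enter checkpoint traces: if any
// thread ever held monitor A while acquiring B, and some thread held B while
// acquiring A, that pair is a potential deadlock.
public class LockOrderAnalyzer {
    // Ordered pairs "A->B" observed across all threads.
    private final Set<String> observed = new HashSet<>();

    // Replay one acquisition event: 'held' is the replaying thread's current
    // stack of held monitors, 'acquired' the monitor it is entering.
    public void recordAcquisition(Deque<String> held, String acquired) {
        for (String h : held) observed.add(h + "->" + acquired);
        held.addLast(acquired);
    }

    // True if some pair of monitors was taken in both orders.
    public boolean hasInversion() {
        for (String edge : observed) {
            String[] ab = edge.split("->");
            if (observed.contains(ab[1] + "->" + ab[0])) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        LockOrderAnalyzer a = new LockOrderAnalyzer();
        Deque<String> t1 = new ArrayDeque<>(), t2 = new ArrayDeque<>();
        a.recordAcquisition(t1, "A"); a.recordAcquisition(t1, "B"); // T1: A, B
        a.recordAcquisition(t2, "B"); a.recordAcquisition(t2, "A"); // T2: B, A
        System.out.println(a.hasInversion()); // true
    }
}
```

A production tool would track monitor release events too and use a cycle search over the lock-order graph rather than pairwise inversion, but the data requirement on the API is the same.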

>
> 2) "Debugger with memory".
> This is a small addition to the runtime which, for each thread, logs
> information about the last N branch targets leading to the current PC
> (regardless of whether the branch targets and the PC belong to generated or
> JVM code). A JSR-326-enabled debugger could expose this data to the user to
> ease debugging. In my own experience, the need for such information arises
> pretty regularly when investigating crashes.
>

I'm interested in this idea but not quite sure how well it would work in
practice. It sounds like we should be listing the sorts of data that a
JVM/JIT developer needs to debug their problems - we'd have to see what sort
of commonality we get across the various JVM vendors. Want to list your top
requirements?
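Data-structure-wise, the "last N branch targets" store described above is just a small ring buffer. A sketch in Java for illustration (hypothetical names; a real implementation would be VM-internal and record hardware addresses):

```java
import java.util.Arrays;

// Sketch of a fixed-size ring buffer keeping only the last N branch-target
// addresses per thread, so a debugger can show the path to the current PC.
public class BranchHistory {
    private final long[] ring;
    private int next;  // index of the slot the next entry goes into
    private int count; // number of valid entries (saturates at capacity)

    public BranchHistory(int capacity) { ring = new long[capacity]; }

    // Overwrites the oldest entry once the buffer is full.
    public void recordBranchTarget(long pc) {
        ring[next] = pc;
        next = (next + 1) % ring.length;
        if (count < ring.length) count++;
    }

    // Returns the recorded targets, oldest first.
    public long[] lastTargets() {
        long[] out = new long[count];
        int start = (next - count + ring.length) % ring.length;
        for (int i = 0; i < count; i++) out[i] = ring[(start + i) % ring.length];
        return out;
    }

    public static void main(String[] args) {
        BranchHistory h = new BranchHistory(3);
        for (long pc = 1; pc <= 5; pc++) h.recordBranchTarget(pc);
        System.out.println(Arrays.toString(h.lastTargets())); // [3, 4, 5]
    }
}
```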

>
>
> Also, JFYI, there is one more interesting tool, which I don't recall being
> mentioned - JRockit's "flight recorder" technology:
> http://edocs.bea.com/jrockit/geninfo/diagnos/intromiscon.html
> Maybe it can help suggest some more tool ideas for JSR-326 to cover.
>
> Thanks,
> Konst
>
> Closed Joint Stock Company Intel A/O
> Registered legal address: Krylatsky Hills Business Park,
> 17 Krylatskaya Str., Bldg 4, Moscow 121614,
> Russian Federation
>
>
> -----Original Message-----
> From: Steve Poole [mailto:spoole167@googlemail.com]
> Sent: Friday, December 12, 2008 9:43 PM
> To: kato-spec@incubator.apache.org
> Subject: JSR 326: Post mortem JVM Diagnostics API - Developing the
> Specification
>
> Greetings
>
> I had intended to post this document as a wiki page but we don't have a
> wiki
> yet!
>
> The following document is a work in progress. It is ultimately intended to
> capture the approach, process and scope for the development of this
> specification. This is very much an initial brain dump, so please feel free
> to point out any inaccuracies, omissions, trademark violations etc!
>
> I would particularly like to receive feedback on the list of proposed
> sample tools. The tools are proposed as examples that could be developed to
> demonstrate the validity of the specification. It's likely that not all of
> these tools are necessary or doable in the timescales of the first release.
>
> EG members - please let me know one way or another whether you consider the
> list acceptable and descriptive enough for us to start expanding into more
> detailed user stories. Note that we may not end up actually producing all
> of these tools - but that should not stop us as specification designers
> from defining the necessary user stories.
>
> Thanks
>
> Steve
>
>
> ---------------------------------------
>
> JSR 326: Post mortem JVM Diagnostics API -  Developing the Specification
>
> Version Info
>
> Initial : Steve Poole  12 Dec 2008
>
> 1.0: JSR Objectives
>
> Define a standard Java API to support the generation and consumption of
> post mortem or snapshot Java diagnostic artefacts. The specification will
> be inclusive of a range of existing "in field" diagnostic artefacts,
> including common operating system dump formats. The specification will
> balance the need to provide maximum value from existing artefacts against
> the problem space expected in the near and longer term, with multiple
> language environments and very large heaps.
>
>
> 2.0: Approach
>
> The design of the API will be driven directly by user stories. To ensure
> coherence between user stories, the stories will themselves be developed as
> requirements on several sample tools. The project does not seek to create
> state-of-the-art tools, but recognises that having useful and usable sample
> tools is crucial in demonstrating the validity of the API and will
> encourage others to build alternative "better mouse-traps". These example
> tools will also help define the non-functional characteristics that are not
> easily translated into user stories - characteristics such as scalability,
> performance, tracing etc.
>
> These tools and the embodiment of the JSR specification - i.e. the
> reference implementation (RI) and technology compatibility kit (TCK) - are
> being developed as an Apache Software Foundation Incubator project. The JSR
> Expert Group (EG) and the Reference Implementation developers will work
> together to define, develop and refine the specification. The specification
> is intended to be developed incrementally and will always be available via
> the RI API Javadoc. As the JSR moves through its various stages, the
> specification at each point will be declared by referring to a publicly
> visible form of the Javadoc and the associated repository revision.
>
>
> 3.0: Initial starting point.
>
> IBM is contributing non-proprietary portions of its Diagnostic Tool
> Framework for Java (DTFJ) and associated tools, samples, documentation and
> testcases. This contribution is only a seed: the JSR EG must review and
> amend this API as necessary to meet its requirements. EG members can also
> contribute directly to the specification by providing testcases or code
> samples etc.
>
>
> 4.0: API Structure
>
> Analysis of the types of dump suitable for inclusion in the scope of this
> JSR shows that there are three basic categories: 1) dumps that contain
> process information, 2) dumps that contain information about a Java
> runtime, and 3) dumps that are limited to the contents of a Java heap.
> Generally these dumps are inclusive in the sense that, for instance, a
> process dump normally contains a Java runtime, which in turn contains
> information about the contents of the Java heap. The inverse is not true.
> This categorisation is used in this document and will help structure the
> development of the JSR, but it should not be assumed to be set in stone.
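The containment relationship between the three categories can be pictured with toy interfaces. The names below are purely illustrative - they are not the contributed DTFJ API - and only show that navigation runs one way, from process dump inward to heap:

```java
// Illustrative-only model of the three dump categories described above:
// a process dump can yield a Java runtime view, which can yield a heap view,
// but a heap-only dump cannot be widened back out to a process view.
public class DumpCategories {
    interface JavaHeap { long objectCount(); }
    interface JavaRuntime { JavaHeap heap(); }
    interface ProcessDump { JavaRuntime runtime(); }

    public static void main(String[] args) {
        // Toy implementation via nested lambdas, just to exercise navigation.
        ProcessDump dump = () -> () -> () -> 42L;
        // Navigation only goes "inward": process -> runtime -> heap.
        System.out.println(dump.runtime().heap().objectCount()); // 42
    }
}
```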
>
>
> 5.0: Sample Tools - the primary drivers for developing  JSR User Stories
>
>
>
> 5.1: Process Explorer
>
> An Eclipse plugin that allows presentation, navigation and simple querying
> of the elements of a Process Dump. This tool will demonstrate how to
> explore a dump in an efficient and high-performing manner. Key
> characteristics will include fast startup time, handling of large
> quantities of data (including summarization), and effective navigation to
> areas of interest.
>
> 5.2: Native Memory Analyser Tool
>
> A program that can retrieve native memory allocations made by (or on behalf
> of) a Java Runtime and trace them back to the Java objects that hold the
> allocation references. The tool will be able to display which memory
> allocations exist, the contents of each allocation, and conservatively
> identify which entities hold references to that allocation. Ideally this
> tool will be able to point to the specific fields within Java objects that
> hold the references. This tool will demonstrate the capabilities of the API
> to find and display native memory from a memory manager. Key
> characteristics will include the performance of the API in exhaustively
> scanning a dump (for memory allocation handles) and the ability to resolve
> an arbitrary location within the dump into a Java object or similar entity.
>
> 5.3: Java Runtime Explorer
>
> Similar to the Process Explorer above, this Eclipse plugin will allow the
> presentation, navigation and simple querying of the elements of a Java
> Runtime dump. This tool will demonstrate how to explore a Java runtime dump
> in an efficient and high-performing manner. Ideally the plugin will also
> demonstrate the API's ability to provide some support for virtualisation of
> Java runtime objects, so that implementation specifics concerning objects
> within the java.lang and java.util packages can be hidden. Key
> characteristics will include fast startup time, handling of large
> quantities of data (including summarization), effective navigation to areas
> of interest, and useful abstraction of key Java object implementation
> specifics.
>
> 5.4:  Runtime Investigator
>
> A program that can examine a dump and provide guidance on common ailments.
> This tool will provide analysis modules that can report on such items as
> deadlocks, heap occupancy etc. The tool will provide extension points that
> allow others to contribute new analysis modules. Key characteristics of
> this tool will include handling large quantities of data efficiently
> (probably via a query language of some type), ensuring the API is generally
> consumable by programmers, and ensuring the API provides the data that is
> actually required to analyze real problems.
>
> 5.5: Java Runtime Trend Analyzer
>
> A tool that can compare multiple dumps and provide trend analysis. This
> tool will provide analysis modules that can report on such items as heap
> growth etc. The tool will provide extension points that allow others to
> contribute new analysis modules. Key characteristics of this tool will
> include exercising the creation of well-formed dumps, fast startup time,
> correlation between dump objects, handling large quantities of data
> efficiently (probably via a query language of some type), ensuring the API
> is generally consumable by programmers, and ensuring the API provides the
> data that is actually required to analyze real problems.
>
>
> 5.6: Java Debug Interface (JDI) Connector
>
> An adapter that allows a Java debugger to interrogate the contents of a
> Java Runtime diagnostic artifact. This connector will enable capabilities
> similar to those that exist today in debuggers that can debug corefiles or
> similar process diagnostic artifacts. This tool will demonstrate key
> characteristics such as effective navigation to areas of interest, useful
> abstraction of key Java object implementation specifics, and that the API
> provides the data that is required to analyze real problems.
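For context, the standard Java Debug Interface already exposes its connectors through `com.sun.jdi.Bootstrap`; a dump-reading connector built on JSR 326 would surface through that same mechanism. The snippet below simply lists the connectors the local JDK ships, which is where such a connector would appear:

```java
import com.sun.jdi.Bootstrap;
import com.sun.jdi.connect.Connector;

// Lists the JDI connectors installed in the running JDK. A JSR 326 corefile
// connector would register alongside these and show up in the same list.
public class ListConnectors {
    public static void main(String[] args) {
        for (Connector c : Bootstrap.virtualMachineManager().allConnectors()) {
            System.out.println(c.name() + " : " + c.description());
        }
    }
}
```

(Requires a JDK, since `com.sun.jdi` lives in the `jdk.jdi` module, not in a bare JRE.)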
>
>
> 5.7: Memory Analyser Tool (MAT) Adapter
>
> MAT (http://www.eclipse.org/mat/) is an open source project that consumes
> HPROF and DTFJ-supported dumps. MAT is designed to help find memory leaks
> and reduce memory consumption. An adapter for MAT will be developed that
> allows MAT to consume HPROF and other dumps via the JSR 326 API. Key
> characteristics of this adapter will include handling large quantities of
> data efficiently, useful abstraction of key Java object implementation
> specifics, and dump type identification.
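Dump type identification, listed as a key characteristic above, can be done cheaply from the first few bytes of the artifact. The magic values below are the publicly documented ones for the three formats this JSR discusses (the class name is illustrative):

```java
import java.nio.charset.StandardCharsets;

// Routes a diagnostic artifact to the right reader from its magic bytes:
// ELF files start with 0x7F 'E' 'L' 'F', Microsoft minidumps with "MDMP",
// and HPROF binary dumps with the string "JAVA PROFILE".
public class DumpSniffer {
    public static String identify(byte[] head) {
        if (startsWith(head, new byte[] {0x7F, 'E', 'L', 'F'})) return "ELF";
        if (startsWith(head, "MDMP".getBytes(StandardCharsets.US_ASCII)))
            return "minidump";
        if (startsWith(head, "JAVA PROFILE".getBytes(StandardCharsets.US_ASCII)))
            return "HPROF";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++)
            if (data[i] != magic[i]) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(identify(new byte[] {0x7F, 'E', 'L', 'F', 2, 1}));       // ELF
        System.out.println(identify(
                "JAVA PROFILE 1.0.2".getBytes(StandardCharsets.US_ASCII)));         // HPROF
    }
}
```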
>
>
>
> 6.0: Reference Implementation Scope
>
> The Reference Implementation will not provide implementations for all JVMs
> or diagnostic artifacts. The scope of the project is to encompass only the
> open, public and most used combinations. The initial proposal for the API
> defines three separate categories of diagnostic artifact. The Reference
> Implementation will be developed to consume the following diagnostic
> artifacts from those categories.
>
> 6.1: Process Level Diagnostic Artifacts
>
> Operating System  / Diagnostic Artifact
> Linux Ubuntu 8.10 x86   : ELF format Process Dump (1)
> Microsoft Windows XP   : Microsoft userdump (2)
> IBM AIX 6.1                  : AIX corefile (3)
>
> (1) ELF is a publicly available format described in many places - usually
> starting with elf.h!
>
> (2) Microsoft userdumps are in minidump format. The description starts here:
> http://msdn.microsoft.com/en-us/library/ms680378(VS.85).aspx
>
> (3) The IBM AIX corefile format is publicly available:
>
> http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.files/doc/aixfiles/core.htm
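As a concrete example for note (1), the bytes immediately after the ELF magic tell a dump reader the word size and byte order it must use for everything that follows; the offsets and values below are taken from the public ELF specification (elf.h), and the class name is illustrative:

```java
// Reads the e_ident prefix of an ELF file: after the 4 magic bytes,
// e_ident[4] (EI_CLASS) gives 32- vs 64-bit and e_ident[5] (EI_DATA) gives
// the byte order - the first two things a core-file reader must check.
public class ElfIdent {
    public static String describe(byte[] eIdent) {
        if (eIdent.length < 6 || eIdent[0] != 0x7F || eIdent[1] != 'E'
                || eIdent[2] != 'L' || eIdent[3] != 'F')
            return "not an ELF file";
        String bits = switch (eIdent[4]) {      // EI_CLASS
            case 1 -> "32-bit";                 // ELFCLASS32
            case 2 -> "64-bit";                 // ELFCLASS64
            default -> "unknown class";
        };
        String order = switch (eIdent[5]) {     // EI_DATA
            case 1 -> "little-endian";          // ELFDATA2LSB
            case 2 -> "big-endian";             // ELFDATA2MSB
            default -> "unknown order";
        };
        return bits + " " + order;
    }

    public static void main(String[] args) {
        // e_ident of a typical x86 core file: ELFCLASS32, ELFDATA2LSB.
        System.out.println(describe(new byte[] {0x7F, 'E', 'L', 'F', 1, 1}));
        // 32-bit little-endian
    }
}
```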
>
>
> 6.2: Java Runtime Diagnostic Artifacts
>
>
> JVM / Diagnostic Artifact
> Sun Linux/Windows OpenJDK 6.0 JRE    : HPROF Binary format
> Sun Windows OpenJDK 6.0 JRE             : Microsoft userdump
> Sun Linux x86 OpenJDK 6.0 JRE            : ELF format Process Dump
> Sun Linux/Windows OpenJDK 6.0 JRE    : Serviceability Agent API (1)
> Sun  Linux/Windows Java 1.4.2_19 JRE  : HPROF Binary format (2)
>
>
>
> (1) Assumes classpath exception for API can be granted by Sun Microsystems
> (2) Sun Java 1.4.2_19 JRE support will be on a best-effort basis, since
> information about the internal structures of the JRE is not publicly
> available. In the event that critical information is required, we will ask
> Sun Microsystems for help and request that they publish the information.
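The HPROF binary format referenced in the tables above opens with a publicly documented header: a NUL-terminated version string (e.g. "JAVA PROFILE 1.0.2"), a u4 identifier (pointer) size, and a u8 millisecond timestamp. A minimal header reader, for illustration only:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Reads just the HPROF binary header - the minimum a reader needs (version
// and ID size) before it can walk the records that follow.
public class HprofHeader {
    public static String readHeader(byte[] dump) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(dump));
        StringBuilder version = new StringBuilder();
        int b;
        while ((b = in.read()) > 0) version.append((char) b); // up to the NUL
        int idSize = in.readInt();   // u4: size of object IDs/pointers, bytes
        long millis = in.readLong(); // u8: dump timestamp, ms since the epoch
        return version + ", id size " + idSize + ", time " + millis;
    }

    public static void main(String[] args) throws IOException {
        // A hand-built minimal header for illustration (not a real dump).
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeBytes("JAVA PROFILE 1.0.2");
        out.writeByte(0);  // NUL terminator of the version string
        out.writeInt(8);   // 8-byte identifiers (64-bit JVM)
        out.writeLong(0L);
        System.out.println(readHeader(bytes.toByteArray()));
        // JAVA PROFILE 1.0.2, id size 8, time 0
    }
}
```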
>
>
> 6.3: Java Heap Diagnostic Artifacts
>
> The Java Heap category of the API is effectively a subset of the
> JavaRuntime API and thus the list below is the same as above.
>
> JVM / Diagnostic Artifact
> Sun Linux/Windows OpenJDK 6.0 JRE    : HPROF Binary format
> Sun Windows OpenJDK 6.0 JRE             : Microsoft userdump
> Sun Linux x86 OpenJDK 6.0 JRE            : ELF format Process Dump
> Sun Linux/Windows OpenJDK 6.0 JRE    : Serviceability Agent API
> Sun  Linux/Windows Java 1.4.2_19 JRE  : HPROF Binary format
>
>
> 6.4: Other dump formats and implementations
>
> During this project IBM will be producing prototype implementations of the
> API for some subset of IBM JREs and dump formats. This will provide welcome
> feedback on the API and allow early adopters a broader set of environments
> to work with.
>
>
> 7.0  Closing the seed contribution shortfall.
>
> 8.0  Timescales,  schedule, milestones
>
