incubator-kato-spec mailing list archives

From Steve Poole <spoole...@googlemail.com>
Subject Re: "Snapshot" support
Date Wed, 04 Nov 2009 17:05:34 GMT
Sorry Andreas - I've been out on vacation.  I've made a quick reply below
and will do a more detailed response later.


On Wed, Nov 4, 2009 at 11:46 AM, Andreas Grabner <
andreas.grabner@dynatrace.com> wrote:

> Hi Steve - following up on the email I sent out two weeks ago. See my
> answers below starting with "AG:"
>
> -----Original Message-----
> From: Andreas Grabner
> Sent: Mittwoch, 21. Oktober 2009 10:41
> To: kato-spec@incubator.apache.org
> Subject: RE: "Snapshot" support
>
> Hi Steve - find my answers below (lines starting with "AG:")
>
> Let me know how you want to split up this thread as we are discussing
> multiple topics here
>
> thanks
>
> -----Original Message-----
> From: Steve Poole [mailto:spoole167@googlemail.com]
> Sent: Samstag, 17. Oktober 2009 08:11
> To: kato-spec@incubator.apache.org
> Subject: Re: "Snapshot" support
>
> On Wed, Oct 14, 2009 at 2:55 PM, Andreas Grabner <
> andreas.grabner@dynatrace.com> wrote:
>
> > Steve
> >
> > Thanks Andreas -  this is good stuff.  Questions below.
>
> I propose we continue to discuss on this thread, but with the aim of
> pulling out the top-level items as we go. For instance, you've obviously
> got more requirements on JVMTI, and we should pull that out as a separate
> thread.  I'll do that once you've answered a few of my questions below.
>
> Thanks  again
>
>
> >
> > I am following up on Alois' email with some use cases that we have
> > with our clients. Based on those use cases we also derived
> > requirements.
> >
> >
> >
> > Use Case: Really Large Memory Dumps don't Scale
> >
> > Most of our enterprise customers run their applications on 64-bit
> > systems with JVMs having > 1.5GB heap space.
> >
> >
> Do you have info on what the largest heap size is that you've
> encountered?
>
> AG: We have seen heap sizes of 8GB
>
> > Iterating through Java objects on the heap doesn't scale with growing
> > heap sizes. Because object tagging creates a tag for every object on
> > the heap, we quickly exhaust native memory.
> >
> > Do you have to tag all objects?
>
> AG: Our Memory Snapshot feature visualizes the referrer tree of objects.
> I believe that for this we need to create tags for each individual
> object in order to get the reference information - unless there is
> another way of walking the referrer tree that we are not aware of?
>
For the RI and the JVMTI-based dump we've sidestepped the problem by
working from the threads and their object references. That means we don't
have to use tagging to find objects, but it does mean you have to keep
track of the ones you've seen before.
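To make that bookkeeping concrete, here is a minimal sketch in Java using a toy object graph (the Node type and its refs field are invented for illustration; a real implementation would walk JVMTI reference data natively). An identity-based "seen" set takes the place of per-object tags and also terminates cycles:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy stand-in for a heap object: a node with outgoing references.
class Node {
    final List<Node> refs = new ArrayList<>();
}

public class HeapWalk {
    // Walk the graph from the given roots, counting each object once.
    // An identity-keyed set plays the role that per-object JVMTI tags
    // would otherwise play, without a tag per live object.
    static int countReachable(List<Node> roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        int count = 0;
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (!seen.add(n)) {
                continue; // already visited - cycles terminate here
            }
            count++;
            pending.addAll(n.refs);
        }
        return count;
    }

    public static void main(String[] args) {
        Node a = new Node(), b = new Node(), c = new Node();
        a.refs.add(b);
        b.refs.add(c);
        c.refs.add(a); // cycle back to the root
        System.out.println(countReachable(List.of(a))); // prints 3
    }
}
```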



> > Using the current JVMTI/PI APIs doesn't allow us to iterate over
> > large heaps in a timely manner or without running into memory
> > issues -> large memory dumps are often not possible!!
> >
>
> > Use Case: Provide full context information in OutOfMemory situation
> >
> > Capturing dump information in case of an OutOfMemory exception is key
> > to understanding the root cause of this event.
> >
> > Access to the JVMTI interfaces to iterate the objects on the heap is
> > not possible at this point in time, which makes it impossible to
> > collect heap information in the same way as when creating a dump
> > during "normal" runtime execution.
> >
> > Therefore no detailed memory dumps can be made in the event handler
> > of an OutOfMemory exception!!
> >
>
> I don't understand this - the Eclipse MAT tool does detailed OOM
> analysis, and that uses the HPROF file for Sun JVMs and other dumps for
> IBM JVMs - do you have extra requirements beyond what MAT can offer?
>
> AG: In the next use case I explain why we prefer not to use dump files
> for analysis but rather have a "central approach" where our agent (that
> sits in the JVM) can grab all the information needed to perform OOM
> analysis. In large distributed environments it's not feasible to start
> collecting log files from different servers. With our agent technology
> we can collect this data from within the JVM and send it off to our
> central server that manages all JVMs in the system.
>
>
> >
> >
> > Use Case: Central Management of Memory Dumps in Distributed Enterprise
> > Applications
> >
> > Most of our enterprise customers run their distributed applications on
> > multiple servers hosting multiple JVMs. Creating and analyzing memory
> > dumps in these distributed scenarios must be centrally manageable.
> >
> > Creating dump files and storing them on the server machines is not a
> > perfect solution because
> >
> > a)       Getting access to dump files on servers is often restricted
> > by security policies
> >
> > b)       Local Disk Space required
> >
> > Therefore a dump file approach is not an option for most of our
> > customers!!
> >
> >  Understood.
>
> >
> >
> >
> > Requirements based on use cases
> >
> > *       Limit the native memory usage when iterating through objects
> >
> >        *       Eliminate the need for an additional object tag
> > requiring native memory -> this can exhaust native memory when having
> > millions of objects
> >        *       Instead of using the tag, use the object pointer
> >        *       Must ensure that the object pointer stays constant (no
> > objects are moved) throughout the iteration operation
> >
> >
> > *       Enable JVMTI interface access to iterate through heap objects
> > in case of resource exhaustion (OutOfMemory)
> >
> >        *       Having full access to all object heap interface
> > functions allows us to capture this information in case of an OOM
> >        *       Also have access to JVMTI interfaces for capturing
> > stack traces
> >        *       Can some part of this information also be made
> > available in the event of a more severe JVM crash?
> >
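As an aside, on the Java side the closest standard mechanism today is the JMX usage threshold, which can arm a notification before the heap is fully exhausted. A hedged sketch using only standard java.lang.management APIs (the 90% threshold is an arbitrary illustrative choice, not a recommendation):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

public class LowMemoryArm {
    // Arm a usage threshold at 90% of max on every heap pool that
    // supports it, so a listener can capture context before an OOM.
    public static int armThresholds() {
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP
                    && pool.isUsageThresholdSupported()
                    && pool.getUsage().getMax() > 0) {
                pool.setUsageThreshold((long) (pool.getUsage().getMax() * 0.9));
                armed++;
            }
        }
        return armed;
    }

    public static void main(String[] args) {
        System.out.println("pools armed: " + armThresholds());
    }
}
```

Which pools support a usage threshold varies by JVM and collector, so the count is best-effort.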
> > *       Native Interface for memory dump generation
> >
> That's an interesting idea - we were expecting to provide a Java API to
> do that.  Having a native version could easily make sense.
>
> AG: Native would be our preference
>
>
> >        *       In order to centrally manage memory dumps we need to be
> > able to do it via a native interface within the JVM
> >
>
> I don't understand why you need to manage dumps via a native
> interface?
>
> AG: We have an agent that lives in the JVM. This agent sends memory
> information to our central dynaTrace Server. This allows us to do
> central management of all connected JVMs. I mentioned earlier that
> working via dump files doesn't always work with our clients (security
> policies, disk space, ...). As there is JVMTI already - why not extend
> this API? We are also OK with a Java API as long as it works within the
> JVM and not on dump files, and as long as the performance is not a
> problem (compared to a native implementation).
>
>
Hmm - OK, so we do need to be careful about not straying into the world of
tracing.  The JSR is intended to cover static data sets, even if they are
taken very frequently :-).  The data doesn't have to reside in a dump
file - it could of course just be sent down the wire to a remote
collection point.
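To illustrate the "down the wire" point, here is a sketch of a point-in-time payload (the Snapshot class and its fields are invented for illustration, not part of any proposed API): it is a static data set that can be shipped to a collection point without ever touching disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical snapshot payload: a per-class instance histogram taken
// at one point in time - a static data set, not a trace.
public class Snapshot implements Serializable {
    private static final long serialVersionUID = 1L;
    final long takenAtMillis;
    final TreeMap<String, Long> instanceCounts;

    Snapshot(long takenAtMillis, Map<String, Long> counts) {
        this.takenAtMillis = takenAtMillis;
        this.instanceCounts = new TreeMap<>(counts);
    }

    // Encode for sending down the wire to a remote collection point.
    byte[] toBytes() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(this);
        }
        return buf.toByteArray();
    }

    static Snapshot fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Snapshot) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Snapshot s = new Snapshot(0L, Map.of("java.lang.String", 1200L));
        Snapshot back = Snapshot.fromBytes(s.toBytes());
        System.out.println(back.instanceCounts.get("java.lang.String")); // prints 1200
    }
}
```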

My concern is that at this point specifying additions to JVMTI to improve
its performance or design is too early.  I appreciate that you want to see
JVMTI improved, but we need to move the discussion up a level and focus on
the externals (and leave the implementors the choice of how they make it
work).  If the way to make a sensible solution ends up requiring new or
improved native-level APIs then that's fine.


My understanding is that you do the following -

collect data
send it to a collection point
analyse it

(repeat)

Some of the questions we should be asking are

A)   What data is collected, and how do you define the criteria?
B)   What types of analysis take place, and (apart from just accessing the
data sent) what types of cross-collection information are required? (I'm
thinking about object correlation.)
C)   How much data is involved, and how often is it required?

It's these sorts of questions that will help shape the API, and help us
drive down to how we actually need to imagine it being implemented.



> >        *       JVMTI would be a perfect candidate assuming the
> > existing limitations can be addressed
> >        *       A separate native interface would be an alternative
> > option
> >
> Agree - but in either case, addressing the usage issues with JVMTI will
> come down to understanding why JVMTI looks like it does now, and how
> other approaches may affect the runtime performance of a system.
>
> AG: Agreed. Performance is a big topic for us. Getting this kind of
> information must work fast - nobody wants to wait hours to grab a
> detailed memory snapshot.
>
> >
> >
> > Additional requirements (maybe not in the scope of this JSR)
> >
> >
> I think these are all worth discussing - if you use the info then we
> should explore whether it makes sense to specify it.
>
>
>
> > *       Access to Objects in PermGen
> > *       Generation information when iterating through objects
> >
> >        *       which generation each live object on the heap is in
> >
> > *       Get access to Generation Sizes via JVMTI
> >
> >        *       Size information is available via JMX
> >        *       so it should also be made available via the native
> > interfaces
> >
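For reference, the JMX route mentioned above looks roughly like this (standard java.lang.management API; the set of pools and their names vary by JVM and collector):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class PoolSizes {
    public static void main(String[] args) {
        // The pool names (e.g. eden, survivor, old gen) differ between
        // collectors, but the sizes the requirement mentions are all here.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            System.out.println(pool.getName() + ": used=" + u.getUsed()
                    + " committed=" + u.getCommitted() + " max=" + u.getMax());
        }
    }
}
```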
> > *       Object Information on GC finished event
> >
> >        *       get information about how many objects have been
> > moved/freed (either real object IDs or at least the sizes)
> >        *       must be able to turn this feature on/off at runtime
> > to keep overhead low when not needed
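A rough sketch of that runtime on/off requirement on the Java side (the class and counter are illustrative; in a real agent this listener would be registered on each GarbageCollectorMXBean, cast to NotificationEmitter, to receive GC-finished notifications):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.Notification;
import javax.management.NotificationListener;

// A listener whose work can be switched on and off at runtime, so the
// overhead is only paid while GC reporting is actually wanted.
public class GcReporter implements NotificationListener {
    final AtomicBoolean enabled = new AtomicBoolean(false);
    final AtomicLong gcEventsSeen = new AtomicLong();

    @Override
    public void handleNotification(Notification n, Object handback) {
        if (!enabled.get()) {
            return; // feature switched off: near-zero overhead
        }
        gcEventsSeen.incrementAndGet();
        // ...a real agent would decode the notification's user data here
        // into freed/moved sizes and ship them to the central server.
    }
}
```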
> >
> >
> >
> > Let me know if any of these use cases or requirements needs further
> > explanation.
> >
> >
> >
> > Thanks
> >
> > Andi & Alois
> >
> >
> >
> >
> >
> > -----Original Message-----
> > From: Alois Reitbauer
> > Sent: Montag, 21. September 2009 17:01
> > To: kato-spec@incubator.apache.org
> > Cc: Andreas Grabner
> > Subject: RE: "Snapshot" support
> >
> >
> >
> > Steve,
> >
> >
> >
> > we will be happy to contribute our use cases. I propose to start with
> > memory dumps first and thread dumps later. Either Andi or I will come
> > back with some concrete use cases.
> >
> > - Alois
> >
> >
> >
> > -----Original Message-----
> >
> > From: Steve Poole [mailto:spoole167@googlemail.com]
> >
> > Sent: Dienstag, 08. September 2009 06:31
> >
> > To: kato-spec@incubator.apache.org
> >
> > Subject: "Snapshot" support
> >
> >
> >
> > One of the capabilities that this API is intended to provide is
> > support for "Snapshots".
> >
> > This is based on the idea that for various reasons the dumps that we
> > can get today can be too big, take too long to generate, not have the
> > right information, etc.
> >
> > Also we need to recognise that dumps are not only produced to help
> > diagnose a failure.  Some users consume dumps as part of monitoring a
> > live system.
> >
> > So we need to discuss (at least)
> >
> > a)  How dump content configuration would work
> > b)  What sorts of data are needed in a snapshot dump
> >
> > This is the largest outstanding piece of the API.  Now with Alois and
> > Andreas on board we can start to clarify use cases and resolve the
> > design.
> >
> > Cheers
> >
> > Steve
> >
> >
> >
> >
> >
> >
>
>
> --
> Steve
>
>
>


-- 
Steve
