incubator-kato-spec mailing list archives

From David Griffiths <david.griffi...@gmail.com>
Subject Re: Snapshot Design Part II
Date Wed, 27 Jan 2010 15:03:40 GMT
Slightly off-topic but just thought I'd throw it in: one of my dreams a few
years ago was to treat a dump as a "virtual database". Rather than paying
the cost of loading all the data from a dump into a real database, this
would provide a JDBC driver that makes the dump look like a read-only
database. The idea was that because it would be read-only, indexes could be
generated on the fly relatively quickly.

Because you are faking up a database you can then go and re-use stuff that
connects to JDBC. Also no need to learn a new API, query language etc.
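
To make the on-the-fly indexing idea concrete, here is a minimal sketch in plain Java. It is not a JDBC driver and not anything from Kato: the DumpTable and Row names and the (objectId, className, sizeBytes) layout are assumptions made up for illustration. It only shows why read-only data makes lazy indexing cheap: the index is built once, on the first query that needs it, and never has to be invalidated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical read-only "table" over records already decoded from a dump. */
final class DumpTable {

    /** One row; the column layout is an assumption for this sketch. */
    static final class Row {
        final long objectId;
        final String className;
        final long sizeBytes;

        Row(long objectId, String className, long sizeBytes) {
            this.objectId = objectId;
            this.className = className;
            this.sizeBytes = sizeBytes;
        }
    }

    private final List<Row> rows;               // never mutated after construction
    private Map<String, List<Row>> byClassName; // index, built on first use

    DumpTable(List<Row> rows) {
        this.rows = Collections.unmodifiableList(new ArrayList<>(rows));
    }

    /** Roughly: SELECT * FROM objects WHERE class_name = ? */
    synchronized List<Row> findByClassName(String className) {
        if (byClassName == null) {
            // The first query pays for one full scan; later queries are lookups.
            Map<String, List<Row>> index = new HashMap<>();
            for (Row row : rows) {
                index.computeIfAbsent(row.className, k -> new ArrayList<>()).add(row);
            }
            byClassName = index;
        }
        List<Row> hits = byClassName.get(className);
        return hits == null
                ? Collections.<Row>emptyList()
                : Collections.unmodifiableList(hits);
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1L, "java.lang.String", 24L));
        rows.add(new Row(2L, "java.util.HashMap", 48L));
        rows.add(new Row(3L, "java.lang.String", 32L));
        DumpTable table = new DumpTable(rows);
        for (Row row : table.findByClassName("java.lang.String")) {
            System.out.println(row.objectId + " " + row.sizeBytes);
        }
    }
}
```

A real driver would sit behind the java.sql interfaces and translate SQL predicates into lookups like findByClassName; that layer is left out here.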

It's certainly do-able for a static dump - some students from Southampton
Uni did it with us as a project (but concentrated more on the flat file
proof of concept because they could use it to index their MP3 collections :)

Not appropriate for a live VM though.

Cheers,

Dave

On Mon, Jan 25, 2010 at 3:52 PM, Steve Poole <spoole167@googlemail.com> wrote:

> Let's explore the basic outline of this API.
>
> 1 - It is a VM side API. That is  - it is active on the running system.
>
> 2 - It is declarative. That means that, like SQL, OQL or similar query
> languages, it is an API where the user describes what they want to have
> happen. There are no callbacks into user code for selection purposes. The
> reasoning behind this is:
>
> A) At the point where this collection process is triggered the JVM may be
> in a poor state - at least where running Java is concerned.
>
> B) A declarative form allows the JVM vendor to implement their solution at
> whatever level they choose.
>
> C) Similarly,  larger execution optimisations can be made by the JVM.
>
> D) Having selection criteria written in Java could result in queries
> altering that which is being queried. Ignoring the inefficiencies, there is
> a risk there could be an infinite loop or deadlock.
>
> 3 - It is dynamic - in that the definition can generally be changed up
> until the time when the collection is triggered. It is at least additive in
> that applications may wish to register their specific selections to be
> dumped and these selections could be mutually exclusive. In the event of
> having an "A and Not B" + "B and Not A" situation the API must resolve this
> into "A and B".
>
> 4 - Multiple instances of the snapshot definition can be created and in
> progress at the same time.
>
> 5 - Definitions have a mechanism to allow them to define when they would
> be triggered. This would cover particular failure events such as
> exceptions.
>
> 6 -  There would be some concept of a default snapshot that would be
> triggered by the JVM on a failing condition such as Out of Memory.
>
> 7 - The selection process that chooses what would be in the dump has to
> have at least three component parts:
>
> A) A way to define a starting point - this could be a starting object,
> class,  thread , class loader or even the heap.
>
> B) A way to define what should be collected and at what level of
> representation: by package, by classloader, matching super/subclasses,
> implements an interface etc. When reporting an object, what gets reported:
> all fields, object references (i.e. some unique id), array sizes, array
> contents etc.?
>
> C) A way to define range and direction. Consider what happens if you
> wanted to get all objects of a type that were contained in a Map. At the
> API level a Map is a single idea: at the implementation level it's a
> collection of objects. When searching for an instance the search needs to
> either have an understanding of logical structures or just be constrained
> to a number of hops in navigating object relationships. Maybe both.
> Consider also if you wanted to dump all the threads, their stacks and list
> the object references (unique ids) they contain. That's a different axis
> to the "walk the heap" process.
>
> 8 - The API should probably cater for the situation where the selection
> requirements need to be provided to the JVM on start up. This may be due
> to performance issues or because we identify an entity or situation that
> can only be reached during startup. I don't have an example at this point
> but I do want to mention the possibility.
>
>
> 9 - Execution time performance of this API is critical - the design must
> offer the implementer the option of ahead of time compilation for these
> selections.
>
> 10 - It needs to be at the appropriate level. It's easy to see that there
> are some likely scenarios for this API which will require that all objects
> in the heap are visited. For instance if you wanted to get a list of the
> objects that had a reference to some other object. Traversing the heap
> is a standard activity for JVMs. It doesn't seem that difficult to imagine
> a JVMTI extension or equivalent that could provide a callback mechanism for
> each live object found. On the other hand we don't want to "just" or maybe
> "even" provide a C level API since that would constrain the JVM and/or the
> JIT options for optimization.
>
>
> Steve
>
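
Point 3 in the quoted outline asks that conflicting registrations such as "A and Not B" plus "B and Not A" resolve into "A and B". One possible reading, sketched below, is that a selection is plain data (a set of includes and a set of excludes) and that on merge an include from any registrant overrides an exclude from another. The SnapshotSelection name and the include-wins rule are assumptions for illustration, not part of any proposed API.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical declarative selection: data only, no callbacks into user code. */
final class SnapshotSelection {
    final Set<String> include;
    final Set<String> exclude;

    SnapshotSelection(Set<String> include, Set<String> exclude) {
        this.include = Collections.unmodifiableSet(new LinkedHashSet<>(include));
        this.exclude = Collections.unmodifiableSet(new LinkedHashSet<>(exclude));
    }

    /**
     * Additive merge: the result includes everything either side asked for.
     * An exclusion survives only if nobody has asked to include that item.
     */
    SnapshotSelection merge(SnapshotSelection other) {
        Set<String> mergedInclude = new LinkedHashSet<>(include);
        mergedInclude.addAll(other.include);

        Set<String> mergedExclude = new LinkedHashSet<>(exclude);
        mergedExclude.addAll(other.exclude);
        mergedExclude.removeAll(mergedInclude); // include wins over exclude

        return new SnapshotSelection(mergedInclude, mergedExclude);
    }

    public static void main(String[] args) {
        Set<String> a = new LinkedHashSet<>();
        a.add("A");
        Set<String> b = new LinkedHashSet<>();
        b.add("B");

        SnapshotSelection aNotB = new SnapshotSelection(a, b); // "A and Not B"
        SnapshotSelection bNotA = new SnapshotSelection(b, a); // "B and Not A"
        SnapshotSelection merged = aNotB.merge(bNotA);

        System.out.println("include=" + merged.include + " exclude=" + merged.exclude);
    }
}
```

Because the selection is only data, the JVM side is free to evaluate it at whatever level it likes, which is the argument made in point 2.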

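Point 7C mentions constraining a search "to a number of hops in navigating object relationships". A minimal sketch of that idea is a breadth-first walk that stops after a fixed depth. The object graph is modelled here as a map from an object id to the ids it references; that representation and the BoundedWalk name are assumptions for illustration, since a real JVM would walk its own heap structures.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical hop-limited walk over an object reference graph. */
final class BoundedWalk {

    /** Returns the start id plus every id reachable within maxHops references. */
    static Set<String> reachable(Map<String, List<String>> refs, String start, int maxHops) {
        Set<String> seen = new LinkedHashSet<>();
        seen.add(start);
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);

        // Each pass of the outer loop follows exactly one more hop.
        for (int hop = 0; hop < maxHops && !frontier.isEmpty(); hop++) {
            Deque<String> next = new ArrayDeque<>();
            for (String id : frontier) {
                List<String> targets = refs.get(id);
                if (targets == null) {
                    targets = Collections.emptyList();
                }
                for (String target : targets) {
                    if (seen.add(target)) { // the seen set also guards against cycles
                        next.add(target);
                    }
                }
            }
            frontier = next;
        }
        return seen;
    }

    public static void main(String[] args) {
        // A Map at the API level is several objects at the implementation level.
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("map", Arrays.asList("table"));
        refs.put("table", Arrays.asList("entry"));
        refs.put("entry", Arrays.asList("key", "value"));
        System.out.println(reachable(refs, "map", 2));
        System.out.println(reachable(refs, "map", 3));
    }
}
```

With a limit of two hops the walk from the map stops at the entry and never reaches the key or value, which illustrates why a pure hop count may need to be combined with some understanding of logical structures.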