incubator-kato-spec mailing list archives

From Daniel Julin <...@us.ibm.com>
Subject Re: Kato API javadoc
Date Wed, 08 Apr 2009 01:19:23 GMT

Some thoughts about these iterators, also based on earlier experiences with
the DumpAnalyzer tool:


(1) The set of entries returned by some of these iterators can be quite
large, especially for JavaHeap.getObjects() and also, to a lesser degree,
for JavaClassLoader.getDefinedClasses(), ImageModule.getSymbols(), etc. If we
go to a model that allows random access, can we do it in a way that avoids
pre-loading every entry in the set onto the heap of the JVM in which we are
running the analyzer tool?
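
To make the question in (1) concrete, here is a minimal sketch of one way to
offer random access without materializing every entry. It assumes the dump
reader can cheaply produce an index of entry addresses and can decode a single
entry from its address on demand. The names LazyAddressList and loadCount are
hypothetical and are not part of the Kato or DTFJ API.

```java
import java.util.AbstractList;
import java.util.function.LongFunction;

/**
 * Hypothetical read-only list that holds only one long per entry.
 * Entry objects are created when get() is called and are not cached.
 */
final class LazyAddressList<T> extends AbstractList<T> {
    private final long[] addresses;        // lightweight index: 8 bytes per entry
    private final LongFunction<T> loader;  // assumed to decode one entry from the dump
    private int loadCount;                 // number of entries materialized so far

    LazyAddressList(long[] addresses, LongFunction<T> loader) {
        this.addresses = addresses.clone();
        this.loader = loader;
    }

    @Override
    public T get(int index) {
        long address = addresses[index];   // throws if index is out of range
        loadCount++;
        return loader.apply(address);
    }

    @Override
    public int size() {
        return addresses.length;
    }

    int loadCount() {
        return loadCount;
    }
}

final class LazyAddressListDemo {
    public static void main(String[] args) {
        long[] addresses = {0x1000L, 0x1040L, 0x10A0L};
        // The loader stands in for real dump decoding.
        LazyAddressList<String> objects =
                new LazyAddressList<>(addresses, a -> "object@0x" + Long.toHexString(a));
        System.out.println(objects.get(2));      // only this entry is created
        System.out.println(objects.loadCount());
    }
}
```

The iterator inherited from AbstractList calls get() one index at a time, so a
plain walk also stays lazy. The trade-off is that the address index itself
still has to fit in memory.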



(2) The API does not actually define any specific order in which the
entries are returned by the various iterators, and I wish it would. For
example:

* When walking through all the objects on a heap, some heap analysis
functions (and some file formats like PHD) require that the objects be
returned in the order corresponding to the increasing addresses where they
are located on that heap.  It turns out that, in all the implementations
that I've seen so far, JavaHeap.getObjects() does provide that order, but
nowhere does it say that I can count on it in every future implementation.

* When writing unit tests for analysis tools, it is very convenient to be
able to count on a stable order returned by every iterator. Many tests can
be written by printing all the output from some function, and simply
comparing that output from one run of the tool to the next to detect
regressions. This stable order was actually not provided by many iterators
in current DTFJ implementations, and that caused considerable headaches in
the early days of the implementation of the DumpAnalyzer tool. We ended up
adding our own sort routines on top of many DTFJ iterators... which of
course conflicts with the goal in point (1) above of not pre-loading every
entry.
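
As an illustration of the conflict described in (2), here is a minimal sketch
of a sort wrapper layered on top of an iterator. It is not the DumpAnalyzer
code; StableOrder and sorted are hypothetical names. The point is that the
wrapper cannot return its first element until it has buffered every entry.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper that imposes a stable order on an unordered iterator. */
final class StableOrder {
    private StableOrder() {}

    static <T> Iterator<T> sorted(Iterator<? extends T> source,
                                  Comparator<? super T> order) {
        // Sorting needs the whole set, so every entry is held in memory at once.
        List<T> buffer = new ArrayList<>();
        while (source.hasNext()) {
            buffer.add(source.next());
        }
        buffer.sort(order);   // List.sort is stable for equal elements
        return buffer.iterator();
    }
}

final class StableOrderDemo {
    public static void main(String[] args) {
        List<String> names = List.of("JavaThread", "JavaClass", "JavaHeap");
        Iterator<String> it =
                StableOrder.sorted(names.iterator(), Comparator.<String>naturalOrder());
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

If the API itself guaranteed an order, callers would not need a buffer of this
kind at all.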



(3) At various times in the past, we had some discussions about potentially
using a cursor-style API rather than an iterator, for large sets like the
one from JavaHeap.getObjects(), to avoid excessive object allocation in the
analyzer tool JVM.  How do you feel about that?
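
For discussion, here is a minimal sketch of what a cursor-style API for (3)
might look like. All names (HeapObjectCursor, ArrayBackedCursor, advance,
address, size) are hypothetical, and the parallel arrays stand in for data
read from a dump. One reusable cursor object exposes the fields of the
current entry, so walking the set allocates no per-entry objects.

```java
/** Hypothetical cursor over heap objects; the caller reuses one instance. */
interface HeapObjectCursor {
    /** Moves to the next entry; returns false when there are no more entries. */
    boolean advance();

    /** Address of the current entry. */
    long address();

    /** Size in bytes of the current entry. */
    long size();
}

/** Illustrative implementation backed by parallel primitive arrays. */
final class ArrayBackedCursor implements HeapObjectCursor {
    private final long[] addresses;
    private final long[] sizes;
    private int position = -1;   // before the first entry

    ArrayBackedCursor(long[] addresses, long[] sizes) {
        if (addresses.length != sizes.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        this.addresses = addresses;
        this.sizes = sizes;
    }

    @Override
    public boolean advance() {
        if (position + 1 >= addresses.length) {
            return false;        // stays on the last entry once exhausted
        }
        position++;
        return true;
    }

    @Override
    public long address() {
        return addresses[current()];
    }

    @Override
    public long size() {
        return sizes[current()];
    }

    private int current() {
        if (position < 0) {
            throw new IllegalStateException("advance() has not been called");
        }
        return position;
    }
}

final class CursorDemo {
    /** Sums entry sizes using only primitive reads from the cursor. */
    static long totalSize(HeapObjectCursor cursor) {
        long total = 0;
        while (cursor.advance()) {
            total += cursor.size();
        }
        return total;
    }

    public static void main(String[] args) {
        HeapObjectCursor cursor = new ArrayBackedCursor(
                new long[] {0x1000L, 0x1010L, 0x1028L},
                new long[] {16L, 24L, 32L});
        System.out.println(totalSize(cursor));
    }
}
```

The cost of this style is that values read from the cursor are only valid
until the next advance(), so a caller that wants to keep an entry has to copy
what it needs.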



-- Daniel --



Steve Poole <spoole167@googlemail.com> wrote on 2009-04-06 05:19:07 PM:
>
> On Mon, Apr 6, 2009 at 9:45 PM, Nicholas Sterling
> <Nicholas.Sterling@sun.com> wrote:
>
> >
> >
> > Carmine Cristallo wrote:
> >
> >> Hi all! A number of potential work items are starting to emerge from this
> >> thread. I'll try to enumerate them:
> >>
> >> 1. refactor the API to use generics. Basically, every method that now
> >> returns an Iterator should have its signature modified. And while we're at
> >> it... is Iterator<...> the right "thing" to return? Wouldn't a collection
> >> (List<...>?) suit better?
> >>
> >>
> > Ah, good point, although there might be some cases where the number of
> > values would be huge (like objects on the heap), and for those you probably
> > do just want to provide an Iterator.
> >
>
> The main problem with iterators is providing any sort of random access. You
> end up with repeated walks through the iterator. Having lists seemed like
> a good alternative. I took a copy of DTFJ last year and replaced the
> iterators with lists - then I easily added a jxpath
> (http://commons.apache.org/jxpath/) layer on top. It was pretty cool - if
> we agree on providing a random access mechanism of any sort it should be
> fairly simple to redo the jxpath experiment.
>
> >
> > That reminds me -- Steve, weren't you talking about an event-driven
> > (SAX-like) parser at one point, in which one specifies the classes of
> > interest in the target heap and the code to be executed for each? I suppose
> > that could, and probably should, be built on *top* of an Iterator.
> >
>
> Yes - I was wondering out loud if the "DOM" approach we have with the API is
> sufficient. It's obvious when you start implementing the underlying support
> for this sort of API just how important it is to be lazy. A "SAX"-like
> approach could possibly provide opportunities to be even lazier!
>
> >
> > Nicholas
> >
> >
> >> 2. give a better implementation - as the one suggested by Nicholas - to
> >> ImageThread#getRegister(). How about renaming it to "getRegisterMap()",
> >> returning a RegisterMap interface?
> >> 3. refactor the TCK to decouple the setup classes from the test classes, as
> >> suggested by Stuart.
> >>
> >> Steve... should we start to open Jira work items for the above activities?
> >>
> >>
> >>     Carmine
> >>
> >> On Mon, Apr 6, 2009 at 8:24 PM, Nicholas Sterling
> >> <Nicholas.Sterling@sun.com> wrote:
> >>
> >>> It's very nice being able to look at this javadoc -- thanks!
> >>>
> >>> It might help to have a little introductory text in some of the key classes
> >>> giving some context, something along the lines of the DTFJ example on your
> >>> web site that opens a core dump and iterates through the threads.
> >>>
> >>> I wonder if ImageThread should return interface RegisterSet, of which there
> >>> would be various implementations for various CPU types, each containing a
> >>> map from a Register enum to a RegisterValue.
> >>>
> >>> I hadn't realized until I started looking through this javadoc how much
> >>> easier the use of generics makes it to understand an API. For example,
> >>> under JavaClassLoader I see methods getCachedClasses() and
> >>> getDefinedClasses(), but I can't tell from their signatures whether they
> >>> return the same type or not. That info is in the method descriptions, but
> >>> it's a lot more work to flip back and forth between the Summary and the
> >>> Detail.
> >>>
> >>> Nicholas
> >>>
> >>>
> >>>
> >>> Steve Poole wrote:
> >>>
> >>>
> >>>
> >>>> On Mon, Apr 6, 2009 at 6:51 AM, Nicholas Sterling
> >>>> <Nicholas.Sterling@sun.com> wrote:
> >>>>
> >>>>> Great!  I've passed this on to the HotSpot folks for comment.
> >>>>>
> >>>>> I think I remember us talking about there being some provision for
> >>>>> accessing the vendor-specific VM constructs that implement the heap, etc.,
> >>>>> in addition to the Java objects in it. Will that be done by, for example,
> >>>>> casting a JavaVM to a HotSpotVM and using the latter's extra methods?
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>> That's the most obvious solution I think.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> Also, I'm seeing methods that return Iterator with no type (e.g. in
> >>>>> JavaMethod). Is that just a temporary placeholder which will ultimately
> >>>>> get a type?
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>> That's an interesting question. The reason for there being no type info is
> >>>> that the API was designed to compile and run on 1.4.2.
> >>>> We need to decide if that still makes sense. I know that 1.4 is out of
> >>>> support by Sun and IBM. What about Oracle?
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> Nicholas
> >>>>>
> >>>>>
> >>>>>
> >>>>> Steve Poole wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>> Well at last! - we actually have the API javadoc available - it's here:
> >>>>>>
> >>>>>> http://hudson.zones.apache.org/hudson/view/Kato/job/kato.api-head/javadoc/
> >>>>>>
> >>>>>> I'm certainly not going to hold this up as the greatest javadoc in the
> >>>>>> world but it's a good place to start. I do feel that we have finally
> >>>>>> arrived :-)
> >>>>>>
> >>>>>> The API has lots of "DTFJ"ness to it that needs to go but I'm really
> >>>>>> interested in initial reactions to the javadoc - is the form of the API
> >>>>>> what you expected?
> >>>>>>
> >>>>>>
> >>>>>> Moving on - there is still code needed to make the API work (we need to
> >>>>>> get the hprof support working) but we can make progress in the interim. I
> >>>>>> want to move quickly towards having a regular heartbeat where we are
> >>>>>> moving through the use cases that we have. To do that we need to get up to
> >>>>>> speed with the API shape as it stands today. Stuart has published some
> >>>>>> info on the API but it's not really sufficient for educational needs :-)
> >>>>>>
> >>>>>> Is it worth holding a conference call so that we can walk through the API
> >>>>>> to explain why it's the shape it is, or is everyone comfortable with just
> >>>>>> more doc?
> >>>>>>
> >>>>>> Cheers
> >>>>>>
> >>>>>> Steve
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >>
> >