incubator-kato-spec mailing list archives

From Nicholas Sterling <Nicholas.Sterl...@Sun.COM>
Subject Re: Kato API javadoc
Date Wed, 08 Apr 2009 06:37:06 GMT


Daniel Julin wrote:
> Some thoughts about these iterators, also based on earlier experiences with
> the DumpAnalyzer tool:
>
>
> (1) The set of entries returned by some of these iterators can be quite
> large. Especially for JavaHeap.getObjects(), and also, to a lesser degree,
> JavaClassLoader.getDefinedClasses(), ImageModule.getSymbols(), etc. If we
> go to a model that allows random access, can we do it in a way that avoids
> forcing a pre-loading of every instance of every entry in the set onto the
> heap of the JVM in which we are running the analyzer tool?
>
Yes, and we should be very careful about assuming that a particular list 
will be small in all future VMs.

Rather than returning an Iterator for the objects on the heap, why not 
make JavaHeap extend Iterable?  That way we could write

    for ( JavaObject object : heap ) { ... }

without having to call getObjects().  If we are using generics, I 
believe this would work for both objects and sections, because we would 
extend both Iterable<JavaObject> and Iterable<ImageSection>,  i.e.

    for ( JavaObject   object  : heap ) { ... }
    for ( ImageSection section : heap ) { ... }
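
A minimal sketch of the Iterable idea, with one caveat: Java does not let a type inherit Iterable with two different type arguments, so the sketch makes the heap an Iterable<JavaObject> and reaches sections through a named accessor. The class bodies, the constructor, and the sections() method are assumptions for illustration, not the Kato API. The iterators build each entry only when next() is called, so nothing is pre-loaded.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-ins for the Kato types; only the names come from the thread.
final class JavaObject {
    final long address;
    JavaObject(long address) { this.address = address; }
}

final class ImageSection {
    final String name;
    ImageSection(String name) { this.name = name; }
}

// A type can inherit Iterable with only one type argument, so the heap is
// Iterable<JavaObject> and exposes its sections through a named accessor.
final class JavaHeap implements Iterable<JavaObject> {
    private final long[] addresses;      // assumed: object addresses read from the dump
    private final String[] sectionNames; // assumed: section names read from the dump

    JavaHeap(long[] addresses, String[] sectionNames) {
        this.addresses = addresses.clone();
        this.sectionNames = sectionNames.clone();
    }

    // Lazy: a JavaObject is created only when next() is called.
    @Override
    public Iterator<JavaObject> iterator() {
        return new Iterator<JavaObject>() {
            private int next = 0;
            @Override public boolean hasNext() { return next < addresses.length; }
            @Override public JavaObject next() {
                if (!hasNext()) throw new NoSuchElementException();
                return new JavaObject(addresses[next++]);
            }
        };
    }

    // Hypothetical accessor: sections get their own Iterable, so a second
    // for-each loop is still possible.
    Iterable<ImageSection> sections() {
        return () -> new Iterator<ImageSection>() {
            private int next = 0;
            @Override public boolean hasNext() { return next < sectionNames.length; }
            @Override public ImageSection next() {
                if (!hasNext()) throw new NoSuchElementException();
                return new ImageSection(sectionNames[next++]);
            }
        };
    }
}
```

With this shape the first loop reads as written above, and the second becomes for ( ImageSection section : heap.sections() ) { ... }.
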

> (2) The API does not actually define any specific order in which the
> entries are returned by the various iterators, and I wish it would. For
> example:
>
> * When walking through all the objects on a heap, some heap analysis
> functions (and some file formats like PHD) require that the objects be
> returned in the order corresponding to the increasing addresses where they
> are located on that heap.  It turns out that, in all the implementations
> that I've seen so far, JavaHeap.getObjects() does provide that order, but
> nowhere does it say that I can count on it in every future implementation.
>   
Of course, for some future collector it might be difficult to produce
the objects in that order.  What if there were a way of requesting the
order you want, and a provider that can't supply it could throw an
exception saying that the ordering is unimplemented?
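
One way the "ask for an order, or get an exception" idea could look is sketched below. Everything in it is hypothetical: the HeapOrder enum, the OrderedHeap interface, and the two providers are made up for illustration, and addresses are plain Long values rather than real heap objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical: the orderings a caller may ask for.
enum HeapOrder { ANY, ASCENDING_ADDRESS }

// Hypothetical provider interface; getObjects(HeapOrder) is not an existing API method.
interface OrderedHeap {
    // Returns object addresses in the requested order, or throws
    // UnsupportedOperationException if this provider cannot supply that order.
    Iterator<Long> getObjects(HeapOrder order);
}

// A provider whose natural walk order is already by increasing address.
// The caller is assumed to pass the addresses in ascending order.
final class LinearHeap implements OrderedHeap {
    private final List<Long> addresses;

    LinearHeap(Long... sortedAddresses) {
        this.addresses = new ArrayList<>(Arrays.asList(sortedAddresses));
    }

    @Override
    public Iterator<Long> getObjects(HeapOrder order) {
        return addresses.iterator(); // satisfies both ANY and ASCENDING_ADDRESS
    }
}

// A provider that can only walk the heap in its own internal order.
final class RegionHeap implements OrderedHeap {
    private final List<Long> addresses;

    RegionHeap(Long... addresses) {
        this.addresses = new ArrayList<>(Arrays.asList(addresses));
    }

    @Override
    public Iterator<Long> getObjects(HeapOrder order) {
        if (order != HeapOrder.ANY) {
            throw new UnsupportedOperationException("ordering not implemented: " + order);
        }
        return addresses.iterator();
    }
}
```

A tool that needs address order then fails fast on a provider that cannot deliver it, instead of silently depending on an unspecified order.
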
> * When writing unit tests for analysis tools, it is very convenient to be
> able to count on a stable order returned by every iterator. Many tests can
> be written by printing all the output from some function, and simply
> comparing that output from one run of the tool to the next to detect
> regressions. This stable order was actually not provided by many iterators
> in current DTFJ implementations, and its absence caused considerable
> headache in the early days of the implementation of the DumpAnalyzer tool.
> We ended up adding our own sort routines on top of many DTFJ iterators...
> which of course conflicts with point (1) above, to avoid pre-loading every
> entry.
>
>
> (3) At various times in the past, we had some discussions about potentially
> using a cursor-style API rather than an iterator, for large sets like the
> one from JavaHeap.getObjects(), to avoid excessive object allocation in the
> analyzer tool JVM.  How do you feel about that?
>
>   
Between fast allocation techniques, fast GC for short-lived objects, and 
the use of escape analysis to enable stack allocation, I'm not sure 
there's a big win in a cursor-style API.  Is the main disadvantage of a 
cursor-style API that you have to copy objects if you want to remember 
them, as opposed to just hanging on to references as they fly by?
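
To make the trade-off concrete, here is a rough sketch of what a cursor-style API might look like. The ObjectCursor, ObjectSnapshot, and CursorDemo names are invented for illustration, and the arrays stand in for data decoded from a dump. The cursor allocates nothing per object, but anything the caller wants to remember has to be copied out explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cursor over heap objects: one reusable view, no per-object allocation.
final class ObjectCursor {
    private final long[] addresses; // assumed: decoded object addresses
    private final int[] sizes;      // assumed: decoded object sizes in bytes
    private int position = -1;      // before the first object until advance() is called

    ObjectCursor(long[] addresses, int[] sizes) {
        this.addresses = addresses;
        this.sizes = sizes;
    }

    // Moves to the next object; returns false once the heap is exhausted.
    boolean advance() {
        if (position + 1 >= addresses.length) return false;
        position++;
        return true;
    }

    // Accessors describe the object at the current position only.
    long address() { return addresses[position]; }
    int size() { return sizes[position]; }
}

// Immutable copy the caller must make to keep an object after the cursor moves on.
final class ObjectSnapshot {
    final long address;
    final int size;

    ObjectSnapshot(long address, int size) {
        this.address = address;
        this.size = size;
    }
}

final class CursorDemo {
    // Remembers every object of at least minSize bytes; each one costs an explicit copy.
    static List<ObjectSnapshot> largeObjects(ObjectCursor cursor, int minSize) {
        List<ObjectSnapshot> kept = new ArrayList<>();
        while (cursor.advance()) {
            if (cursor.size() >= minSize) {
                kept.add(new ObjectSnapshot(cursor.address(), cursor.size()));
            }
        }
        return kept;
    }
}
```

With an Iterator the loop body would simply keep the reference it was handed; with the cursor the copy in largeObjects() is the price of remembering anything.
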

Nicholas



>
> -- Daniel --
>
>
>
> Steve Poole <spoole167@googlemail.com> wrote on 2009-04-06 05:19:07 PM:
>
>> On Mon, Apr 6, 2009 at 9:45 PM, Nicholas Sterling
>> <Nicholas.Sterling@sun.com> wrote:
>>
>>> Carmine Cristallo wrote:
>>>
>>>> Hi all! A number of potential work items are starting to emerge from
>>>> this thread. I'll try to enumerate them:
>>>>
>>>> 1. refactor the API to use generics. Basically, every method that now
>>>> returns an Iterator should have its signature modified. And while
>>>> we're at it... is Iterator<...> the right "thing" to return? Wouldn't
>>>> a collection (List<...>?) suit better?
>>>>
>>> Ah, good point, although there might be some cases where the number of
>>> values would be huge (like objects on the heap), and for those you
>>> probably do just want to provide an Iterator.
>>>
>>>       
>> The main problem with iterators is providing any sort of random access.
>> You end up with repeated walks through the iterator. Having lists seemed
>> like a good alternative. I took a copy of DTFJ last year and replaced
>> the iterators with lists - then I easily added a jxpath
>> (http://commons.apache.org/jxpath/) layer on top. It was pretty cool -
>> if we agree on providing a random access mechanism of any sort it should
>> be fairly simple to redo the jxpath experiment.
>>
>>     
>>> That reminds me -- Steve, weren't you talking about an event-driven
>>> (SAX-like) parser at one point, in which one specifies the classes of
>>> interest in the target heap and the code to be executed for each? I
>>> suppose that could, and probably should, be built on *top* of an
>>> Iterator.
>>>
>>>       
>> Yes - I was wondering out loud if the "DOM" approach we have with the
>> API is sufficient. It's obvious when you start implementing the
>> underlying support for this sort of API just how important it is to be
>> lazy. A "SAX"-like approach could possibly provide opportunities to be
>> even lazier!
>>
>>     
>>> Nicholas
>>>
>>>> 2. give a better implementation - as the one suggested by Nicholas -
>>>> to ImageThread#getRegister(). How about renaming it to
>>>> "getRegisterMap()", returning a RegisterMap interface?
>>>> 3. refactor the TCK to decouple the setup classes from the test
>>>> classes, as suggested by Stuart.
>>>>
>>>> Steve... should we start to open Jira work items for the above
>>>> activities?
>>>>
>>>>     Carmine
>>>>
>>>> On Mon, Apr 6, 2009 at 8:24 PM, Nicholas Sterling
>>>> <Nicholas.Sterling@sun.com> wrote:
>>>>
>>>>> It's very nice being able to look at this javadoc -- thanks!
>>>>>
>>>>> It might help to have a little introductory text in some of the key
>>>>> classes giving some context, something along the lines of the DTFJ
>>>>> example on your web site that opens a core dump and iterates through
>>>>> the threads.
>>>>>
>>>>> I wonder if ImageThread should return interface RegisterSet, of which
>>>>> there would be various implementations for various CPU types, each
>>>>> containing a map from a Register enum to a RegisterValue.
>>>>>
>>>>> I hadn't realized until I started looking through this javadoc how
>>>>> much easier the use of generics makes it to understand an API. For
>>>>> example, under JavaClassLoader I see methods getCachedClasses() and
>>>>> getDefinedClasses(), but I can't tell from their signatures whether
>>>>> they return the same type or not. That info is in the method
>>>>> descriptions, but it's a lot more work to flip back and forth between
>>>>> the Summary and the Detail.
>>>>>
>>>>> Nicholas
>>>>>
>>>>>
>>>>>
>>>>> Steve Poole wrote:
>>>>>
>>>>>> On Mon, Apr 6, 2009 at 6:51 AM, Nicholas Sterling
>>>>>> <Nicholas.Sterling@sun.com> wrote:
>>>>>>
>>>>>>> Great!  I've passed this on to the HotSpot folks for comment.
>>>>>>>
>>>>>>> I think I remember us talking about there being some provision for
>>>>>>> accessing the vendor-specific VM constructs that implement the
>>>>>>> heap, etc., in addition to the Java objects in it. Will that be
>>>>>>> done by, for example, casting a JavaVM to a HotSpotVM and using the
>>>>>>> latter's extra methods?
>>>>>>>
>>>>>> That's the most obvious solution I think.
>>>>>>
>>>>>>> Also, I'm seeing methods that return Iterator with no type (e.g. in
>>>>>>> JavaMethod). Is that just a temporary placeholder which will
>>>>>>> ultimately get a type?
>>>>>>>
>>>>>> That's an interesting question. The reason for there being no type
>>>>>> info is that the API was designed to compile and run on 1.4.2.
>>>>>> We need to decide if that still makes sense. I know that 1.4 is out
>>>>>> of support by Sun and IBM. What about Oracle?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> Nicholas
>>>>>>>
>>>>>>> Steve Poole wrote:
>>>>>>>
>>>>>>>> Well at last! - we actually have the API javadoc available - it's
>>>>>>>> here
>>>>>>>>
>>>>>>>> http://hudson.zones.apache.org/hudson/view/Kato/job/kato.api-head/javadoc/
>>>>>>>>
>>>>>>>> I'm certainly not going to hold this up as the greatest javadoc in
>>>>>>>> the world but it's a good place to start. I do feel that we have
>>>>>>>> finally arrived :-)
>>>>>>>>
>>>>>>>> The API has lots of "DTFJ"ness to it that needs to go but I'm
>>>>>>>> really interested in initial reactions to the javadoc - is the
>>>>>>>> form of the API what you expected?
>>>>>>>>
>>>>>>>>
>>>>>>>> Moving on - there is still code needed to make the API work (we
>>>>>>>> need to get the hprof support working) but we can make progress in
>>>>>>>> the interim. I want to move quickly towards having a regular
>>>>>>>> heartbeat where we are moving through the use cases that we have.
>>>>>>>> To do that we need to get up to speed with the API shape as it
>>>>>>>> stands today. Stuart has published some info on the API but it's
>>>>>>>> not really sufficient for educational needs :-)
>>>>>>>>
>>>>>>>> Is it worth holding a conference call so that we can walk through
>>>>>>>> the API to explain why it's the shape it is or is everyone
>>>>>>>> comfortable with just more doc?
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>> Steve
