From: Steve Poole <spoole...@googlemail.com>
Subject: JSR 326 and Apache Kato - A "state of the nation" examination
Date: Wed, 13 Jan 2010 21:03:49 GMT
Greetings all,

Discussions this year have got off to a good start, and we're also really
close to providing that first driver which contains the changes we've
discussed over time. With that in mind, I think it's worth examining the
past, present and future of this work.

*A brief recap*

We've been working on this JSR for some time - since 5 August 2008
<http://jcp.org/en/jsr/detail?id=326>, to be precise.

At the start of the project we expected to be able to develop what I called
the "legs" under the code contributed by IBM. These "legs" were intended to
map the API to the dumps available from a Sun JVM, including being able to
read Hotspot data from a core file. We also expected to move quickly towards
discussing the shape of the future: how to deal with titanic dumps, and how
not to have dumps at all.

Most of this didn't happen. We did write an HPROF reader, but we didn't
manage to develop a core file reader for the Hotspot JVM. In that regard we
also examined the Serviceability Agent API
<http://www.usenix.org/events/jvm01/full_papers/russell/russell_html/index.html>,
but there were too many restrictions on its use and operating environment. It
turned out that it was not feasible for Apache Kato to develop a core file
reader for Hotspot, due to licensing issues and, more importantly, a lack of
Hotspot skills.

At that point we were somewhat stuck. (I did discuss this problem privately
with various JVM vendors, but we did not reach a resolution.)

All was not lost - we wrote a prototype (in Python!) of a new dump that used
JVMTI. The dump was the first to contain local variables. We hooked it up to
the Java debugger through our JDI connector to show that you could use a
familiar interface to analyse your problem. Java DBX
<http://en.wikipedia.org/wiki/Dbx_%28debugger%29> for core files had arrived.

We also tacked on a JXPath-based layer <http://commons.apache.org/jxpath/>
(now in the KatoView tool) that allowed you to query the API without writing
reams of code.
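
A quick taste of what that layer buys you: the sketch below queries an
invented object graph through JXPath. RuntimeInfo and ThreadInfo are made-up
beans standing in for API objects - this is not the Kato API itself - but the
query style is the same.

    import org.apache.commons.jxpath.JXPathContext;

    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.List;

    public class JXPathDemo {

        // Invented stand-ins for API objects such as runtimes and threads.
        public static class ThreadInfo {
            private final String name;
            public ThreadInfo(String name) { this.name = name; }
            public String getName() { return name; }
        }

        public static class RuntimeInfo {
            public List<ThreadInfo> getThreads() {
                return Arrays.asList(new ThreadInfo("main"),
                                     new ThreadInfo("gc-worker"));
            }
        }

        public static void main(String[] args) {
            JXPathContext ctx = JXPathContext.newContext(new RuntimeInfo());
            // One XPath-style expression replaces a hand-written loop and filter.
            Iterator<?> names = ctx.iterate("threads[starts-with(name,'gc')]/name");
            while (names.hasNext()) {
                System.out.println(names.next()); // prints "gc-worker"
            }
        }
    }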

We took JSR 326 to San Francisco and showed people what we had at JavaOne
BOF4870 <http://cwiki.apache.org/confluence/display/KATO/BOF4870>, and I got
to meet a few of you face to face for the first time.

After JavaOne we rewrote the Python prototype in C and started to bring the
first Early Draft Review
<http://cwiki.apache.org/KATO/jsr326specification.data/jsr326-edr-1-2009-08-21.pdf>
together, although it took a long time to get the EDR onto the JCP site.
Mostly that was down to my learning a new process and dealing with a
licensing concern, where I learned about the concept of "collective
copyright" <http://en.wikipedia.org/wiki/Copyright_collective>.

After the EDR was out we started work on the first code release from Apache
Kato (all new stuff to learn). We still hadn't resolved the mismatch between
what data the API said it could offer and our inability to provide said data
(i.e. no Hotspot support). The answer was to factor out the relationship
between Java entities and native-code entities and make it optional. Now
those dumps that know nothing about processes or address spaces or even
pointers are not required to fake them.
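
To make that change concrete, here is a rough sketch of its shape. These
names are illustrative only - they are not the actual Kato API:

    // Illustrative names only - not the actual Kato API.

    /** Thrown when a dump type simply does not carry the requested data. */
    class DataUnavailableException extends Exception { }

    /** A native-world view that only some dump types (e.g. core files) supply. */
    interface NativeProcess {
        long getPointerSize() throws DataUnavailableException;
    }

    interface JavaRuntimeView {
        /**
         * The now-optional link from the Java world to the native world. An
         * HPROF-style dump, which knows nothing of processes or address
         * spaces, throws here instead of faking them.
         */
        NativeProcess getNativeProcess() throws DataUnavailableException;
    }

    class RuntimeReporter {
        static void describe(JavaRuntimeView runtime) {
            try {
                System.out.println("pointer size: "
                        + runtime.getNativeProcess().getPointerSize());
            } catch (DataUnavailableException e) {
                // The dump is still perfectly usable for Java-level analysis.
                System.out.println("no native data in this dump type");
            }
        }
    }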

Finally, and quite recently, we added to the API a first attempt at a
standard dump trigger mechanism, and we added an additional dump type that
will help us as we develop the snapshot and optionality designs.
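
To show the kind of thing I mean by a trigger mechanism, here is one
hypothetical shape it could take. The names are mine for illustration, not
what the driver actually ships:

    import java.io.File;
    import java.io.IOException;

    // Hypothetical sketch of a standard dump-trigger mechanism.
    interface DumpTrigger {
        /** Request a dump of the named type, e.g. "snapshot" or "hprof". */
        void requestDump(String dumpType, File target) throws IOException;
    }

    class LowMemoryHandler {
        private final DumpTrigger trigger;

        LowMemoryHandler(DumpTrigger trigger) { this.trigger = trigger; }

        void onLowMemory() throws IOException {
            // Capture state programmatically at the moment of the problem,
            // rather than relying on a vendor-specific command-line flag.
            trigger.requestDump("snapshot", new File("app.snapshot"));
        }
    }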

*Today*

Let's look at the present. It's January 2010 and there is a foot of snow
outside my window, which is unusual for where I live. What else is unusual is
that our Expert Group has been very quiet. It's time to examine our situation
and discuss what else we need to do to make this project a success.

At the highest level we need at least four things:

   1. A design that addresses our requirements.
   2. A matching implementation that supports a high percentage of that design.
   3. Adoption by JVM vendors.
   4. A user community.



*Design*

Do you know what our requirements are? The original proposal for Kato is here
<http://wiki.apache.org/incubator/KatoProposal> and the JSR is here
<http://jcp.org/en/jsr/detail?id=326>.

Are these documents saying what you expected and want? The Early Draft Review
<http://cwiki.apache.org/KATO/jsr326specification.data/jsr326-edr-1-2009-08-21.pdf>
outlines more.


*Implementation*

We're going to provide a binary driver for you all to use as soon as we
possibly can, but you can check out the code and try building and using it
now. We still have a technical hurdle: we are hampered by our inability to
make JVM modifications where necessary. How should we resolve this? Remember
that we have to be able to provide a Reference Implementation to match the
specification. We can legitimately justify leaving some edge conditions
unimplemented, but it's no use to anyone if key parts of the API are not
implemented. Having said that, it is reasonable to consider a middle ground
where we specify a new JVM interface that we require JVM vendors to provide.
It depends on technical circumstances, but that approach has more flexibility
in implementation: it's likely to be easier to ask a JVM vendor to provide
data to a new standardized API. My current thinking is that for now we
minimise this situation as much as possible and live with slower
implementations, at least until we've resolved the outstanding questions of
adoption by JVM vendors.

I think we've come to realize that the desire to be able to extract
information about a Hotspot JVM from a core file is not going to be
satisfied, and is actually not necessary. We've said right from the beginning
that dump sizes are growing and that we need to consider having smaller
dumps. Rather than finding a way to read Hotspot data from a core file, we
can move directly to defining and implementing what a Snapshot Dump mechanism
really is. My expectation is that we will only need JVM support in the form
of a yet-to-be-designed low-level API which we can use to extract information
from a running JVM. I really don't know what form that API would take: it
might be something like JVMTI, it might be a set of native methods, or it may
just be new Java classes that the JVM vendor replaces.
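
Purely as a sketch of that last option, a vendor-replaceable extraction class
might look something like this. Every name here is invented and nothing has
been designed yet:

    import java.io.IOException;
    import java.io.OutputStream;

    // Invented sketch of a vendor-replaceable, low-level extraction API.
    public abstract class RuntimeSnapshotter {

        /** Each JVM vendor would ship and register a concrete subclass. */
        public static RuntimeSnapshotter getDefault() {
            // A ServiceLoader or system-property lookup would go here; this
            // stub keeps the sketch self-contained.
            throw new UnsupportedOperationException(
                    "no vendor implementation installed");
        }

        /** Walk the heap, writing one record per live object to the sink. */
        public abstract void writeHeap(OutputStream sink) throws IOException;

        /** Capture each thread's stack, including locals where available. */
        public abstract void writeThreads(OutputStream sink) throws IOException;
    }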

What drives this discussion, and hence defines what we need from the JVM
vendors, is having the Snapshot concept clear in everyone's head. Since this
is new to everyone, I want to provide an implementation that embodies the
concepts as soon as possible so we can argue this through from a practical,
hands-on standpoint.


*Adoption by JVM Vendors*

Adoption by JVM vendors - and by that we mainly mean Sun and Oracle, since
IBM already has a similar implementation - is predicated on usefulness and on
the need for JVM-specific code. If there is no requirement for JVM-specific
changes then adoption is not really an issue. If we do have to have JVM
changes (and we will in the end) then we need either Sun/Oracle or another
JVM vendor to develop them. Otherwise we have to find a third party who is
willing to develop a GPL-licensed extension to OpenJDK to support our
requirements.

We're going to have to wait a few weeks until the Oracle/Sun acquisition is
completed before we can expect a sensible answer to the first question. It's
also possible that we could go straight to the OpenJDK folk and see if they
want to play. In either case, though, we would need a good idea of the type
of JVM changes and/or new data access we need.

*User Community*

We need to agree on who our users actually are. I know that there are various
views, but let's get it clear. My view is that our users are the tools
vendors and the more expert application programmers out there. This API may
make life easier for the JVM vendor, but only in passing. The major objective
is to help programmers solve their own problems, not to help JVM vendors fix
bugs in the JVM. Do you agree?

What else makes a user community? Having something to use is high on the
list. We need to get what we have out the door and being used. It's not as
simple as that, of course: we need documentation and usage examples, and most
importantly we need to be able to offer a compelling reason for using our
API. Right now we're light on all of these.



*What the future holds*

I can't say how much I appreciate all the time and effort that people have
expended so far on this project. It is a shame that we've not had the larger
buy-in where we expected it, but that may change. I intend to keep asking.

Right now, though, I need to ask more of you: you as a subscriber to this
mailing list, you as a member of the Expert Group, you as a contributor or
committer to Apache Kato, and you as a potential user of JSR 326. I need you
to tell me whether we are on the right track: are we going in the right
direction or not? If we are doing what you expected, say so as well: it's
good to get confirmation. If we're not addressing the issues you think need
to be talked about, say so. If you can help with documentation, use-cases,
evangelism, coding, testing, or anything else, just say so.


In my view, the future of this project ranges from being *just* the place
where a new, modern replacement for HPROF is developed, all the way through
to delivering on the objectives we set ourselves in 2008. I need your help
and active involvement right now in determining our actual future.

Thanks

-- 
Steve
