incubator-kato-spec mailing list archives

From Steve Poole <spoole...@googlemail.com>
Subject Re: Design guidelines
Date Tue, 03 Mar 2009 17:01:47 GMT
On Tue, Feb 17, 2009 at 4:38 PM, Carmine Cristallo <
carminecristallo@gmail.com> wrote:

> The main purpose of this email is to outline some of the design
> considerations to be taken into account during the development of the
> API and to stimulate discussion about them.
>
> Some of the following sections will be better understood after a quick
> look at the IBM DTFJ API, which will constitute the seed of the IBM
> contribution to the Apache Kato project. Such sections will be clearly
> marked.
>
> 1 General Principles
>
> The following principles could be used as overall quality criteria for
> the design of the API.
>
> 1.1 User Story driven: the design of the API will be driven by user
> stories. As a general statement, no information should be exposed if
> there is no user story justifying the need for it.
>
> 1.2 Consumability: users of the API should be able to easily
> understand how to use the API from the API description and from common
> repeated design patterns. The amount of boilerplate code necessary to
> get at any useful information needs to be monitored. The user stories
> supporting the API will aid in keeping the boilerplate down, but it's
> important to state that the more understandable the API is, the easier
> its adoption will be.
>
> 1.3 Consistency: common guidelines and patterns should be followed
> when designing the API. For example, all the calls returning multiple
> values must have a common way of doing it (e.g. Lists, Iterators or
> arrays).
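
To make 1.3 concrete, here's a minimal sketch of one such convention (all names are illustrative, not real Kato types): every multi-valued call returns a java.util.List, never sometimes an array and sometimes an Iterator.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: the point is the single convention, not the data.
public class ConsistencySketch {
    // Every multi-valued accessor returns a List...
    public static List<String> getThreadNames() {
        return Arrays.asList("main", "Finalizer");
    }
    // ...including this one, rather than an Iterator or an array.
    public static List<String> getHeapNames() {
        return Arrays.asList("default");
    }
}
```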
>
> 1.4 Common tasks should be easy to implement: care should be taken to
> design the API in such a way that common user stories have a simple
> implementation scenario. For example, in the DTFJ API, in most cases
> there will be only one JavaRuntime per Image, and there should be a
> more direct way of getting it than iterating through the AddressSpaces
> and Processes.
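
For comparison, the long-way walk that 1.4 wants to shortcut could look roughly like this; the Image -> AddressSpace -> Process -> Runtime nesting mirrors DTFJ, but every type and method name below is a simplified stand-in:

```java
import java.util.Iterator;

// Simplified stand-ins for the DTFJ-style object graph; not the real API.
public class RuntimeFinder {
    public interface Image { Iterator<AddressSpace> getAddressSpaces(); }
    public interface AddressSpace { Iterator<Process> getProcesses(); }
    public interface Process { Iterator<Runtime> getRuntimes(); }
    public interface Runtime {}

    // The boilerplate every client currently has to write to reach the
    // (usually unique) runtime.
    public static Runtime firstRuntime(Image image) {
        Iterator<AddressSpace> spaces = image.getAddressSpaces();
        while (spaces.hasNext()) {
            Iterator<Process> procs = spaces.next().getProcesses();
            while (procs.hasNext()) {
                Iterator<Runtime> runtimes = procs.next().getRuntimes();
                if (runtimes.hasNext()) {
                    return runtimes.next();
                }
            }
        }
        return null; // the dump contains no Java runtime
    }
}
```

A direct convenience accessor on the image could simply wrap this walk for the common single-runtime case.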
>
> 1.5 Backward compatibility: the client code written against a given
> release of the API should remain source code compatible with any
> future release of the API.
>
>
>
> 2 Exception handling model
>
> In the domain of postmortem analysis, the following types of
> exceptions can occur:
>
> 2.1 File Access Exceptions: reported when an error occurs opening,
> seeking, reading, writing or closing any of the files which constitute
> the dump or are generated as a result of processing the dump.
> Applications are expected to respond to this type of exception by
> informing their users that they should correct the problem with their
> file system (e.g. getting the file name right).
>
> 2.2 File Format Exceptions: reported when data is correctly read from
> an encoded file, but that data is not compatible with the encoding
> rules or syntax. Applications are expected to respond to these
> exceptions by informing their users that the file is corrupt and
> further processing is impossible.
>
> 2.3 Operation Not Supported Exceptions: reported when the type of dump
> file being analysed does not support the invoked API.
>
> 2.4 Memory Access Exceptions: thrown when the address and length of a
> data read request does not lie entirely within one of the valid
> address ranges.
>
> 2.5 Corrupt Data Exceptions: reported when data is correctly read from
> a file but has a value incompatible with its nature. Corruption is to
> be considered a normal event in processing postmortem dumps, therefore
> such exceptions are not to be treated as error conditions.
>
> 2.6 Data Not Available Exceptions: reported when the requested
> information is not contained within the specific dump being analysed.
> As for the previous case, this is not to be seen as an error
> condition.
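
For what it's worth, the six cases above suggest a small checked-exception hierarchy along these lines; every class name here is a placeholder, not a proposal for the final spelling:

```java
// Sketch of a checked-exception hierarchy for the cases above; the
// class names are placeholders only.
public class KatoExceptions {
    public static class KatoException extends Exception {
        public KatoException(String message) { super(message); }
    }
    // 2.2: the bytes were read fine but violate the encoding rules.
    public static class FileFormatException extends KatoException {
        public FileFormatException(String message) { super(message); }
    }
    // 2.4: a read fell outside every valid address range.
    public static class MemoryAccessException extends KatoException {
        public MemoryAccessException(long address) {
            super("no valid range covers 0x" + Long.toHexString(address));
        }
    }
    // 2.5 and 2.6: "normal" events, per the discussion that follows.
    public static class CorruptDataException extends KatoException {
        public CorruptDataException(String message) { super(message); }
    }
    public static class DataUnavailableException extends KatoException {
        public DataUnavailableException(String message) { super(message); }
    }
}
```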
>
> Exception handling in DTFJ is a major source of struggle. Almost every
> call to the DTFJ API throws one exception of the last two types, or
> both. There's no question that such events are better handled with
> checked exceptions than with unchecked ones. On the other hand, the
> fact that some objects retrieved from a dump can be corrupted or
> unavailable is a condition intrinsic to every API call. Handling such
> conditions with checked exceptions would put the burden of handling
> them onto the client code, leading to almost every API call being
> wrapped in a try/catch block. As a side effect, it has been noted from
> past experience that in such situations client code tends to take a
> form like:
>
> public void clientMethod1() {
>      try {
>              katoObject1.methodA();
>              katoObject2.methodB();
>              katoObject3.methodC();
>      } catch (KatoException ke) {
>              ...
>      }
> }
>
> rather than:
>
> public void clientMethod2() {
>      try {
>              katoObject1.methodA();
>      } catch (KatoException ke1) {
>              ...
>      }
>      try {
>              katoObject2.methodB();
>      } catch (KatoException ke2) {
>              ...
>      }
>      try {
>              katoObject3.methodC();
>      } catch (KatoException ke3) {
>              ...
>      }
> }
>
> and this can lead to poor debuggability of the client code.
>
> It is also true that in very few cases will the client code need to
> implement different behaviours for the Data Unavailable and the
> Corrupt Data cases: most of the time they will be treated in the same
> way, and the corrupt data, when available, will just be ignored. It
> would make sense, therefore, to group the two cases under a single
> name: let's define as "Invalid Data" any situation where either the
> data is not available or it is corrupted. So the key questions become:
> "Does it make sense to think of a way to handle the Invalid Data case
> without the use of exceptions? If yes, how?"
>
> One possible solution to this problem could be to reserve the null
> return value, in every API call, for Invalid Data: an API call returns
> null if and only if the data being requested is either unavailable or
> corrupted. To discriminate between the two cases, the client code could
> call a specific errno-like API which returns the corrupt data of the
> latest API call, or null if the data was unavailable. Most of the time,
> the client code would therefore look similar to:
>
> public void clientMethod1() {
>      KatoThing value;
>      value = katoObject1.methodA();
>      if (value == null) {
>              // handle the invalid data
>      }
> }
>
> although, in a small number of cases, the code might be more similar to
> this:
>
> public void clientMethod2() {
>      KatoThing value;
>      value = katoObject1.methodA();
>      if (value == null) {
>              CorruptData cd = KatoHelper.getLastCorruptData();
>              if (cd == null) {
>                      // handle the data unavailable case
>              } else {
>                      // handle the corrupt data case
>              }
>      }
> }
>
> As a side effect, this solution would imply that primitive types
> cannot be used as return values; their corresponding object
> wrappers would need to be used instead.
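
One way the errno-like helper could be implemented (purely a sketch, all names are made up; recordCorrupt/recordUnavailable would be internal to the dump reader and are public here only to drive the example) is to record the most recent CorruptData in a ThreadLocal before returning null:

```java
// Sketch of the errno-like helper; every name here is illustrative.
public class KatoHelper {
    public static class CorruptData {
        private final String detail;
        public CorruptData(String detail) { this.detail = detail; }
        public String getDetail() { return detail; }
    }

    // Per-thread "errno" slot: the reader fills it in just before
    // returning null from a failed accessor.
    private static final ThreadLocal<CorruptData> lastCorrupt = new ThreadLocal<>();

    public static void recordCorrupt(CorruptData cd) { lastCorrupt.set(cd); }
    public static void recordUnavailable() { lastCorrupt.set(null); }

    // The errno-like call from the text: corrupt data of the latest
    // failed call, or null if the data was simply unavailable.
    public static CorruptData getLastCorruptData() { return lastCorrupt.get(); }
}
```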
>

Maybe having nulls at all is not a good idea - see here
http://developers.slashdot.org/article.pl?sid=09/03/03/1459209&from=rss


>
>
> 3 Optionality
>
> The Kato API will be designed to support different types of dump
> formats. Examples of such formats are HPROF for Sun VMs, and system
> dumps, javacores and PHD for IBM VMs.
>
> Different dump formats expose different information, so if we design
> the API as a monolithic block, there will be cases in which some parts
> of it – more or less large, depending on the dump format – may not be
> implemented.
> Although the "Operation Not Supported Exception" case described above
> does provide some support for these cases, we certainly need a better
> mechanism to support optionality.
> One possible solution lies in the observation that we don't really
> need to design for optionality at method level: normally, dump
> formats tend to focus on one or more "views" of the process that
> generated them. Examples of these views are:
>
> 3.1 Process view: formats that support this view expose information
> like command line, environment, native threads and locks, stack
> frames, loaded libraries, memory layout, symbols, etc. System dumps
> normally expose nearly all of this data.
>
> 3.2 Java Runtime view: formats supporting this view expose information
> like VM args, Java Threads, Java Monitors, classloaders, heaps, heap
> roots, compiled methods, etc. HPROF is an example of a format that
> supports this view.
>
> 3.3 Java Heap view: formats supporting this view expose Java classes,
> objects and their relationships. IBM PHD is an example of a dump
> format supporting this view, as is Sun's HPROF.
>
> The API should be designed so that a given file format can support
> one or more of these views, and so that new views can be plugged in.
> Inside each view, it could be reasonable to provide a further level
> of granularity involving Operation Not Supported Exceptions.
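
A hedged sketch of what view-level optionality could look like: the dump advertises the views it supports, and clients ask for one by interface instead of probing individual methods. All names below are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative names only; the real API would pick its own spelling.
public class DumpSketch {
    public interface View {}
    public interface ProcessView extends View { /* command line, libraries, ... */ }
    public interface JavaRuntimeView extends View { /* threads, monitors, ... */ }
    public interface JavaHeapView extends View { /* objects, references, ... */ }

    private final Map<Class<? extends View>, View> views = new HashMap<>();

    // Each dump reader registers only the views its format supports.
    public <T extends View> void register(Class<T> type, T impl) {
        views.put(type, impl);
    }

    // Null means "this format does not support that view" - the coarse
    // alternative to per-method Operation Not Supported Exceptions.
    @SuppressWarnings("unchecked")
    public <T extends View> T getView(Class<T> type) {
        return (T) views.get(type);
    }
}
```

A PHD reader, for instance, would register only the heap view, and asking it for the process view would return null.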
>
>
> 4 Data access models
>
> In designing the data access models, care should be taken over the
> fact that the API may have to deal with dumps whose size is vastly
> greater than the available memory. Therefore – and this holds especially
> for Java objects in the heap view – creating all the objects in memory
> at the moment the dump is opened may not be a good idea.
> In this context, user stories will dictate the way data is accessed
> from the dump. If it turns out that heap browsing starting from
> the roots, or dominator tree browsing, are major use cases, for
> example, it makes sense to load the children of a node
> object lazily at the moment the parent object is first displayed, and
> not any earlier. A first summary of the ways of accessing objects
> could be the following:
>
> 4.1 retrieve the Java object located at a given address in memory (if
> memory is available in the dump, i.e. if the dump supports a Process
> view);
> 4.2 retrieve all the heap root objects;
> 4.3 for any given object, retrieve the objects referenced by it;
> 4.4 retrieve all objects satisfying a given query (e.g. all objects of
> class java.lang.String, or all objects of any class having a field
> named "value"). This will involve having a query language of some form
> built into the API.
>
> (more to figure out...)
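
As a sketch of 4.4 under the simplest possible assumptions, a "query" could start life as just a predicate applied while streaming objects out of the heap, so the full object population never has to be materialised up front (per the memory concern in 4); the types here are made up:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Made-up types: HeapObject stands in for whatever the heap view exposes.
public class HeapQuerySketch {
    public static class HeapObject {
        private final String className;
        public HeapObject(String className) { this.className = className; }
        public String getClassName() { return className; }
    }

    // Streams over the heap iterator and keeps only the matches, so the
    // whole heap is never resident at once - only the result set is.
    public static List<HeapObject> query(Iterator<HeapObject> heap,
                                         Predicate<HeapObject> wanted) {
        List<HeapObject> matches = new ArrayList<>();
        while (heap.hasNext()) {
            HeapObject o = heap.next();
            if (wanted.test(o)) {
                matches.add(o);
            }
        }
        return matches;
    }
}
```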
>
>
> Please feel free to share your comments about all the items above, and
> to add more....
>
>
>
>       Carmine
>
