incubator-kato-spec mailing list archives

From Sonal Goyal <sonalgoy...@gmail.com>
Subject Re: Design guidelines
Date Wed, 04 Mar 2009 09:23:02 GMT
I would like to add one thing here. The API should be designed so that
processing is separated from presentation. Take the case of a service
engineer who is working remotely. He is unable to download the dump (due to
file size) from the client machine but is given remote access to the
machine. He may not be in a position to use our current set of tools, but
he would benefit greatly if he could deploy a small program on the client
machine which uses our APIs and provides answers to his questions in a
text format which he can download.
Thanks and Regards,
Sonal


On Thu, Feb 19, 2009 at 5:16 PM, Adam Pilkington <
pilkington.adam@googlemail.com> wrote:

> I would like to add another area for consideration to this list, and that
> is
> the approach that we should use for logging. I think that we should be
> looking to define the following
>
>   - The logging package to use - is this going to be java.util.logging,
>   or do we want to use Log4j or something else?
>   - The namespace conventions, so that the logging output from one
>   component does not become corrupted by that of other components.
>   - Guidance on the logging levels to be used, and the types of events
>   that belong at each level. This should mean that if we turn on just
>   error logging we are not swamped with messages unrelated to errors.
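> As a concrete illustration of the last two points, here is a
> java.util.logging sketch (the org.apache.kato.* logger names are my
> assumption, not an agreed convention): each component logs under its own
> namespace, and levels can be tuned per component.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LoggingSketch {
    // Hypothetical per-component namespaces; the real package names
    // for Kato components have not been agreed yet.
    static final Logger DUMP_LOG = Logger.getLogger("org.apache.kato.dump");
    static final Logger API_LOG  = Logger.getLogger("org.apache.kato.api");

    public static void main(String[] args) {
        // Turning one component down to SEVERE leaves the other untouched,
        // so error-only logging is not swamped by unrelated messages.
        DUMP_LOG.setLevel(Level.SEVERE);
        API_LOG.setLevel(Level.INFO);

        DUMP_LOG.info("suppressed: below this component's threshold");
        DUMP_LOG.severe("shown: a real error from the dump reader");
        API_LOG.info("shown: informational message from another component");
    }
}
```

> With this layout, setting one logger to SEVERE silences that component's
> informational chatter without affecting any other component's output.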
>
>
>
> 2009/2/17 Adam Pilkington <pilkington.adam@googlemail.com>
>
> > Hi, I would like to add the following comments to this,
> >
> > Section 2.6 : I don't necessarily believe that checked exceptions are the
> > best way to handle this. I think that runtime exceptions should be thrown
> > when the user does not have a realistic chance to correct the underlying
> > cause of the error. Checked exceptions should be used when there is a
> > chance to correct the underlying cause of the exception, or to take some
> > other form of action to allow the operation to continue. I also believe
> > that exceptions are bounded by the architecture layer in which they
> > occur. For example, I was discussing this with Steve: if we look at the
> > data access layer, then almost all operations on the dump can throw an
> > IOException. I don't believe that this should be bubbled up through the
> > layers, but handled by the data access layer. This could take the form of
> > wrapping the original exception in a runtime or other checked exception
> > and then re-throwing that. One advantage of this approach is that it
> > allows additional information to be conveyed with the exception, i.e. not
> > just that you tried to read beyond the end of the file, but details of
> > the context in which the operation was being carried out.
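> > A minimal sketch of that wrapping approach (DumpReadException, its
> > message layout and readRecord are hypothetical names, not part of any
> > agreed API):

```java
import java.io.IOException;

// Hypothetical exception type: wraps the low-level IOException and adds
// the context in which the data-access layer was reading the dump.
public class DumpReadException extends RuntimeException {
    public DumpReadException(String context, long offset, IOException cause) {
        super(context + " (at dump offset 0x" + Long.toHexString(offset) + ")", cause);
    }

    // Sketch of a data-access-layer method that never lets a raw
    // IOException bubble up to the API user.
    static byte[] readRecord(long offset) {
        try {
            // Stand-in for a real seek-and-read against the dump file.
            throw new IOException("read past end of file");
        } catch (IOException e) {
            throw new DumpReadException("reading thread record", offset, e);
        }
    }
}
```

> > The original IOException stays reachable via getCause(), while the
> > message carries the layer-level context the raw exception lacked.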
> >
> > I also think that this is an issue that cannot be resolved through API
> > design alone, but is part of the reference implementation that we
> > publish. I would allow Kato to throw almost exclusively runtime
> > exceptions, but implement an error handling strategy in the RI, which
> > could be as simple as logging the message to a console window, or more
> > sophisticated, such as interpreting the error and presenting an option to
> > the user to fix it. That doesn't mean that classes cannot catch these
> > runtime exceptions; it just means that they will only do so if they
> > intend to either add more information or transform the error again. I
> > think that this will make the code a lot easier to read and is a better
> > choice than either checked exceptions or error number checking.
> >
> > Section 4.0 : For the data access models, I think that the following
> > should also be added for consideration in the layer that directly
> > accesses the dump:
> >
> >         - Established patterns for iterating over lists
> >         - Marking reference points within the dump to provide jump
> >           capabilities to known sections and areas
> >         - Most recently used caches
> >         - Most frequently used caches
> >         - Intelligent block reading of data
> >
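> > For the caching items above, a most-recently-used cache falls out of
> > java.util.LinkedHashMap's access-order mode almost for free; a minimal
> > sketch (the record-by-address layout is my assumption):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of a most-recently-used cache for dump records,
// assuming records are looked up by their address in the dump.
public class MruRecordCache extends LinkedHashMap<Long, byte[]> {
    private final int capacity;

    public MruRecordCache(int capacity) {
        super(16, 0.75f, true); // accessOrder=true keeps recent entries last
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
        return size() > capacity; // evict the least recently used record
    }
}
```

> > A most-frequently-used cache would need its own hit counters, but the
> > shape of the component would be much the same.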
> >
> > 2009/2/17 Carmine Cristallo <carminecristallo@gmail.com>
> >
> >> The main purpose of this email is to outline some of the design
> >> considerations to be taken into account during the development of the
> >> API, and to stimulate discussion about them.
> >>
> >> Some of the following sections will be better understood after a quick
> >> look at the IBM DTFJ API, which will constitute the seed of the IBM
> >> contribution to the Apache Kato project. Such sections will be clearly
> >> marked.
> >>
> >> 1 General Principles
> >>
> >> The following principles could be used as overall quality criteria for
> >> the design of the API.
> >>
> >> 1.1 User Story driven: the design of the API will be driven by user
> >> stories. As a general statement, no information should be exposed if
> >> there is no user story justifying the need for it.
> >>
> >> 1.2 Consumability: users of the API should be able to easily
> >> understand how to use the API from the API description and from common
> >> repeated design patterns. The amount of boilerplate code necessary to
> >> get at any useful information needs to be monitored. The user stories
> >> supporting the API will aid in keeping the boilerplate down, but it's
> >> important to state that the more understandable the API is, the easier
> >> its adoption will be.
> >>
> >> 1.3 Consistency: common guidelines and patterns should be followed
> >> when designing the API. For example, all the calls returning multiple
> >> values must do so in a common way (e.g. Lists, Iterators or arrays).
> >>
> >> 1.4 Common tasks should be easy to implement: care should be taken to
> >> design the API in such a way that common user stories have a simple
> >> implementation scenario. For example, in the DTFJ API, in most cases
> >> there will be only one JavaRuntime per Image, and there should be a
> >> more direct way of getting it than iterating through the AddressSpaces
> >> and Processes.
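> >> To make that point concrete, the boilerplate for the common case might
> >> look like the sketch below. The interfaces are stand-ins for the
> >> DTFJ-style Image/AddressSpace/Process hierarchy, not the agreed API
> >> (KatoProcess is so named only to avoid clashing with java.lang.Process):

```java
import java.util.List;

// Hypothetical stand-ins for the DTFJ-style hierarchy; the interface
// and method names here are assumptions, not the agreed API.
interface JavaRuntime {}
interface KatoProcess { List<JavaRuntime> getRuntimes(); }
interface AddressSpace { List<KatoProcess> getProcesses(); }
interface Image { List<AddressSpace> getAddressSpaces(); }

public class RuntimeLookup {
    // The boilerplate the common case currently requires: three nested
    // loops to find what is almost always the single runtime in the dump.
    static JavaRuntime firstRuntime(Image image) {
        for (AddressSpace as : image.getAddressSpaces())
            for (KatoProcess p : as.getProcesses())
                for (JavaRuntime rt : p.getRuntimes())
                    return rt;
        return null;
    }
}
```

> >> A convenience accessor on Image (a hypothetical getFirstRuntime(), say)
> >> would collapse those three loops into a single call.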
> >>
> >> 1.5 Backward compatibility: the client code written against a given
> >> release of the API should remain source code compatible with any
> >> future release of the API.
> >>
> >>
> >>
> >> 2 Exception handling model
> >>
> >> In the domain of postmortem analysis, the following types of
> >> exceptions can occur:
> >>
> >> 2.1 File Access Exceptions: reported when an error occurs opening,
> >> seeking, reading, writing or closing any of the files which constitute
> >> the dump or are generated as a result of processing the dump.
> >> Applications are expected to respond to this type of exception by
> >> informing their users that they should correct the problem with their
> >> file system (e.g. getting the file name right).
> >>
> >> 2.2 File Format Exceptions: reported when data is correctly read from
> >> an encoded file, but that data is not compatible with the encoding
> >> rules or syntax. Applications are supposed to respond to these
> >> exceptions by informing their users that the file is corrupt and
> >> further processing is impossible.
> >>
> >> 2.3 Operation Not Supported Exceptions: the type of dump file being
> >> analysed does not support the invoked API.
> >>
> >> 2.4 Memory Access Exceptions: thrown when the address and length of a
> >> data read request does not lie entirely within one of the valid
> >> address ranges.
> >>
> >> 2.5 Corrupt Data Exceptions: reported when data is correctly read from
> >> a file but it has a value incompatible with its nature. Corruption is
> >> to be considered as a normal event in processing postmortem dumps,
> >> therefore such exceptions are not to be treated as error conditions.
> >>
> >> 2.6 Data Not Available Exceptions: reported when the requested
> >> information is not contained within the specific dump being analysed.
> >> As for the previous case, this is not to be seen as an error
> >> condition.
> >>
> >> Exception handling in DTFJ is a major source of struggle. Almost every
> >> call to the DTFJ API throws one exception of the last two types, or
> >> both. There's no question about the fact that such events are
> >> definitely better handled with checked exceptions rather than with
> >> unchecked ones. On the other hand, the fact that some objects
> >> retrieved from a dump can be corrupt or unavailable is a condition
> >> intrinsic to every API call. Handling such conditions with
> >> checked exceptions would put the burden of handling them onto the
> >> client code, leading to almost every API call being wrapped by a
> >> try/catch block. As a side effect, it has been noted from past
> >> experience that in such situations client code tends to take a form
> >> like:
> >>
> >> public void clientMethod1() {
> >>      try {
> >>              katoObject1.methodA();
> >>              katoObject2.methodB();
> >>              katoObject3.methodC();
> >>      } catch (KatoException ke) {
> >>              ...
> >>      }
> >> }
> >>
> >> rather than:
> >>
> >> public void clientMethod2() {
> >>      try {
> >>              katoObject1.methodA();
> >>      } catch (KatoException ke1) {
> >>              ...
> >>      }
> >>      try {
> >>              katoObject2.methodB();
> >>      } catch (KatoException ke2) {
> >>              ...
> >>      }
> >>      try {
> >>              katoObject3.methodC();
> >>      } catch (KatoException ke3) {
> >>              ...
> >>      }
> >> }
> >> and this can lead to poor debuggability of the client code.
> >>
> >> It is also true that in very few cases will the client code need to
> >> implement different behaviours for the Data Unavailable and the
> >> Corrupt Data cases: most of the time, they will be treated in the same
> >> way, and the corrupt data, when available, will just be ignored. It
> >> would make sense, therefore, to group the two cases under a single
> >> name: let's therefore define "Invalid Data" as a situation where
> >> either the data is not available or it is corrupted. So the key
> >> questions become: "Does it make sense to think of a way to handle the
> >> Invalid Data case without the use of exceptions? If yes, how?"
> >>
> >> One possible solution to this problem could be to reserve the null
> >> return value, in every API call, for Invalid Data: an API call returns
> >> null if and only if the data being requested is either unavailable or
> >> corrupted. To discriminate between the two cases, the client code could
> >> call a specific errno-like API which returns the corrupt data of the
> >> latest API call, or null if the data was unavailable. Most of the time,
> >> the client code would therefore look similar to:
> >>
> >> public void clientMethod1() {
> >>      KatoThing value;
> >>      value = katoObject1.methodA();
> >>      if (value == null) {
> >>              // handle the invalid data
> >>      }
> >> }
> >>
> >> although, in a small number of cases, the code might be more similar to
> >> this:
> >>
> >> public void clientMethod2() {
> >>      KatoThing value;
> >>      value = katoObject1.methodA();
> >>      if (value == null) {
> >>              CorruptData cd = KatoHelper.getLastCorruptData();
> >>              if (cd == null) {
> >>                      // handle the data unavailable case
> >>              } else {
> >>                      // handle the corrupt data case
> >>              }
> >>      }
> >> }
> >>
> >> As a side effect, this solution would imply that primitive types
> >> cannot be used as return values, and their corresponding object
> >> wrappers would need to be used instead.
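> >> A trivial sketch of that implication (the names here are illustrative
> >> only): a primitive int can never signal Invalid Data, so the API would
> >> have to return Integer instead and reserve null.

```java
// Illustrative only: a return type that reserves null for Invalid Data.
// A primitive int could never do this; Integer can.
public class WrapperReturn {
    private final Integer threadCount; // null means Invalid Data

    public WrapperReturn(Integer threadCount) {
        this.threadCount = threadCount;
    }

    // Clients must check for null before unboxing the value.
    public Integer getThreadCount() {
        return threadCount;
    }
}
```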
> >>
> >>
> >> 3.0 Optionality
> >>
> >> The Kato API will be designed to support different types of dump
> >> formats. Examples are HPROF for SUN VMs; system dumps, Javacores and
> >> PHD for IBM VMs; and so on.
> >>
> >> Different dump formats expose different information, so if we design
> >> the API as a monolithic block, there will be cases in which some parts
> >> of it – more or less large, depending on the dump format – may not be
> >> implemented.
> >> Although the "Operation Not Supported Exception" case described above
> >> does provide some support for these cases, we certainly need a better
> >> mechanism to support optionality.
> >> One possible solution lies in the observation that we don't really
> >> need to design for optionality at method level: normally, dump
> >> formats tend to focus on one or more "views" of the process that
> >> generates them. Examples of these views are:
> >>
> >> 3.1 Process view: formats that support this view expose information
> >> like command line, environment, native threads and locks, stack
> >> frames, loaded libraries, memory layout, symbols, etc. System dumps
> >> normally expose nearly all of this data.
> >>
> >> 3.2 Java Runtime view: formats supporting this view expose information
> >> like VM args, Java Threads, Java Monitors, classloaders, heaps, heap
> >> roots, compiled methods, etc. HPROF is an example of a format that
> >> supports this view.
> >>
> >> 3.3 Java Heap view: formats supporting this view expose Java classes,
> >> objects and their relationships. IBM PHD is an example of a dump
> >> format supporting this view, as well as SUN's HPROF.
> >> supporting this view, as well as SUN's HPROF.
> >>
> >> The API should be designed so that a given file format can support
> >> one or more of these views, as well as allowing new views to be
> >> plugged in. Inside each view, it could be reasonable to provide a
> >> further level of granularity involving Operation Not Supported
> >> Exceptions.
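> >> One way to express the views idea is a capability interface per view,
> >> discovered with instanceof rather than by catching Operation Not
> >> Supported exceptions on every call; a rough sketch (all names
> >> hypothetical):

```java
// Sketch of per-view optionality, assuming each "view" is a separate
// interface that a dump reader may or may not implement; the interface
// and method names are illustrative, not the agreed API.
interface ProcessView { String getCommandLine(); }
interface JavaHeapView { int getObjectCount(); }

public class ViewProbe {
    // A PHD-style reader might expose only the Java Heap view.
    public static class PhdDump implements JavaHeapView {
        public int getObjectCount() { return 42; } // stub value
    }

    // Client code discovers capabilities up front instead of catching
    // OperationNotSupported exceptions method by method.
    static boolean supportsProcessView(Object dump) {
        return dump instanceof ProcessView;
    }
}
```

> >> New views can then be plugged in simply by defining a new interface,
> >> without touching the existing ones.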
> >>
> >>
> >> 4 Data access models
> >>
> >> In designing the data access models, care should be taken over the
> >> fact that the API may have to deal with dumps whose size is vastly
> >> greater than available memory. Therefore – and this holds especially
> >> for Java objects in the heap view – creating all the objects in memory
> >> at the moment the dump is opened may not be a good idea.
> >> In this context, user stories will dictate the way data is accessed
> >> from the dump. If it turns out that heap browsing starting from
> >> the roots, or dominator tree browsing, will be major use cases, for
> >> example, it makes sense to think about loading the children of a node
> >> object lazily at the moment the parent object is first displayed, and
> >> not any earlier. A first summary of the ways of accessing objects
> >> could be the following:
> >>
> >> 4.1 retrieve the Java object located at a given address in memory (if
> >> memory is available in the dump, i.e. if the dump supports a Process
> >> view);
> >> 4.2 retrieve all the heap root objects;
> >> 4.3 for any given object, retrieve the objects referenced by it;
> >> 4.4 retrieve all objects satisfying a given query (e.g. all objects of
> >> class java.lang.String, or all objects of any class having a field
> >> named "value"). This will involve having a query language of some form
> >> built into the API.
> >>
> >> (more to figure out...)
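> >> The lazy child loading mentioned above could be sketched as follows
> >> (HeapNode and the Supplier-based resolver are my assumptions, not the
> >> agreed API):

```java
import java.util.List;
import java.util.function.Supplier;

// Minimal sketch of lazily loading an object's referenced children,
// assuming a resolver that reads references from the dump on demand.
public class HeapNode {
    private final long address;
    private final Supplier<List<HeapNode>> resolver;
    private List<HeapNode> children; // not read until first requested

    public HeapNode(long address, Supplier<List<HeapNode>> resolver) {
        this.address = address;
        this.resolver = resolver;
    }

    public List<HeapNode> getChildren() {
        if (children == null) {
            children = resolver.get(); // single dump read, on first display
        }
        return children;
    }

    public long getAddress() { return address; }
}
```

> >> The dump is touched only when a node's children are first displayed,
> >> and never again for that node, which keeps memory proportional to the
> >> part of the heap actually browsed.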
> >>
> >>
> >> Please feel free to share your comments about all the items above, and
> >> to add more....
> >>
> >>
> >>
> >>       Carmine
> >>
> >
> >
> >
> > --
> > Regards
> >
> > Adam Pilkington
> >
>
>
>
> --
> Regards
>
> Adam Pilkington
>
