Author: luc
Date: Wed Apr 23 14:37:08 2008
New Revision: 651074
URL: http://svn.apache.org/viewvc?rev=651074&view=rev
Log:
improved documentation
the developers-oriented documentation has been started
Modified:
commons/sandbox/nabla/trunk/src/site/xdoc/index.xml
commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml
commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml
Modified: commons/sandbox/nabla/trunk/src/site/xdoc/index.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/index.xml?rev=651074&r1=651073&r2=651074&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/index.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/index.xml Wed Apr 23 14:37:08 2008
@@ -24,7 +24,7 @@
<body>
- <section name="Introduction" href="introduction">
+ <section name="Introduction">
<p>
Nabla is an automatic differentiator for mathematical functions.
Just like the mathematical Nabla operator transforms a function
@@ -72,7 +72,7 @@
</section>
- <section name="Example" href="example">
+ <section name="Example">
<p>
The following example should explain better what Nabla can do
@@ -157,7 +157,7 @@
</section>
- <section name="How ?" href="how">
+ <section name="How ?">
<p>
The previous example shows that Nabla creates an object that
Modified: commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml?rev=651074&r1=651073&r2=651074&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml Wed Apr 23 14:37:08 2008
@@ -23,31 +23,162 @@
</properties>
<body>
- <section name="Principles" href="principles">
- <subsection name="Symbolic differentiation" href="symbolicdifferentiation">
- </subsection>
- <subsection name="Validity" href="validity">
- </subsection>
- <subsection name="Virtual machine execution model" href="jvm">
- <subsubsection name="Frame" href="frame">
- </subsubsection>
- <subsubsection name="Bytecode instructions" href="instructions">
- </subsubsection>
- </subsection>
+ <section name="Principles">
+
+ <subsection name="Symbolic differentiation">
+ <p>
+ Nabla computes the derivatives using symbolic differentiation. Considering
+ a typical call to a univariate function:
+ <pre><code>double r = f(t);</code></pre>
+ Nabla tracks the data flow that leads from the <code>t</code> parameter to the
+ <code>r</code> result to understand how this result is computed. All the
+ operations leading from <code>t</code> to <code>r</code> belong to a small set
+ of instructions corresponding to the capabilities of the virtual machine. This
+ set contains basic arithmetic operations (addition, subtraction ...), conversion
+ operations (double to int, long to double ...), storage instructions (local
+ variables, function parameters, instance or class fields ...) and calls to
+ elementary functions defined in the <code>Math</code> and <code>StrictMath</code>
+ classes. There is really nothing more!
+ </p>
+ <p>
+ For each of these basic instructions, we know how to map it to a
+ mathematical equation and we can combine this equation with its derivative to
+ form a pair of equations we will use later. For example, a <code>DADD</code>
+ bytecode instruction corresponds to the addition of two real numbers and
+ produces a third number which is their sum. So we map the instruction to the
+ equation:
+ <pre><code>c=a+b</code></pre>
+ and we combine this with its derivative to form the pair:
+ <pre><code>(c=a+b, c'=a'+b')</code></pre>
+ In this example, we have simply used the linearity property of differentiation
+ which implies that the derivative of a sum is the sum of the derivatives.
+ Similar rules exist for all arithmetic instructions, and the derivatives of all
+ basic functions in the <code>Math</code> and <code>StrictMath</code> classes are known.
+ The complete rule set is described in the <a
+ href="#Differentiation rules">Differentiation rules</a> section below.
+ </p>
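As a rough illustration of the instruction-to-pair mapping described above, the following Java sketch models a (value, derivative) pair and applies the DADD rule together with two analogous rules. The class and method names (DifferentialPairSketch, add, multiply, sin) are hypothetical and are not Nabla's API; the multiplication and sine rules are the standard calculus rules, assumed here for illustration rather than taken from Nabla's rule set.

```java
/** Hypothetical (value, derivative) pair; illustration only, not Nabla's API. */
final class DifferentialPairSketch {
    final double value;
    final double derivative;

    DifferentialPairSketch(double value, double derivative) {
        this.value = value;
        this.derivative = derivative;
    }

    /** DADD: (c = a + b, c' = a' + b'). */
    static DifferentialPairSketch add(DifferentialPairSketch a, DifferentialPairSketch b) {
        return new DifferentialPairSketch(a.value + b.value,
                                          a.derivative + b.derivative);
    }

    /** DMUL (assumed rule): (c = a * b, c' = a' * b + a * b'). */
    static DifferentialPairSketch multiply(DifferentialPairSketch a, DifferentialPairSketch b) {
        return new DifferentialPairSketch(a.value * b.value,
                                          a.derivative * b.value + a.value * b.derivative);
    }

    /** Math.sin (assumed rule): (c = sin(a), c' = a' * cos(a)). */
    static DifferentialPairSketch sin(DifferentialPairSketch a) {
        return new DifferentialPairSketch(Math.sin(a.value),
                                          a.derivative * Math.cos(a.value));
    }

    public static void main(String[] args) {
        DifferentialPairSketch a = new DifferentialPairSketch(2.0, 1.0);
        DifferentialPairSketch b = new DifferentialPairSketch(3.0, 0.5);
        DifferentialPairSketch c = add(a, b);
        System.out.println("c = " + c.value + ", c' = " + c.derivative);
    }
}
```

Each method mirrors one pair of equations: the first component repeats the original operation, the second applies the matching differentiation rule.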
+ <p>
+ So the original computation path from <code>t</code> to <code>r</code> can be
+ expanded by conceptually replacing all single equations that constitute the
+ code with pairs of equations. The first element of each pair is a simple copy
+ of the original equation. The original computation path was fed by a
+ single double value (the function parameter <code>t</code>), but the expanded
+ computation path needs a pair of values, one for each element in the equation
+ pair. The first element of the pair of input values will be the value of the
+ parameter, and the second will be the <em>derivative of the parameter with respect
+ to the free variable</em>. This means that in our case, if we want to compute the
+ derivative with respect to <code>t</code>, the second element will be the
+ derivative of <code>t</code> with respect to itself, which is simply the constant 1.
+ In this case, we need to feed the computation path with the pair <code>(t, 1)</code>.
+ </p>
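The expansion described above can be sketched by hand for a small function. The example below assumes f(t) = t * t + sin(t), a function chosen only for illustration; the class ExpandedPathSketch and the method fExpanded are hypothetical names, and the expanded code is written manually rather than generated by Nabla.

```java
/** Hand-expanded version of f(t) = t * t + sin(t); illustration only. */
final class ExpandedPathSketch {

    /** Original computation path, fed by a single double. */
    static double f(double t) {
        double a = t * t;
        double b = Math.sin(t);
        return a + b;
    }

    /**
     * Expanded computation path, fed by the pair (t, dt).
     * Returns {r, r'}: each original equation is kept and paired
     * with its derivative.
     */
    static double[] fExpanded(double t, double dt) {
        double a  = t * t;
        double da = dt * t + t * dt;      // product rule
        double b  = Math.sin(t);
        double db = dt * Math.cos(t);     // chain rule
        return new double[] { a + b, da + db };
    }

    public static void main(String[] args) {
        double t = 1.5;
        // Feed the pair (t, 1): t is the free variable, so dt/dt = 1.
        double[] pair = fExpanded(t, 1.0);
        System.out.println("r  = " + pair[0]);
        System.out.println("r' = " + pair[1]);
    }
}
```

The first component of the result is computed by the same operations as the original f, so the value is unchanged by the expansion.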
+ <p>
+ Without changing anything in the analysis or in the expansion of the computation path,
+ we can also handle the case where <code>t</code> is not an independent variable but is
+ itself a function of another free variable, i.e. a case where:
+ <pre><code>t = g(x)</code>, with <code>dt/dx = g'(x)</code></pre>
+ In this case, we would feed the computation path with the pair <code>(g(x), g'(x))</code>
+ instead of the pair <code>(t, 1)</code>. This makes it possible to handle function composition,
+ including recursive calls.
+ </p>
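A minimal sketch of the composition case, assuming f(t) = sin(t) and g(x) = x * x as illustrative functions: the pair (g(x), g'(x)) is fed into the expanded f, and the second component of the result is then the derivative with respect to x. All names here (CompositionSketch, fExpanded, gExpanded, composed) are hypothetical and hand-written, not produced by Nabla.

```java
/** Composition of two hand-expanded functions; illustration only. */
final class CompositionSketch {

    /** Expanded f(t) = sin(t): maps the pair (t, dt) to {f, f'}. */
    static double[] fExpanded(double t, double dt) {
        return new double[] { Math.sin(t), dt * Math.cos(t) };
    }

    /** Expanded g(x) = x * x: maps the pair (x, dx) to {g, g'}. */
    static double[] gExpanded(double x, double dx) {
        return new double[] { x * x, dx * x + x * dx };
    }

    /** Returns {f(g(x)), d/dx f(g(x))}. */
    static double[] composed(double x) {
        // x is the free variable here, so it is fed as the pair (x, 1).
        double[] g = gExpanded(x, 1.0);
        // t is no longer independent: feed (g(x), g'(x)) instead of (t, 1).
        return fExpanded(g[0], g[1]);
    }

    public static void main(String[] args) {
        double[] r = composed(0.5);
        System.out.println("f(g(x))       = " + r[0]);
        System.out.println("d/dx f(g(x))  = " + r[1]);
    }
}
```

Neither expanded function had to change to support the composition; only the input pair differs.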
+ </subsection>
+
+ <subsection name="Validity">
+ <p>
+ What is the validity of this approach?
+ </p>
+ <p>
+ For straightforward smooth functions, the expanded code really computes both the
+ value of the equation and its exact derivative. This is a simple application of
+ the differentiation rules. So the accuracy of the derivative will be on a par with
+ the accuracy of the initial function. If the initial function is a good model
+ of a physical process, the derivative will be a good evaluation of its evolution.
+ If the initial function is only an approximation, the derivative will be an
+ approximation too, but <em>an approximation that is consistent with the initial
+ function up to computer accuracy</em>.
+ </p>
+ <p>
+ As soon as the initial function is not smooth, then some design choices are
+ involved which have an impact on validity. These choices have been made in
+ such a way that in some sense, the result is still as valid, as accurate and as
+ consistent with the initial function as for smooth functions.
+ </p>
+ <p>
+ First of all, what are non-smooth functions in our model, which is based on the
+ operations and functions available in the Java virtual machine? These are either
+ functions that involve calls to non-smooth functions of the <code>Math</code> and
+ <code>StrictMath</code> classes near their singularity points (for example
+ <code>Math.abs</code>, <code>Math.sqrt</code> or <code>Math.log</code> near zero),
+ or functions for which the computation path includes conditional branches
+ involving computed double parameters (for example convergence loops or
+ <code>if</code>/<code>else</code> statements). This includes <em>neither</em>
+ functions that use only unconditional jumps <em>nor</em> loops with a number of iterations
+ not related to a computed double parameter. Such code could theoretically be
+ reorganized and loops unrolled to produce a (perhaps huge) straightforward smooth
+ computation path.
+ </p>
+ <p>
+ For singularities corresponding to domain definition boundaries (like
+ <code>Math.sqrt</code> and <code>Math.log</code> which cannot be computed for
+ negative parameters), the theoretical derivative is defined only on the side of the
+ singularity where the function itself is defined. The value of this half-derivative
+ is the limit value of the derivative when approaching the singularity. Since for these
+ functions we use the expression of the derivative that is valid where the function
+ is valid, our computation is consistent with the theoretical mathematical definition.
+ For example, for both <code>Math.sqrt</code> and <code>Math.log</code>, the
+ derivative is infinite at zero, with the proper sign according to the sign of the
+ input derivative.
+ </p>
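The behaviour at a domain boundary can be sketched with ordinary IEEE 754 arithmetic. The formulas below are the standard derivatives of the square root and the logarithm, assumed here for illustration; the class BoundarySingularitySketch and its methods are hypothetical names, not Nabla's generated code.

```java
/** Derivative pairs for sqrt and log near zero; illustration only. */
final class BoundarySingularitySketch {

    /** (c = sqrt(a), c' = a' / (2 * sqrt(a))); returns {c, c'}. */
    static double[] sqrtExpanded(double a, double da) {
        double r = Math.sqrt(a);
        return new double[] { r, da / (2.0 * r) };
    }

    /** (c = log(a), c' = a' / a); returns {c, c'}. */
    static double[] logExpanded(double a, double da) {
        return new double[] { Math.log(a), da / a };
    }

    public static void main(String[] args) {
        // At a = +0.0 the division by zero yields an infinity whose
        // sign follows the sign of the input derivative da.
        System.out.println(sqrtExpanded(0.0,  1.0)[1]);
        System.out.println(sqrtExpanded(0.0, -1.0)[1]);
        System.out.println(logExpanded(0.0,   1.0)[1]);
    }
}
```

In this sketch a zero input derivative at a = 0 gives 0/0, which is NaN; the sketch does not treat that case or a negative zero input specially.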
+ <p>
+ For singularities not related to domain definition boundaries (like
+ <code>Math.abs</code> and conditional branches), the theoretical derivative is not
+ defined as a single value, but as a pair of left and right half-derivatives, one for
+ each side of the singularity. Since there is little support in the IEEE 754 standard
+ to distinguish the left and right hand sides of a single value (except for zero, since
+ -0 and +0 both exist), we have decided to adopt a simplified approach. These cases are
+ implemented by simple conditional branches (we explicitly added such a conditional in the
+ <code>Math.abs</code> case). Nabla then simply computes the value of the smooth
+ derivative on the branch of the computation path that is selected at run time, depending
+ on the values of the input parameters. This choice preserves the property of
+ having a derivative that is always consistent with the associated value, and it is a simple
+ arbitrary choice of one of the two possibilities that correspond to the mathematical result,
+ which by itself does not choose between them.
+ </p>
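The branch-based treatment of Math.abs described above might look like the following sketch. The class AbsBranchSketch and the method absExpanded are hypothetical, and the choice of the non-negative branch at exactly zero is an assumption made for this example; the text does not say which branch Nabla selects there.

```java
/** Math.abs expressed as an explicit conditional branch; illustration only. */
final class AbsBranchSketch {

    /**
     * Returns {|a|, d|a|}. The derivative is the smooth derivative of
     * whichever branch is selected at run time.
     */
    static double[] absExpanded(double a, double da) {
        if (a >= 0) {
            // Branch c = a, so c' = a'. Assumption: zero takes this branch.
            return new double[] { a, da };
        } else {
            // Branch c = -a, so c' = -a'.
            return new double[] { -a, -da };
        }
    }

    public static void main(String[] args) {
        double[] left  = absExpanded(-2.0, 1.0);
        double[] right = absExpanded(3.0, 1.0);
        System.out.println("left:  " + left[0] + ", " + left[1]);
        System.out.println("right: " + right[0] + ", " + right[1]);
    }
}
```

Whichever branch runs, the returned derivative matches the returned value, which is the consistency property the paragraph above describes. This sketch does not distinguish -0 from +0.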
+ </subsection>
+
+ <subsection name="Virtual machine execution model">
+ <subsubsection name="Frame">
+ </subsubsection>
+
+ <subsubsection name="Bytecode instructions">
+ </subsubsection>
+ </subsection>
+
</section>
- <section name="Implementation" href="implementation">
- <subsection name="Differential pairs" href="differentialpairs">
- </subsection>
- <subsection name="Bytecode transforms" href="bytecodetransforms">
- </subsection>
- <subsection name="Complete differentiation example" href="example">
- </subsection>
+
+ <section name="Implementation">
+
+ <subsection name="Differential pairs">
+ </subsection>
+
+ <subsection name="Bytecode transforms">
+ </subsection>
+
+ <subsection name="Differentiation rules">
+ </subsection>
+
+ <subsection name="Complete differentiation example">
+ </subsection>
+
</section>
- <section name="Issues" href="issues">
- <subsection name="Singularities handling" href="singularities">
- </subsection>
- <subsection name="Data flow and control flow analysis" href="flowsanalysis">
- </subsection>
+
+ <section name="Issues">
+
+ <subsection name="Singularities handling">
+ </subsection>
+
+ <subsection name="Data flow and control flow analysis">
+ </subsection>
+
</section>
</body>
</document>
Modified: commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml?rev=651074&r1=651073&r2=651074&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml Wed Apr 23 14:37:08 2008
@@ -23,7 +23,7 @@
</properties>
<body>
- <section name="Public API" href="api">
+ <section name="Public API">
<p>
Nabla public interface is very small, it is composed of only three
interfaces and two classes.
@@ -85,7 +85,7 @@
</section>
- <section name="Updating the base and derived objects" href="updating">
+ <section name="Updating the base and derived objects">
<p>
One important thing to note is a consequence of the fact that the
