commons-commits mailing list archives

Subject svn commit: r651074 - in /commons/sandbox/nabla/trunk/src/site/xdoc: index.xml internals.xml usage.xml
Date Wed, 23 Apr 2008 21:37:09 GMT
Author: luc
Date: Wed Apr 23 14:37:08 2008
New Revision: 651074

improved documentation
the developers-oriented documentation has been started


Modified: commons/sandbox/nabla/trunk/src/site/xdoc/index.xml
--- commons/sandbox/nabla/trunk/src/site/xdoc/index.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/index.xml Wed Apr 23 14:37:08 2008
@@ -24,7 +24,7 @@
-    <section name="Introduction" href="introduction">
+    <section name="Introduction">
         Nabla is an automatic differentiator for mathematical functions.
         Just like the mathematical Nabla operator transforms a function
@@ -72,7 +72,7 @@
-    <section name="Example" href="example">
+    <section name="Example">
         The following example should explain better what Nabla can do
@@ -157,7 +157,7 @@
-    <section name="How ?" href="how">
+    <section name="How ?">
         The previous example shows that Nabla creates an object that

Modified: commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml
--- commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml Wed Apr 23 14:37:08 2008
@@ -23,31 +23,162 @@
-    <section name="Principles" href="principles">
-        <subsection name="Symbolic differentiation" href="symbolic-differentiation">
-        </subsection>
-        <subsection name="Validity" href="validity">
-        </subsection>
-        <subsection name="Virtual machine execution model" href="jvm">
-            <subsubsection name="Frame" href="frame">
-            </subsubsection>
-            <subsubsection name="Bytecode instructions" href="instructions">
-            </subsubsection>
-        </subsection>
+    <section name="Principles">
+      <subsection name="Symbolic differentiation">
+        <p>
+          Nabla computes the derivatives using symbolic differentiation. Considering
+          a typical call to a univariate function:
+          <pre><code>double r = f(t);</code></pre>
+          Nabla tracks the data flow that leads from the <code>t</code> parameter to the
+          <code>r</code> result to understand how this result is computed. All
+          operations leading from <code>t</code> to <code>r</code> belong to a small set
+          of instructions corresponding to the capabilities of the virtual machine. This
+          set contains basic arithmetic operations (addition, subtraction ...), conversion
+          operations (double to int, long to double ...), storage instructions (local
+          variables, function parameters, instance or class fields ...) and calls to
+          elementary functions defined in the <code>Math</code> and <code>StrictMath</code>
+          classes. There is really nothing more!
+        </p>
+        <p>
+          For each of these basic instructions, we know how to map it to a
+          mathematical equation and we can combine this equation with its derivative to
+          form a pair of equations we will use later. For example, a <code>DADD</code>
+          bytecode instruction corresponds to the addition of two real numbers and
+          produces a third number which is their sum. So we map the instruction to the
+          equation:
+          <pre><code>c=a+b</code></pre>
+          and we combine this with its derivative to form the pair:
+          <pre><code>(c=a+b, c'=a'+b')</code></pre>
+          In this example, we have simply used the linearity property of differentiation
+          which implies that the derivative of a sum is the sum of the derivatives.
+          Similar rules exist for all arithmetic instructions, and the derivative of all
+          basic functions in the <code>Math</code> and <code>StrictMath</code> classes is known.
+          The complete rule set is described in the <a
+          href="#Differentiation rules">Differentiation rules</a> section below.
+        </p>
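The pair mapping described above can be sketched in plain Java. This is only an illustration of the idea, not code from Nabla: the names PairSketch, Pair, add and multiply are hypothetical, and the DMUL rule is the ordinary product rule rather than something quoted from this commit.

```java
// Hypothetical sketch: PairSketch and Pair are illustrative names, not Nabla classes.
public final class PairSketch {

    /** A value and its derivative with respect to the free variable. */
    static final class Pair {
        final double value;
        final double derivative;

        Pair(double value, double derivative) {
            this.value = value;
            this.derivative = derivative;
        }
    }

    /** Pair form of DADD: (c = a + b, c' = a' + b'). */
    static Pair add(Pair a, Pair b) {
        return new Pair(a.value + b.value, a.derivative + b.derivative);
    }

    /** Pair form of DMUL, from the product rule: (c = a * b, c' = a' * b + a * b'). */
    static Pair multiply(Pair a, Pair b) {
        return new Pair(a.value * b.value,
                        a.derivative * b.value + a.value * b.derivative);
    }

    public static void main(String[] args) {
        Pair t = new Pair(3.0, 1.0);   // free variable, so dt/dt = 1
        Pair two = new Pair(2.0, 0.0); // constant, so its derivative is 0
        Pair r = add(multiply(t, t), multiply(two, t)); // r = t * t + 2 * t
        System.out.println(r.value + " " + r.derivative); // prints "15.0 8.0"
    }
}
```

Each helper carries the original equation unchanged in its first component, which is why the value computed by the expanded code matches the value of the original code.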
+        <p>
+          So the original computation path from <code>t</code> to <code>r</code> can be
+          expanded by conceptually replacing all single equations that constitute the
+          code by pairs of equations. The first element of each pair is a simple copy
+          of the original equation. The original computation path was fed by a
+          single double value (the function parameter <code>t</code>), but the expanded
+          computation path needs a pair of values, one for each element in the equations
+          pair. The first element of the pair of input values will be the value of the
+          parameter, and the second will be the <em>derivative of the parameter with respect
+          to the free variable</em>. This means that in our case, if we want to compute the
+          derivative with respect to <code>t</code>, the second element will be the
+          derivative of <code>t</code> with respect to itself, which is simply the constant 1.
+          In this case, we need to feed the computation path with the pair <code>(t,
+        </p>
+        <p>
+          Without changing anything in the analysis or in the expansion of the computation
+          path, we can also handle the case where <code>t</code> is not an independent variable but is
+          itself a function of another free variable, i.e. a case where:
+          <pre><code>t = g(x)</code>, with <code>dt/dx = g'(x)</code></pre>
+          In this case, we would feed the computation path with the pair <code>(g(x),
+          instead of the pair <code>(t, 1)</code>. This makes it possible to handle functions
+          that include recursive calls.
+        </p>
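A small sketch of the seeding idea, assuming a hand-expanded pair version of f(t) = t * t and the arbitrary choice g(x) = sin(x). The names SeedSketch and fPair are hypothetical and the array return type is only a convenience for the example. Nabla generates such code from bytecode; it is written by hand here.

```java
// Hypothetical sketch: SeedSketch and fPair are illustrative names, not Nabla classes.
public final class SeedSketch {

    /** Hand-expanded pair version of f(t) = t * t; returns {f(t), derivative of f}. */
    static double[] fPair(double t, double dt) {
        double r = t * t;            // original equation
        double dr = dt * t + t * dt; // its derivative (product rule)
        return new double[] { r, dr };
    }

    public static void main(String[] args) {
        double x = 0.5;

        // t is the free variable: seed the second element with dt/dt = 1.
        double[] direct = fPair(x, 1.0);

        // t = g(x) = sin(x): seed the second element with dt/dx = cos(x).
        double[] composed = fPair(Math.sin(x), Math.cos(x));

        System.out.println("df/dt at t = x : " + direct[1]);
        System.out.println("d f(g(x)) / dx : " + composed[1]);
    }
}
```

The expanded code is the same in both calls; only the input pair changes. In the second call the derivative equals 2 sin(x) cos(x), which is what the chain rule gives for sin(x) squared.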
+      </subsection>
+      <subsection name="Validity">
+        <p>
+          What is the validity of this approach?
+        </p>
+        <p>
+          For straightforward smooth functions, the expanded code really computes both the
+          value of the equation and its exact derivative. This is a simple application of
+          the differentiation rules. So the accuracy of the derivative will be on a par with
+          the accuracy of the initial function. If the initial function is a good model
+          of a physical process, the derivative will be a good evaluation of its evolution.
+          If the initial function is only an approximation, the derivative will be an
+          approximation too, but <em>an approximation that is consistent with the initial
+          function up to computer accuracy</em>.
+        </p>
+        <p>
+          As soon as the initial function is not smooth, then some design choices are
+          involved which have an impact on validity. These choices have been made in
+          such a way that in some sense, the result is still as valid, as accurate and as
+          consistent with the initial function as for smooth functions.
+        </p>
+        <p>
+          First of all, what are non-smooth functions in our model, which is based on the
+          operations and functions available in the Java virtual machine? These are either
+          functions that involve calls to non-smooth functions of the <code>Math</code> and
+          <code>StrictMath</code> classes near their singularity points (for example
+          <code>Math.abs</code>, <code>Math.sqrt</code> or <code>Math.log</code> near zero),
+          or functions for which the computation path includes conditional branches
+          involving computed double parameters (for example convergence loops or <code>if</code>,
+          <code>then</code>, <code>else</code> statements). This includes <em>neither</em>
+          functions that use only unconditional jumps <em>nor</em> loops with a number of iterations
+          not related to a computed double parameter. Such code could theoretically be
+          reorganized and loops unrolled to produce a (perhaps huge) straightforward smooth
+          computation path.
+        </p>
+        <p>
+          For singularities corresponding to domain definition boundaries (like
+          <code>Math.sqrt</code> and <code>Math.log</code> which cannot be computed for
+          negative parameters), the theoretical derivative is defined only on the side of the
+          singularity where the function itself is defined. The value of this half-derivative
+          is the limit value of the derivative when approaching the singularity. Since for these
+          functions we use the expression of the derivative that is valid where the function
+          is valid, our computation is consistent with the theoretical mathematical definition.
+          For example in both the <code>Math.sqrt</code> and <code>Math.log</code> cases, the
+          derivative is infinite at zero, with the proper sign according to the sign of the
+          input derivative.
+        </p>
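The behaviour at zero follows from IEEE754 division when the usual derivative formulas are applied directly. The sketch below assumes the textbook expressions a'/(2 sqrt(a)) and a'/a; the class and method names are hypothetical, and the code Nabla actually generates may be organized differently.

```java
// Hypothetical sketch: SingularitySketch is an illustrative name, not a Nabla class.
public final class SingularitySketch {

    /** Derivative part of the pair for sqrt: a' / (2 * sqrt(a)). */
    static double sqrtDerivative(double a, double da) {
        return da / (2.0 * Math.sqrt(a));
    }

    /** Derivative part of the pair for log: a' / a. */
    static double logDerivative(double a, double da) {
        return da / a;
    }

    public static void main(String[] args) {
        // At the boundary a = 0 the division by +0.0 gives an infinity
        // whose sign is the sign of the input derivative.
        System.out.println(sqrtDerivative(0.0, 1.0));  // prints "Infinity"
        System.out.println(sqrtDerivative(0.0, -1.0)); // prints "-Infinity"
        System.out.println(logDerivative(0.0, 1.0));   // prints "Infinity"
    }
}
```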
+        <p>
+          For singularities not related to domain definition boundaries (like
+          <code>Math.abs</code> and conditional branches), the theoretical derivative is not
+          defined as a single value, but as a pair of left and right half-derivatives, one for
+          each side of the singularity. Since there is little support in the IEEE754 standard
+          to distinguish the left and right hand side of a single value (except for zero, since
+          -0 and +0 both exist), we have decided to adopt a simplified approach. These cases are
+          implemented by simple conditional branches (we explicitly added such a conditional in the
+          <code>Math.abs</code> case). Nabla then simply computes the value of the smooth
+          derivative on the branch of the computation path that is selected at run time, depending
+          on the values of the input parameters. This choice preserves the property of
+          having a derivative that is always consistent with the associated value, and it is a simple
+          arbitrary choice of one of the two possibilities that correspond to the mathematical
+          definition, which by itself does not choose between them.
+        </p>
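A minimal sketch of the branch-based treatment of <code>Math.abs</code>, under an explicit assumption: the paragraph above does not say which branch handles zero, so the choice made here (zero takes the non-negative branch) is arbitrary and only illustrative. The names AbsSketch and absPair are hypothetical.

```java
// Hypothetical sketch: AbsSketch and absPair are illustrative names, not Nabla classes.
public final class AbsSketch {

    /**
     * Pair version of abs written as an explicit conditional branch.
     * Returns {|a|, derivative of |a|} for the input pair (a, a').
     * Assumption: zero is routed to the non-negative branch.
     */
    static double[] absPair(double a, double da) {
        if (a < 0.0) {
            return new double[] { -a, -da }; // smooth branch for negative input
        }
        return new double[] { a, da };       // smooth branch for non-negative input
    }

    public static void main(String[] args) {
        double[] negative = absPair(-2.0, 1.0);
        double[] atZero = absPair(0.0, 1.0);
        System.out.println(negative[0] + " " + negative[1]); // prints "2.0 -1.0"
        System.out.println(atZero[0] + " " + atZero[1]);     // prints "0.0 1.0"
    }
}
```

Whichever branch runs, the derivative returned is the derivative of the expression that produced the value, which is the consistency property the paragraph describes.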
+      </subsection>
+      <subsection name="Virtual machine execution model">
+        <subsubsection name="Frame">
+        </subsubsection>
+        <subsubsection name="Bytecode instructions">
+        </subsubsection>
+      </subsection>
-    <section name="Implementation" href="implementation">
-        <subsection name="Differential pairs" href="differential-pairs">
-        </subsection>
-        <subsection name="Bytecode transforms" href="bytecode-transforms">
-        </subsection>
-        <subsection name="Complete differentiation example" href="example">
-        </subsection>
+    <section name="Implementation">
+      <subsection name="Differential pairs">
+      </subsection>
+      <subsection name="Bytecode transforms">
+      </subsection>
+      <subsection name="Differentiation rules">
+      </subsection>
+      <subsection name="Complete differentiation example">
+      </subsection>
-    <section name="Issues" href="issues">
-        <subsection name="Singularities handling" href="singularities">
-        </subsection>
-        <subsection name="Data flow and control flow analysis" href="flows-analysis">
-        </subsection>
+    <section name="Issues">
+      <subsection name="Singularities handling">
+      </subsection>
+      <subsection name="Data flow and control flow analysis">
+      </subsection>

Modified: commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml
--- commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml Wed Apr 23 14:37:08 2008
@@ -23,7 +23,7 @@
-    <section name="Public API" href="api">
+    <section name="Public API">
         Nabla public interface is very small, it is composed of only three
         interfaces and two classes.
@@ -85,7 +85,7 @@
-    <section name="Updating the base and derived objects" href="updating">
+    <section name="Updating the base and derived objects">
         One important thing to note is a consequence of the fact that the
