commons-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From l..@apache.org
Subject svn commit: r653926 - in /commons/sandbox/nabla/trunk/src/site/xdoc: index.xml internals.xml usage.xml
Date Tue, 06 May 2008 21:15:39 GMT
Author: luc
Date: Tue May  6 14:15:38 2008
New Revision: 653926

URL: http://svn.apache.org/viewvc?rev=653926&view=rev
Log:
tried to improve documentation,
added various explanations
slightly reorganized the sections
(this documentation is still work in progress)

Modified:
    commons/sandbox/nabla/trunk/src/site/xdoc/index.xml
    commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml
    commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml

Modified: commons/sandbox/nabla/trunk/src/site/xdoc/index.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/index.xml?rev=653926&r1=653925&r2=653926&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/index.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/index.xml Tue May  6 14:15:38 2008
@@ -26,48 +26,70 @@
 
     <section name="Introduction">
       <p>
-        Nabla is an automatic differentiator for mathematical functions.
-        Just like the mathematical Nabla operator transforms a function
-        into its differential, the Nabla library transforms an existing
-        java object implementing a function
-        <code>double f(double) { ... }</code>
-        into another java object that in addition to computing the value
-        of
-        <code>f</code>
-        like the original one also computes its differential. The
-        created object implements differentiation using the classical
-        exact rules. There are no approximations and no step sizes.
+        Nabla is an automatic differentiator for mathematical
+        functions.
+      </p>
+      <p>
+        Just like the mathematical Nabla operator transforms a
+        function into its differential, the Nabla library transforms
+        an existing java object implementing a function
+        <pre><code>double f(double) { ... }</code></pre>
+        into another java object that in addition to computing the
+        value of <code>f</code> like the original one also computes
+        its derivative. The created object is built by applying the
+        classical exact differentiation rules to the function
+        underlying expressions. There are <em>no</em> approximations
+        and <em>no</em> step sizes.
       </p>
 
-      <p>This approach has the following benefits:</p>
+      <p>
+        This approach has the following benefits:
       <ul>
         <li>differentiation is exact</li>
-        <li>there are no problem-dependent step size to handle</li>
+        <li>
+          there are no problem-dependent step size to handle:
+          <ul>
+            <li>no problems of physical units</li>
+            <li>no problems of order of magnitude</li>
+            <li>no a priori knowledge of the behavior required</li>
+          </ul>
+        </li>
+        <li>there are no configuration parameters</li>
         <li>
           differentiation can be computed even at domains boundaries
         </li>
         <li>
-          there is no special handling of source (no symbolic package
-          with its own language, no source code generation, no
-          integration with the rest of application)
+          there is no special handling of source:
+          <ul>
+            <li>no symbolic package with its own language</li>
+            <li>no limitation to single self-contained expressions</li>
+            <li>no source code generation</li>
+            <li>no manual integration with the rest of application</li>
+          </ul>
         </li>
         <li>
           one writes and maintains only the basic equation and get the
           differential for free
         </li>
         <li>
+          it can be applied to already existing code without refactoring
+        </li>
+        <li>
           it is effective even when source code is not available (but
           there may be licensing issues in this case since what Nabla
           does automatically is really ... derived work)
         </li>
       </ul>
+      </p>
 
       <p>
-        The derivative instance is tightly bound to the base instance.
-        So if after Nabla has performed the transformation the base
-        instance is mutated outside of the scope of Nabla, the already
-        produced derivative instance will used this mutated state for
-        its later computation automatically.
+        The derivative instance remains tightly bound to the base
+        instance (which is refered to as its primitive) throughout its
+        lifetime. If the internal state of the primitive instance is
+        mutated after Nabla has performed the transformation, the
+        already created derivative instance which is bound to its
+        primitive will use this mutated state for its computation
+        automatically.
       </p>
 
     </section>
@@ -75,111 +97,145 @@
     <section name="Example">
 
       <p>
-        The following example should explain better what Nabla can do
-        for you.
+        The following example should explain better what Nabla can do.
       </p>
 
-      <p>
-        Let's consider a simple problem: we want to find the maximal
-        value of a function:
-        <pre><code>f(t) = (6-t)/3 + cos(4t-4) e<sup>-0.9t</sup></code></pre>
-      </p>
-      <img src="images/maximum.png"
-        alt="a curve with maximal value 2.109747 at t = 0.898775" />
-
-      <p>
-        The maximal value is reached when the first derivative of the
-        function is equal to zero. So we need to compute the first
-        derivative <code>f'(t)</code> and find its roots. We start by
-        implementing the function <code>f(t)</code>:
-      </p>
-
-      <source>
-        UnivariateDifferentiable function = new UnivariateDifferentiable() {
-            public double f(double t) {
-                return (6 - t) / 3 + Math.cos(4 * t - 4) * Math.exp(-0.9 * t);
-            }
-        };
-      </source>
-
-      <p>
-        We use the Nabla automatic differentiator to differentiate our
-        function and returning an object implementing the derivative:
-      </p>
-
-      <source>
-        UnivariateDifferentiator differentiator = new AutomaticDifferentiator();
-        final UnivariateDerivative derivative = differentiator.differentiate(function);
-      </source>
-
-      <p>
-        We get the maximal value by calling a solver on the derivative.
-        In this example, we will use the
-        <a href="http://commons.apache.org/math/apidocs/org/apache/commons/math/analysis/BrentSolver.html">
-        Brent solver</a> from the <a href="http://commons.apache.org/math/">commons
-        math</a> library. In order to do this, we wrap the derivative into an object
-        implementing the <a
-        href="http://commons.apache.org/math/apidocs/org/apache/commons/math/analysis/UnivariateRealFunction.html">
-        UnivariateRealFunction</a> interface required by all commons math solvers:
-      </p>
-
-      <source>
-        UnivariateRealFunction wrappedDerivative = new UnivariateRealFunction() {
-            public double value(double x) {
-                return derivative.f(DifferentialPair.newVariable(x)).getFirstDerivative();
-            }
-        };
-        UnivariateRealSolver solver = new BrentSolver(wrappedDerivative);
-        double tMax = solver.solve(0.5, 1.5);
-        double yMax = derivative.f(DifferentialPair.newVariable(tMax)).getValue();
-        System.out.println("max value = " + yMax + ", at t = " + tMax + " " + solver.getIterationCount());
-      </source>
-
-      <p>
-        We get the following result:
-      </p>
-
-      <source>
-        max value = 2.1097470218140537, at t = 0.8987751653383649 (7 iterations)
-      </source>
-
-      <p>
-        We could also use the derivative object to compute both the
-        value and the first derivative of our function at any point:
-      </p>
-
-      <source>
-        for (double t = 0.0; t &lt; 1.0; t += 0.01) {
-            DifferentialPair y = derivative.f(DifferentialPair.newVariable(t));
-            System.out.println(t + " " + y.getValue() + " " + y.getFirstDerivative());
-        }
-      </source>
+      <subsection name="Problem statement">
+        <p>
+          Let's consider a simple problem: we want to find the maximal
+          value of a function:
+          <pre><code>f(t) = (6-t)/3 + cos(4t-4) e<sup>-0.9t</sup></code></pre>
+        </p>
+        <img src="images/maximum.png"
+          alt="a curve with maximal value 2.109747 at t = 0.898775" />
+
+        <p>
+          The maximal value is reached when the first derivative of
+          the function is equal to zero. So we need to compute the
+          first derivative <code>f'(t)</code> and find its
+          roots. Nabla will help in the first part:
+          computing <code>f'(t)</code>.
+        </p>
+      </subsection>
+
+      <subsection name="Solution implementation">
+
+        <p>
+          The maximal value is reached when the first derivative of
+          the function is equal to zero. So we need to compute the
+          first derivative <code>f'(t)</code> and find its
+          roots. Nabla will help in the first part:
+          computing <code>f'(t)</code>. We start by implementing the
+          function <code>f(t)</code>:
+        </p>
+        <source>
+          UnivariateDifferentiable function = new UnivariateDifferentiable() {
+              public double f(double t) {
+                  return (6 - t) / 3 + Math.cos(4 * t - 4) * Math.exp(-0.9 * t);
+              }
+          };
+        </source>
+
+        <p>
+          We use the Nabla automatic differentiator to differentiate our
+          function and obtain an object implementing the derivative:
+        </p>
+
+        <source>
+          UnivariateDifferentiator differentiator = new AutomaticDifferentiator();
+          final UnivariateDerivative derivative = differentiator.differentiate(function);
+        </source>
+
+        <p>
+          We get the maximal value by calling a solver on the
+          derivative. In this example, we will use
+          the <a href="http://commons.apache.org/math/apidocs/org/apache/commons/math/analysis/BrentSolver.html">
+          Brent solver</a> from
+          the <a href="http://commons.apache.org/math/">commons-math</a>
+          library.  Functions passed to any commons-math solver must
+          implement a specific
+          interface: <a href="http://commons.apache.org/math/apidocs/org/apache/commons/math/analysis/UnivariateRealFunction.html">
+          UnivariateRealFunction</a>. In order to comply with this
+          requirement, we wrap the derivative object into another
+          object adapting the interface and pass this wrapped
+          derivative to the solver:
+        </p>
+
+        <source>
+          UnivariateRealFunction wrappedDerivative = new UnivariateRealFunction() {
+              public double value(double x) {
+                  DifferentialPair t = DifferentialPair.newVariable(x);
+                  return derivative.f(t).getFirstDerivative();
+              }
+          };
+          UnivariateRealSolver solver = new BrentSolver(wrappedDerivative);
+          double tMax = solver.solve(0.5, 1.5);
+          double yMax = derivative.f(DifferentialPair.newVariable(tMax)).getValue();
+          System.out.println("max value = " + yMax + ", at t = " + tMax +
+                             " (" + solver.getIterationCount() + " iterations)");
+        </source>
+
+        <p>
+          We get the following result:
+        </p>
+
+        <source>
+          max value = 2.1097470218140537, at t = 0.8987751653383649 (7 iterations)
+        </source>
+
+      </subsection>
+
+      <subsection name="Discussion">
+        <p>
+          This example shows that Nabla creates an object that
+          computes both the value and the derivative of a function,
+          given only an instance of a class that computes the
+          primitive function. Despite we had the source code available
+          in this case, it was not used: transformation is done at
+          runtime using only an instance of the primitive function. We
+          can also observe that there is no configuration at all: the
+          automatic differentiator is built using a no-argument
+          constructor and the differentiation method has only the
+          primitive object as a parameter.
+        </p>
+
+        <p>
+          We could also use the derivative object to compute both the
+          value and the first derivative of our function at any point:
+        </p>
+
+        <source>
+          for (double t = 0.0; t &lt; 1.0; t += 0.01) {
+              DifferentialPair y = derivative.f(DifferentialPair.newVariable(t));
+              System.out.println(t + " " + y.getValue() + " " + y.getFirstDerivative());
+          }
+        </source>
 
+      </subsection>
     </section>
 
-    <section name="How ?">
-
-      <p>
-        The previous example shows that Nabla creates an object that
-        computes both the value and the derivative of a function, given
-        only an instance of a class that computes the base function. How
-        does it do that?
-      </p>
+    <section name="Limitations">
 
       <p>
-        Basically, Nabla works by analyzing the bytecode of the original
-        object, transforming the arithmetic operations along all
-        possible execution passes using the differentiation, generating
-        a new class on the fly with the transformed bytecode, and
-        instantiating it to generate the derivative object. This works
-        for any pure Java function, or more generally for any program
-        using the Java platform as its execution environment.
+        Basically, Nabla works by analyzing the bytecode of the
+        original object, transforming the arithmetic operations along
+        all possible execution passes using the differentiation,
+        generating a new class on the fly with the transformed
+        bytecode, and instantiating it to generate the derivative
+        object. This works for any pure Java function, or more
+        generally for any program using the Java platform as its
+        execution environment.
       </p>
 
       <p>
-        The main drawback is that functions that call native code cannot
-        be handled. For these functions, a work around is provided that
-        uses finite differences (with 2, 4, 6 or 8 points schemes).
+        The main drawback is that functions that call native code
+        cannot be handled. For these functions, a fallback method is
+        provided that uses finite differences (with 2, 4, 6 or 8
+        points schemes). This fallback method does not have the same
+        advantages as the previous one: it needs configuration (number
+        of points and step size), it is not exact, it is more
+        computing intensive and it cannot be used too close to domain
+        boundaries.
       </p>
 
     </section>

Modified: commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml?rev=653926&r1=653925&r2=653926&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/internals.xml Tue May  6 14:15:38 2008
@@ -25,132 +25,224 @@
   <body>
     <section name="Principles">
 
-      <subsection name="Symbolic differentiation">
+      <subsection name="Differentiation at bytecode instructions level">
         <p>
-          Nabla computes the derivatives using symbolic differentiation. Considering
-          a typical call to a univariate function:
-          <pre><code>double r = f(t);</code></pre>
-          Nabla tracks the data flow that leads from the <code>t</code> parameter to the
-          <code>r</code> result to understand how this result is computed. All the
-          operations leading from <code>t</code> to <code>r</code> belong to a small set
-          of instructions corresponding to the capabilities of the virtual machine. This
-          set contains basic arithmetic operations (addition, subtraction ...), conversion
-          operations (double to int, long to double ...), storage instructions (local
-          variables, functions parameters, instance or class fields ...) and calls to
-          elementary functions defined in the <code>Math</code> and <code>StrictMath</code>
-          classes. There is really nothing more!
-        </p>
-        <p>
-          For each one of these basic computer instruction, we know how to map it to a
-          mathematical equation and we can combine this equation with its derivative to
-          form a pair of equations we will use later. For example, a <code>DADD</code>
-          bytecode instruction corresponds to the addition of two real numbers and
-          produces a third number which is their sum. So we map the instruction to the
-          equation:
+          Nabla computes the derivatives applying the classical
+          differentiation rules at bytecode level. When an instance
+          of a class implementing <code>UnivariateDifferentiable</code>
+          is passed to its <code>differentiate</code> method, Nabla
+          tracks the mathematical operations flow that leads from the
+          <code>t</code> parameter to the return value of the
+          function. At the bytecode instructions level, the operations
+          are elementary ones. Each elementary operation is then
+          changed to compute both a value and a derivative. Nothing is
+          changed to the control flow instructions (loops, branches,
+          operations scheduling).
+        </p>
+
+        <p>
+          All the changed operations belong to a small subset of the
+          virtual machine instructions. This set contains basic arithmetic
+          operations (addition, subtraction ...), conversion operations
+          (double to int, long to double ...), storage instructions (local
+          variables, functions parameters, instance or class fields ...)
+          and calls to elementary functions defined in the <code>Math</code>
+          and <code>StrictMath</code> classes. There is really nothing more!
+          For each one of these basic bytecode instructions, we know how to
+          map it to a mathematical equation and we can combine this equation
+          with its derivative to form a pair of equations we will use later.
+          For example, a <code>DADD</code> bytecode instruction corresponds
+          to the addition of two real numbers and produces a third number
+          which is their sum. So we map the instruction to the equation:
           <pre><code>c=a+b</code></pre>
           and we combine this with its derivative to form the pair:
           <pre><code>(c=a+b, c'=a'+b')</code></pre>
-          In this example, we have simply used the linearity property of differentiation
-          which implies that the derivative of a sum is the sum of the derivatives.
-          Similar rules exist for all arithmetic instructions, and the derivative of all
-          basic functions in the <code>Math</code> and <code>StrictMath</code> is known.
-          The complete rules set is described in the <a
-          href="#Differentiation rules">Differentiation rules</a> section below.
-        </p>
-        <p>
-          So the original computation path from <code>t</code> to <code>r</code> can be
-          expanded by conceptually replacing all single equations that constitute the
-          code by pairs of equations. The first element of each pair is a simple copy
-          of the original equation. The original computation path was fed by a
-          single double value (the function parameter <code>t</code>), but the expanded
-          computation path needs a pair of values, one for each element in the equations
-          pair. The first element of the pair of input values will be the value of the
-          parameter, and the second will be the <em>derivative of the parameter with respect
-          to the free variable</em>. This means that in our case, if we want to compute the
-          derivative with respect to <code>t</code>, the second element will be the
-          derivative of <code>t</code> with respect to itself, which is simply the constant 1.
-          In this case, we need to feed the computation path with the pair <code>(t, 1)</code>.
-        </p>
-        <p>
-          Without changing anything to the analysis or to the expansion of the computation path,
-          we can also handle the case where <code>t</code> is not an independent variable but is
-          itself a function of another free variable, i.e. a case where:
-          <pre><code>t = g(x)</code>, with <code>dt/dx = g'(x)</code></pre>
-          In this case, we would feed the computation path with the pair <code>(g(x), g'(x))</code>
-          instead of the pair <code>(t, 1)</code>. This allows to handle functions composition,
-          including resursive calls.
+          In this example, we have simply used the linearity property of
+          differentiation which implies that the derivative of a sum is the
+          sum of the derivatives. Similar rules exist for all arithmetic
+          instructions, and the derivative of all basic functions in the
+          <code>Math</code> and <code>StrictMath</code> is known. The
+          complete rules set is described in the <a
+          href="#Differentiation_rules">Differentiation rules</a> section
+          below.
+        </p>
+
+        <p>
+          Lets consider a more extensive example:
         </p>
+
+        <source>
+          public class Linear implements UnivariateDifferentiable {
+              public double f(double t) {
+                  double result = 1;
+                  for (int i = 0; i &lt; 3; ++i) {
+                      result = result * (2 * t + 1);
+                  }
+                  return result;
+              }
+          }
+        </source>
+
+        <p>
+          In this example, the only things that need to be changed for
+          differentiating the <code>f</code> method are
+          the <code>t</code> parameter and the <code>result</code>
+          local variable which must be adapted to hold the value and
+          the derivative, the two multiplications and the addition. In
+          order to generate the new function, Nabla will convert these
+          elements one at a time, starting from the parameter change
+          and propagating the change using a simple data flow
+          analysis. There is no need to analyze the global structure
+          of the code, and no need to change anything in the loop
+          handling instructions above. The method generated for the
+          example above will be roughly similar to the one that would
+          result from compiling the code below (except that Nabla
+          generates bytecode directly, it does not use source at all):
+        </p>
+
+        <source>
+          public DifferentialPair f(DifferentialPair t) {
+
+              // source roughly equivalent to code generated at method entry
+              double t0 = t.getValue();
+              double t1 = t.getFirstDerivative();
+
+              // source roughly equivalent to conversion of the result affectation
+              double result0  = 1;
+              double result1  = 0;
+
+              // this loop handling code is not changed at all
+              for (int i = 0; i &lt; 3; ++i) {
+
+                  // source roughly equivalent to conversion of "2 * ..."
+                  double tmpA0 = 2 * t0;
+                  double tmpA1 = 2 * t1;
+
+                  // source roughly equivalent to conversion of "... + 1"
+                  tmpA0 += 1;
+
+                  // source roughly equivalent to conversion of "result * ..."
+                  double tmpB0 = result0 * tmpA0;
+                  double tmpB1 = result0 * tmpA1 + result1 * tmpA0;
+
+                  // source roughly equivalent to conversion of "result = ..."
+                  result0 = tmpB0;
+                  result1 = tmpB1;
+
+              }
+
+              // source equivalent to code generated at method exit
+              return new DifferentialPair(result0, result1);
+
+          }
+        </source>
+
+        <p>
+          This example shows that the instructions conversions have local
+          scope. No global tree representation of the method is needed
+          at all.
+        </p>
+
+      </subsection>
+
+      <subsection name="No intermediate DifferentialPair instances">
+
+        <p>
+          Another thing that is shown in the previous example is that
+          <code>DifferentialPair</code> instances appear only at method
+          entry and exit. They are not used internally for elementary
+          conversions, only pairs of local variables (or operand stack
+          cells as we will see later) are used.
+        </p>
+
+        <p>
+          The only methods from the <code>DifferentialPair</code> class that
+          are used by the automatic differentiator are the two arguments
+          constructor, the <code>getValue</code> method and the
+          <code>getFirstDerivative</code> method.
+        </p>
+
+        <p>
+          For differentiable functions that call other differentiable functions,
+          additional instances of <code>DifferentialPair</code> will be used.
+          Some will be built in the caller to pass parameters to the callee, and
+          others will be build in the callee to return results to the caller.
+          This is <em>not</em> the case for calls to the elementary mathematical
+          functions defined in the <code>Math</code> and <code>StrictMath</code>
+          classes. For the known functions the derivative computation are inlined.
+          For example a call to the <code>Math.cos</code> function will be inlined
+          as a call to <code>Math.cos</code>, a call to <code>Math.sin</code>
+          and some intermediate arithmetic operations.
+        </p>
+
+      </subsection>
+
+      <subsection name="Data flow analysis">
+
+        <p>
+          As shown in the example above, the original method bytecode
+          contains both immutable parts that must be preserved and
+          parts belonging to what we will call the <em>computation path</em>
+          from the <code>t</code> parameter to the result that must
+          be differentiated. The instructions that must be converted are
+          identified by a data flow analysis seeded with the <code>t</code>
+          parameter, which is changed from a primitive <code>double</code>
+          in the original method to a <code>DifferentialPair</code> in the
+          generated derivative, as explained in the <a
+          href="usage.html#Differential_pairs">Differential pairs</a> section
+          of the usage documentation.
+        </p>
+
+        <p><strong>
+          TODO: explain analysis by "tainted" data and types propagation.
+        </strong></p>
+
       </subsection>
 
       <subsection name="Validity">
+
         <p>
           What is the validity of this approach?
         </p>
+
         <p>
-          For straightforward smooth functions, the expanded code really computes both the
-          value of the equation and its exact derivative. This is a simple application of
-          the differentiation rules. So the accuracy of the derivative will be in par with
-          the accuracy of the initial function. If the initial function is a good model
-          of a physical process, the derivative will be a good evaluation of its evolution.
-          If the initial function is only an approximation, the derivative will be an
-          approximation too, but <em>an approximation that is consistent with the initial
-          function up to computer accuracy</em>.
-        </p>
-        <p>
-          As soon as the initial function is not smooth, then some design choices are
-          involved which have an impact on validity. These choices have been made in
-          such a way that in some sense, the result is still as valid, as accurate and as
-          consistent with the initial function as for smooth functions.
-        </p>
-        <p>
-          First of all, what are non-smooth functions in our model, which is based on the
-          operations and functions available in the Java virtual machine? These are either
-          functions that involve calls to non-smooth functions of the <code>Math</code> and
-          <code>StrictMath</code> classes near their singularity points (for example
-          <code>Math.abs</code>, <code>Math.sqrt</code> or <code>Math.log</code> near zero),
-          or functions for which the computation path includes conditional branches
-          involving computed double parameters (for example convergence loops or <code>if</code>,
-          <code>then</code>, <code>else</code> statements). This does <em>neither</em> include
-          functions that use only unconditional jumps <em>nor</em> loops with a number of iterations
-          not related to a computed double parameter. Such code could theoretically be
-          reorganized and loops unrolled to produce a (perhaps huge) straightforward smooth
-          computation path.
-        </p>
-        <p>
-          For singularities corresponding to domain definition boundaries (like
-          <code>Math.sqrt</code> and <code>Math.log</code> which cannot be computed for
-          negative parameters), the theoretical derivative is defined only on the side of the
-          singularity where the function itself is defined. The value of this half-derivative
-          is the limit value of the derivative when approaching the singularity. Since for these
-          functions we use the expression of the derivative that is valid where the function
-          is valid, our computation is consistent with the theoretical mathematical definition.
-          For example in both the <code>Math.sqrt</code> and <code>Math.log</code>, the
-          derivative is infinite at zero, with the proper sign according to the sign of the
-          input derivative.
-        </p>
-        <p>
-          For singularities not related to domain definition boundaries (like
-          <code>Math.abs</code> and conditional branches), the theoretical derivative is not
-          defined as a single value, but as a pair of left and a right half-derivatives, one for
-          each side of the singularity. Since there is little support in the IEEE754 standard
-          to distinguish the left and right hand side of a single value (except for zero, since
-          -0 and +0 both exist), we have decided to adopt a simplified approach. These cases are
-          implemented by simple conditional branches (we added explicitly such a conditional in the
-          <code>Math.abs</code> case). Nabla then simply computes the value of the smooth
-          derivative on the branch of the computation path that is selected at run time, depending
-          on the values of the input parameters. This choice allows to preserve the property of
-          having a derivative that is always consistent with the associated value, and it is a simple
-          arbitrary choice of one of the two possibilities that correspond to the mathematical result,
-          which by itself does not choose between them.
+          For straightforward smooth functions, the expanded code
+          really computes both the value of the equation and its exact
+          derivative. This is a simple application of the
+          differentiation rules. So the accuracy of the derivative
+          will be in par with the accuracy of the initial function. If
+          the initial function is a good model of a physical process,
+          the derivative will be a good evaluation of its evolution.
+          If the initial function is only an approximation of a real
+          physical model, then the derivative will be an approximation
+          too, but <em>an approximation that is consistent with the
+          initial function up to computer accuracy</em>.
         </p>
+
+        <p>
+          If the initial function is not smooth, the singular points
+          must be analyzed specially. Some design choices are involved
+          which have an impact on validity.  These choices have been
+          made in such a way that in some sense, the result is still
+          as valid, as accurate and as consistent with the initial
+          function as for smooth functions. This point is explained in
+          details in the section
+          about <a href="singularities.html">singularities</a>.
+        </p>
+
       </subsection>
 
       <subsection name="Virtual machine execution model">
-        <subsubsection name="Frame">
-        </subsubsection>
-
-        <subsubsection name="Bytecode instructions">
-        </subsubsection>
+          <p><strong>
+              TODO: explain local variables and stack cells.
+          </strong></p>
+
+          <p><strong>
+              TODO: explain the various subsets using the same categories
+              as in the first section above (basic arithmetic operations,
+              conversion operations, storage instructions and calls to elementary
+              functions).
+          </strong></p>
       </subsection>
 
     </section>
@@ -158,27 +250,34 @@
     <section name="Implementation">
 
       <subsection name="Differential pairs">
+          <p><strong>
+              TODO: explain that the various methods provided by DifferentialPair
+              are convenience methods for end users. They can be used to perform
+              some additional transforms on already derived code. BTW, is this feature
+              really useful ?
+          </strong></p>
       </subsection>
 
       <subsection name="Bytecode transforms">
+          <p><strong>
+              TODO
+          </strong></p>
       </subsection>
 
       <subsection name="Differentiation rules">
+           <p><strong>
+               TODO: put <em>all</em> rules in a table for reference.
+           </strong></p>
+           </subsubsection>
       </subsection>
 
       <subsection name="Complete differentiation example">
+          <p><strong>
+            TODO: step by step analysis of the initial example.
+          </strong></p>
       </subsection>
 
     </section>
 
-    <section name="Issues">
-
-      <subsection name="Singularities handling">
-      </subsection>
-
-      <subsection name="Data flow and control flow analysis">
-      </subsection>
-
-    </section>
   </body>
 </document>

Modified: commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml
URL: http://svn.apache.org/viewvc/commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml?rev=653926&r1=653925&r2=653926&view=diff
==============================================================================
--- commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml (original)
+++ commons/sandbox/nabla/trunk/src/site/xdoc/usage.xml Tue May  6 14:15:38 2008
@@ -23,93 +23,272 @@
   </properties>
 
   <body>
-    <section name="Public API">
-      <p>
-        Nabla public interface is very small, it is composed of only three
-        interfaces and two classes.
-        <dl>
-          <dt>UnivariateDifferentiable</dt>
-          <dd>is the interface representing the mathematical function that should
-          be differentiated. The user-defined class are provided to Nabla as
-          classes implementing this interface</dd>
-          <dt>UnivariateDerived</dt>
-          <dd>is the interface representing the derived function created by
-          Nabla. It is the result of the differentiation.</dd>
-          <dt>UnivariateDifferentiator</dt>
-          <dd>is the interface implemented by the various differentiators provided
-          by Nabla.</dd>
-          <dt>DifferentialPair</dt>
-          <dd>is a simple container holding both a value and a derivative. It can be
-          considered simply as an <em>enhanced double</em> that is used both as
-          the type of input parameters and return value of derived functions.</dd>
-          <dt>AutomaticDifferentiator</dt>
-          <dd>is the main implementation of the UnivariateDifferentiator interface.
-          It performs differentiation by bytecode analysis and generation, using
-          the exact differentiation rules in order to create functions that
-          compute exact differentials.</dd>
+    <section name="Basic usage">
+      <subsection name="Public API">
+        <p>
+          Nabla public interface is very small, it is composed of only
+          three interfaces and two classes.
+          <dl>
+            <dt><a href="apidocs/org/apache/commons/nabla/core/UnivariateDifferentiable.html">UnivariateDifferentiable</a></dt>
+            <dd>is the interface representing the mathematical
+            function that should be differentiated. The user-defined
+            class are provided to Nabla as classes implementing this
+            interface</dd>
+          <dt><a href="apidocs/org/apache/commons/nabla/core/UnivariateDerivative.html">UnivariateDerivative</a></dt>
+          <dd>is the interface representing the differentiated
+          function created by Nabla.</dd>
+          <dt><a href="apidocs/org/apache/commons/nabla/core/UnivariateDifferentiator.html">UnivariateDifferentiator</a></dt>
+          <dd>is the interface implemented by the various
+          differentiators provided by Nabla.</dd>
+          <dt><a href="apidocs/org/apache/commons/nabla/core/DifferentialPair.html">DifferentialPair</a></dt>
+          <dd>is a simple container holding both a value and a
+          derivative. It can be considered simply as an <em>enhanced
+          double</em> that is used both as the type of input
+          parameters and return value of differentiated
+          functions.</dd>
+          <dt><a href="apidocs/org/apache/commons/nabla/automatic/AutomaticDifferentiator.html">AutomaticDifferentiator</a></dt>
+          <dd>is the main implementation of the
+          UnivariateDifferentiator interface.  It performs
+          differentiation by bytecode analysis and generation, using
+          the exact differentiation rules in order to create functions
+          that compute exact differentials.</dd>
         </dl>
       </p>
 
       <img src="images/public-API.png" alt="UML class diagram of Nabla public API" />
+      </subsection>
 
-      <p>
-        In order to use differentiate a function using Nabla, the function must be
-        provided as an implementation of the UnivariateDifferentiable interface and
-        passed as the single parameter of the <code>differentiate</code> method of an
-        AutomaticDifferentiator instance. If the existing class does not already
-        implements the UnivariateDifferentiable interface, it has to be wrapped when
-        provided to the differentiator.
-      </p>
-
-      <p>As an example, consider the following problem. We have a <code>model</code>
-      variable which is an instance of a class with a method <code>evaluate</code>:
-      <pre><code>double evaluate(double first, double second)</code></pre>
-      and we want to compute its partial derivative with respect to the second
-      parameter, when the first parameter value is 2.5 and the second parameter
-      ranges from -1 to +1. Here is a way to do that:
-      </p>
+      <subsection name="Differentiating user-defined functions">
+        <p>
+          In order to differentiate a function using Nabla, a
+          user-defined function must be provided as an implementation
+          of the UnivariateDifferentiable interface. It is passed as
+          the single parameter to the <code>differentiate</code>
+          method of an AutomaticDifferentiator instance. If the
+          existing class does not already implements the
+          UnivariateDifferentiable interface, it has to be wrapped
+          when provided to the differentiator.
+        </p>
+
+        <p>
+          As an example, consider the following problem. We have
+          a <code>model</code> variable which is an instance of a
+          class with a method <code>evaluate</code>:
+          <pre><code>double evaluate(double first, double second)</code></pre>
+          We want to compute its partial derivative with respect to
+          the second parameter, when the first parameter value is 2.5
+          and the second parameter ranges from -1 to +1. Here is a way
+          to do that:
+        </p>
+
+        <source>
+        UnivariateDerivative derivative =
+            new AutomaticDifferentiator().differentiate(new UnivariateDifferentiable() {
+                public double f(double t) {
+                    return model.evaluate(2.5, t);
+                }
+            });
+        </source>
+      </subsection>
+
+      <subsection name="Differential pairs">
+
+        <p>
+         The derivative instance created by Nabla provides a
+         method <code>f</code> which computes both the value of the
+         primitive function (as the initial instance does) and the
+         value of its first derivative with respect to its input
+         parameter. This method has the following signature:
+         <pre><code>DifferentialPair f(DifferentialPair t)</code></pre>
+        </p>
+
+       <p>
+         It is important to note this method does not
+         use <code>double</code> as its input parameter
+         but <code>DifferentialPair</code>. This allows to handle both
+         the basic usage when one wants to compute the derivative of
+         the <code>f</code> function when its parameter is the
+         independent variable, but also more advanced cases when the
+         parameter <code>t</code> is not an independent variable.
+       </p>
+
+       <p>Let's say <code>t</code> is computed by another
+         function <code>g</code> from a variable <code>x</code>
+         (<code>t = g(x)</code>). The signature of the derivative
+         function produced by Nabla allows to chain calls and compute
+         the composite derivative.  We compute
+         first <code>dt/dx</code> from the derivative
+         function <code>g'(x)</code>, and then pass the resulting
+         differential pair to the derivative
+         function <code>f'(t)</code> in order to get the
+         result <code>dy/dx</code>.
+       </p>
+
+       <source>
+         DifferentialPair dTdX = gDerivative(DifferentialPair.newVariable(2.0));
+         DifferentialPair dYdX = fDerivative(dTdX);
+       </source>
+
+       <p>
+         Simple derivatives are computed by passing independent
+         variables to the derivative functions. An independent
+         variable is represented by a <code>DifferentialPair</code>
+         with a first derivative set to 1. Composite derivatives are
+         computed by passing dependent variables to the derivative
+         functions. A dependent variable is represented by
+         a <code>DifferentialPair</code> which has been computed by a
+         previous derivative function and which may have any value as
+         its first derivative. A constant may be represented by
+         a <code>DifferentialPair</code> with a first derivative set
+         to 0.
+       </p>
+
+       <p>
+         The <code>DifferentialPair</code> class provides a general
+         constructor to build an instance from its value and
+         derivative. For convenience, it also provides factory methods
+         to build independent variables and constants:
+       </p>
+
+       <source>
+       // value = 1.0, first derivative = 2.0
+       DifferentialPair aPair = new DifferentialPair(1.0, 2.0);
+
+       // this is equivalent to new DifferentialPair(3.5, 1.0);
+       DifferentialPair variable = DifferentialPair.newVariable(3.5);
+
+       // this is equivalent to new DifferentialPair(Math.PI, 0.0)
+       DifferentialPair constant = DifferentialPair.newConstant(Math.PI);
+       </source>
 
-      <source>
-      final double constantFirst = 2.5;
-      UnivariateDerived derived =
-          new AutomaticDifferentiator().differentiate(new UnivariateDifferentiable() {
-              public double f(double t) {
-                  return model.evaluate(constantFirst, t);
-              }
-          });
-      for (double second = -1; second &lt;= +1; second += 0.01) {
-        DifferentialPair t = DifferentialPair.newVariable(second);
-        System.out.println(second + " " + derived.f(t).getFirstDerivative());
-      } 
-      </source>
+      </subsection>
 
     </section>
 
-    <section name="Updating the base and derived objects">
+    <section name="Advanced use">
+      <subsection name="Updating the base and differentiated objects">
 
-      <p>
-        One important thing to note is a consequence of the fact that the
-        <code>differentiate</code> method returns a new object when called.
-        This implies that we end up with two different instances of two
-        different classes that compute roughly similar things: the original
-        instance and the newly created object. If the implementation of the
-        <code>f</code> method does use some attribute of the original class,
-        then the class of the newly created object should also provide a way
-        to get this value.
-      </p>
+        <p>
+          One important thing to note is a consequence of the fact that the
+          <code>differentiate</code> method returns a new object when
+          called. This implies that we end up with two different
+          instances of two different classes that compute roughly
+          similar things: the original instance and the newly created
+          object. If the implementation of the <code>f</code> method
+          does use some attribute of the original class, then the class
+          of the newly created object should also provide a way to get
+          this value.
+        </p>
+
+        <p>
+          An important design choice in Nabla is that the newly
+          created instance does <em>not</em> copy the state of the
+          original object at derivation time, but instead is
+          permanently and tightly linked to this original instance and
+          uses it to get the values it needs when it needs them (even
+          if they are stored in private attributes). A direct
+          implication is that if the state of the original object is
+          changed <em>after</em> differentiation, all subsequent calls
+          to the <code>f</code> method of the already created
+          differentiated instance will reflect these changes in their
+          behavior. There is no need to bother about updating the
+          differentiated instance, it is already up-to-date.
+        </p>
+
+        <p>
+          As an example, consider again the problem above, where we
+          wanted the derivative of a model with respect to its second
+          parameter. Now we want to compute the same derivative as
+          previously but we also want to be able to change the value
+          of the first parameter, instead of sticking to the value
+          2.5. Here is a way to do this:
+        </p>
+ 
+        <source>
+        public class SetableFirstParameterModel implements UnivariateDifferentiable {
+
+            private Model model;
+            private double firstParameter;
+
+            public SetableFirstParameterModel(Model model, double firstParameter) {
+                this.model = model;
+                this.firstParameter = firstParameter;
+            }
+
+            public void setFirstParameter(double firstParameter) {
+                this.firstParameter = firstParameter;
+            }
+
+            public double f(double t) {
+                return model.evaluate(firstParameter, t);
+            }
+
+        }
+        </source>
+
+        <p>
+          When we build the derivative of an instance of this class,
+          this derivative will keep a reference to its primitive in
+          order to access the <code>firstParameter</code> private
+          field. If this field is changed on the primitive instance by
+          calling the <code>setFirstParameter</code> method, the
+          derivative will see the change immediately.
+        </p>
+
+        <source>
+        SetableFirstParameterModel setable = new SetableFirstParameterModel(model, 2.5);
+        UnivariateDerivative derivative = new AutomaticDifferentiator().differentiate(setable);
+        DifferentialPair t = DifferentialPair.newVariable(2.0);
+
+        // derivative with respect to second parameter when first parameter equals 2.5
+        double der25 = derivative.f(t).getFirstDerivative();
+
+        // derivative with respect to second parameter when first parameter equals 3.0
+        setable.setFirstParameter(3.0);
+        double der30 = derivative.f(t).getFirstDerivative();
+        </source>
+
+      </subsection>
+
+      <subsection name="Functions calling native code">
+
+        <p>
+          Since the automatic differentiator can analyze only bytecode,
+          functions calling native code cannot be handled this way by
+          Nabla. There is a fallback procedure using finite differences.
+          The <code>differences</code> package provides several classes
+          that implement the <code>UnivariateDifferentiator</code> interface:
+          <code><a href="apidocs/org/apache/commons/nabla/differences/TwoPointsScheme.html">TwoPointsScheme</a></code>,
+          <code><a href="apidocs/org/apache/commons/nabla/differences/FourPointsScheme.html">FourPointsScheme</a></code>,
+          <code><a href="apidocs/org/apache/commons/nabla/differences/SixPointsScheme.html">SixPointsScheme</a></code> and
+          <code><a href="apidocs/org/apache/commons/nabla/differences/EightPointsScheme.html">EightPointsScheme</a></code>.
+        </p>
+
+        <p>
+          These classes need a step size at construction time. For each call
+          to the derivative instance <code>f</code> method, they will call the
+          <code>f</code> of the primitive instance <code>n+1</code> times where
+          <code>n</code> is the number of points of the method. One evaluation
+          is at the location defined by the parameter and the <code>n</code>
+          other evaluations are regularly distributed around it. So the two points
+          method will evaluate the primitive function 3 times at <code>t-h</code>,
+          <code>t</code> and <code>t+h</code> while the eight points method will
+          evaluate it 9 times at <code>t-4h</code>, <code>t-3h</code> ...
+          <code>t+4h</code> where <code>h</code> is the step size.
+        </p>
+
+        <p>
+          The step size and the number of points must be chosen with care as they
+          influence both the accuracy of the result (which is only an approximation)
+          and the computational cost. Small step size improve theoretical accuracy
+          up to the point where numerical cancellations due to the finite precision
+          of double numbers exceed the theoretical error due to finite differences
+          modeling. Large number of points improve the accuracy but imply a large
+          number of functions evaluation which can become prohibitive. There is no
+          best choice that fits all needs, the right choice is problem-dependent. 
+        </p>
 
-      <p>
-        An important design choice in Nabla is that the newly created instance
-        does <em>not</em> copy the state of the original object at derivation
-        time, but instead is permanently and tightly linked to this original
-        instance and uses it to get the values it needs when it needs them
-        (even if it is stored in a private attribute). A direct implication is
-        that if the state of the original object is changed <em>after</em>
-        differentiation, all subsequent calls to the <code>f</code> method of
-        the already created differentiated instance will reflect these changes
-        in their behavior. There is no need to bother about updating the
-        differentiated instance, it is already up-to-date.
-      </p>
+      </subsection>
 
     </section>
 



Mime
View raw message