commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Luc Maisonobe <>
Subject Re: [MATH] On computational efficiency in DerivativeStructures
Date Fri, 30 Aug 2013 07:21:28 GMT
Hi Ajo,

Le 29/08/2013 19:05, Ajo Fod a écrit :
> Hello,
> I wonder if the number of computations required to compute a derivative
> structure can be reduced. Say the derivative structure of f to order d=2 is
> desired:
> f(g1(x1), g2(x2), ... gn(xn)) where x1,x2 ... xn are subsets of variables
> that are not necessarily disjoint.
> For each gi(xi), O(n^2) variables need to be computed while I only need
> [xi]^2 different variables. (Where [xi] stands for the cardinality of the
> set xi)
> In the extreme case say x1...xn are single variables (sets of cardinality
> 1):
> To compute the derivative structure to order 2 of:
> f(sin(x1)+cos(x2)+... tan(xn))
> O(n^2) values for each of the n components needs to be computed (or at
> least memory allocated for that much):
> so that is a total of over O(n^3) values.
> If there were a way of computing only derivatives of relevant variables,
> each term of f() would require only 2 calculations or a total of 2n
> calculations.
> So, is there a way to compute the derivative of each component separately
> and "join" them together? When the derivative structure is sparse, this
> should reduce computation required from O(n^3) to O(n) ... quite a large
> saving when n is large.

It is possible, but implies doing all the "join" work by yourself.

The first step is to compute each gi function using DerivativeStructure
instances configured to one parameter only (regardless of i) and second

  DerivativeStructure x1   = new DerivativeStructure(1, 2, 0, x1Value);
  DerivativeStructure g1x1 = g1.value(x1);
  DerivativeStructure x2   = new DerivativeStructure(1, 2, 0, x2Value);
  DerivativeStructure g2x2 = g2.value(x2);
  DerivativeStructure xn   = new DerivativeStructure(1, 2, 0, xnValue);
  DerivativeStructure gnxn = gn.value(xn);

As each gi.value method is called, it gets the value for one variable
only and is not aware there are other instances lying around. It only
computes gi, dgi/dxi, d^2gi/dxi^2, so three numbers each. The number of
computations here remains linear in the number of variables.

Now comes the tricky and manual part: computing f. This method must be
aware of the full variable set for which you want partial derivatives.
You can do that by "expanding" your DerivativeStructure instances to add
0 derivatives. In the case the xi all have cardinality 1 and you want xi
to be mapped to index i-1 in the n-cardinality full set, you can use
something similar to:

 public DerivativeStructure mapGToFullSet(final int parameters,
                                          final int index,
                                          final DerivativeStructure g) {
    final int order = g.getOrder();
    final DSCompiler compiler =
      DSCompiler.getCompiler(parameters, index);
    final double[] derivatives = new double[compiler.getSize()];
    final int[] orders = new int[order + 1];

    // copy d^k g / dx^k at its dedicated index in the expanded data
    for (int k = 0; k <= order; ++k) {
      order[index] = k;
      derivatives[compiler.getPartialDerivativeIndex(orders)] =
      order[index] = O;

    return new DerivativeStructure(parameters, order, derivatives);


So you would do

  DerivativeStructure g1 = mapGToFullSet(n, 0, g1x1);
  DerivativeStructure g2 = mapGToFullSet(n, 1, g2x2);
  DerivativeStructure gn = mapGToFullSet(n, n - 1, gnxn);

And finally

  DerivativeStructure fx1xn = f.value(g1, g2, ..., gn);

The mapping part allocates n full derivatives arrays, but only
initialize very few elements in them, letting most of the elements set
to 0. It also don't compute anything, it is simply data copying. It is
probably O(n^2) for order two derivatives, but with a small coefficient.

The f.value() part is a full-blown computation of all derivatives. It
may reamin costly as is. In the general case you would save computation
on the gi, not on f. If f is really a univariate function (like in your
f(sin(x1)+cos(x2)+... tan(xn)) case), it may be possible to save a
little more. You would have to compute first y = sin(x1)+cos(x2)+...
tan(xn) with all derivatives, then build an intermediate single variable
yTmp and apply f to it (i.e. the internal computation in f would only
consider a single variable), and then combine the derivatives as follows:

 DerivativeStructure y = ...; // full sin(x1)+cos(x2)+...
 DerivativeStructure yTmp =
   new DerivativeStructure(1, 2, 0, y.getValue());
 DerivativeStructure fTmp = f.value(yTmp); // only one variable, fast!
 DerivativeStructure z = y.compose(fTmp.getAllDerivatives());
 return z;

Note that if the xi variables are not cardianlity 1 but rather a subset
of a larger set (say x1 is (a, b), x2 is (a, c), ... xn is (a, b, c)),
then the mapping function becomes more complicated. It should use the
compilers of both th initial reduced set and the complete set (in the
example above our compiler instance was only for the complete set), and
it should also create an orders derivatives array for the reducced set
(in the example above, we directly used g.getPartialDarivatives(k)
because in the special case of one parameter only, it is guaranteed that
the index and the derivation order match). The mapping function should
also use some indirection table to know for each variable x1, x2 ...
what is the reduced set that was used to compute the derivatives (in the
example above, we simply used an index argument telling the index of the
single variable in the full set). I don't know if the most general
mapping function would really be interesting or not. If you feel like
contributing such a function, I would be happy to review it.

best regards,

> Any comments on this? ... or errors in this line of though?
> Thanks,
> Ajo.

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message