Section: Scientific Foundations
Automatic Differentiation
Participants: Laurent Hascoët, Valérie Pascual.
- Glossary
- automatic differentiation (AD): Automatic transformation of a program, returning a new program that computes some derivatives of the given initial program, i.e. some combination of the partial derivatives of the program's outputs with respect to its inputs.
- adjoint model: Mathematical manipulation of the Partial Differential Equations that define a problem, obtaining new differential equations that define the gradient of the original problem's solution.
- checkpointing: General trade-off technique, used in the reverse mode of AD, that trades duplicate execution of a part of the program for the memory space that would otherwise hold its intermediate results. Checkpointing a code fragment amounts to running this fragment without storing any intermediate values, thus saving memory space. Later, when such an intermediate value is required, the fragment is run a second time to recompute it.
Automatic or Algorithmic Differentiation (AD) differentiates programs. An AD tool takes as input a source computer program P that, given a vector argument X ∈ R^n, computes some vector result Y = F(X) ∈ R^m, and it generates a new source program that computes some derivatives of F. For any given control, P is equivalent to a sequence of instructions {I_1; I_2; ...; I_p}, which is identified with a composition of vector functions

F = f_p ∘ f_{p-1} ∘ ... ∘ f_1,

where each f_k is the function implemented by instruction I_k. Calling X_0 = X and X_k = f_k(X_{k-1}) the values of all variables after instruction I_k, the chain rule gives the Jacobian of F:

F'(X) = f'_p(X_{p-1}) · f'_{p-1}(X_{p-2}) · ... · f'_1(X_0),

which can be mechanically written as a sequence of instructions. In practice, the full Jacobian F'(X) is often far too expensive to compute and store, so AD instead computes cheaper projections of it, which come in two kinds:
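As an illustration (a minimal hand-written sketch, not the output of an actual AD tool), the fragment below views a two-instruction program as a composition f_2 ∘ f_1 and assembles its Jacobian by the chain rule; the function names `f1`, `f2` and their Jacobians are hypothetical:

```python
import numpy as np

# A tiny "program" P seen as a sequence of instructions I_1; I_2,
# each implementing an elementary vector function f_k.
def f1(x):                      # I_1: (x0, x1) -> (x0*x1, x1)
    return np.array([x[0] * x[1], x[1]])

def f2(x):                      # I_2: (a, b) -> (sin(a) + b,)
    return np.array([np.sin(x[0]) + x[1]])

def f1_jac(x):                  # Jacobian f1'(X_0)
    return np.array([[x[1], x[0]],
                     [0.0,  1.0]])

def f2_jac(x):                  # Jacobian f2'(X_1)
    return np.array([[np.cos(x[0]), 1.0]])

X0 = np.array([2.0, 3.0])
X1 = f1(X0)                     # intermediate values X_1 = f1(X_0)

# Chain rule: F'(X) = f2'(X_1) . f1'(X_0)
F_jac = f2_jac(X1) @ f1_jac(X0)
```

Even in this toy case the intermediate product is matrix × matrix, which is why full Jacobians are rarely computed in practice.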
- Sensitivities, defined for a given direction Ẋ in the input space as:

  F'(X) · Ẋ = f'_p(X_{p-1}) · f'_{p-1}(X_{p-2}) · ... · f'_1(X_0) · Ẋ.

  Sensitivities are easily computed from right to left, interleaved with the original program instructions. This is the tangent mode of AD.
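The tangent mode can be sketched as follows (an illustrative hand differentiation of the same hypothetical two-instruction program, not Tapenade output): each instruction propagates the directional derivative Ẋ_k alongside the values X_k, in the program's own left-to-right order.

```python
import numpy as np

# Tangent mode: propagate the directional derivative xd (for Xdot)
# together with the values x, interleaved with the instructions.
def f1_tangent(x, xd):
    # I_1: (x0, x1) -> (x0*x1, x1), differentiated by the product rule
    y  = np.array([x[0] * x[1], x[1]])
    yd = np.array([xd[0] * x[1] + x[0] * xd[1], xd[1]])
    return y, yd

def f2_tangent(x, xd):
    # I_2: (a, b) -> sin(a) + b
    y  = np.sin(x[0]) + x[1]
    yd = np.cos(x[0]) * xd[0] + xd[1]
    return y, yd

X0, Xdot = np.array([2.0, 3.0]), np.array([1.0, 0.0])
X1, Xdot1 = f1_tangent(X0, Xdot)
Y,  Ydot  = f2_tangent(X1, Xdot1)   # Ydot = F'(X0) . Xdot
```

One run of the tangent program yields the derivative along a single direction Ẋ; a full gradient this way would require n such runs.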
- Adjoints, defined for a given weighting Ȳ of the outputs as:

  F'*(X) · Ȳ = f'*_1(X_0) · f'*_2(X_1) · ... · f'*_{p-1}(X_{p-2}) · f'*_p(X_{p-1}) · Ȳ.

  Adjoints are most efficiently computed from right to left, because matrix × vector products are cheaper than matrix × matrix products. This is the reverse mode of AD, most effective for optimization, data assimilation [28], adjoint problems [23], or inverse problems.
The reverse mode turns out to make a very efficient program, at least theoretically [25]: the computation time required for the gradient is only a small multiple of the run-time of the original program, independently of the number of inputs. However, we observe that the intermediate values X_k are needed by the backward sweep in the inverse of their computation order; if the original program overwrites a part of X_k, the differentiated program must first restore it. Making these values available at an acceptable memory cost is the central problem of the reverse mode, addressed by trade-offs between storage and recomputation such as checkpointing.
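The checkpointing trade-off defined in the glossary can be sketched on an n-step loop (a toy scheme with hypothetical helpers `step` and `step_adjoint`, not the strategy of any particular tool): instead of storing every intermediate state, keep only the state at the start of each half of the loop and recompute a half's states when its backward sweep needs them.

```python
import math

def step(x):                 # one loop iteration: x -> sin(x)
    return math.sin(x)

def step_adjoint(x, xbar):   # transposed derivative: xbar -> cos(x)*xbar
    return math.cos(x) * xbar

def forward(x, steps):
    # Run `steps` iterations without storing intermediate states.
    for _ in range(steps):
        x = step(x)
    return x

def reverse_sweep(x0, steps, xbar):
    # Recompute and store the states of this fragment, then run the
    # backward sweep over them in reverse order.
    states = [x0]
    for _ in range(steps):
        states.append(step(states[-1]))
    for x in reversed(states[:-1]):
        xbar = step_adjoint(x, xbar)
    return xbar

def checkpointed_gradient(x0, n, ybar):
    half = n // 2
    mid = forward(x0, half)                   # checkpoint: only `mid` is kept
    ybar = reverse_sweep(mid, n - half, ybar) # second half, backward first
    return reverse_sweep(x0, half, ybar)      # first half recomputed on demand
```

Peak storage drops from n states to roughly n/2 plus one checkpoint, at the price of re-running the first half of the loop; applying the idea recursively yields the usual logarithmic storage schemes.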
Another research issue is to make the AD model cope with the constant evolution of modern language constructs. Since the days of Fortran 77, novelties have included pointers and dynamic allocation, modularity, structured data types, objects, vectorial notation, and parallel communication. We regularly extend our models and tools to handle these new constructs.