\documentclass[oneside,twocolumn]{book}
\usepackage{amsfonts}

\usepackage{anysize}
%\marginsize{2.3cm}{2.3cm}{2.3cm}{2.3cm}

\usepackage{graphicx}
%\graphicspath{{scans/}}

\usepackage[small]{caption}

\def\rsubsection#1{{\raggedright\subsection*{#1}}}
\def\rsectionshort#1#2{{\raggedright\section[#1]{#2}}}
\def\rsection#1{{\raggedright\section{#1}}}
\def\references{\section*{References}}
\raggedbottom

% Width of figures; to fill one column
\def\picturewidth{7.5cm}
\def\widepicturewidth{15cm}

\def\plot#1#2{
	\begin{figure}[tb] \begin{center}
	\fbox{
	\includegraphics[width=\picturewidth,keepaspectratio,clip]{#1}
	}
	\caption{#2}
	\label{#1}
	\end{center} \end{figure}
}

\def\wideplot#1#2{
	\begin{figure*}[tb] \begin{center}
	\fbox{
	\includegraphics[width=\widepicturewidth,keepaspectratio,clip]{#1}
	}
	\caption{#2}
	\label{#1}
	\end{center} \end{figure*}
}

\hyphenation{anti-auto-morph-ism}
\hyphenation{semi-group}
\hyphenation{sand-wich}

\def\R{{\mathbb R}} 
\def\tr{\mathop{\rm tr}}
\def\fix{\mathop{\rm fix}}
\def\phi{\varphi}

\newcommand{\QED}{\hspace*{2em}\hfill$\bullet$}
\newtheorem{theorem}{Theorem}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{example}[theorem]{Example}
\newtheorem{lemma}[theorem]{Lemma}
\def\be{\begin{enumerate}}
\def\ee{\end{enumerate}}

\newenvironment{proof}[1]{\vspace{-2pt}{\em Proof\/}\quad #1}{\QED\vspace{8pt}}

\def\eqalign#1{\null\,\vcenter{\openup\jot \mathsurround=0pt \ialign{\strut
	\hfil$\displaystyle{##}$&$ \displaystyle{{}##}$\hfil \crcr#1\crcr}}\,}

\def\smallmatrix{\null\,\vcenter\bgroup \baselineskip=7pt \ialign\bgroup
	\hfil$\scriptstyle{##}$\hfil&&$\;$\hfil $\scriptstyle{##}$\hfil\crcr}
\def\endsmallmatrix{\crcr\egroup\egroup\,} 
\def\smatrix#1{\smallmatrix #1 \endsmallmatrix}
\def\spmatrix#1{\big(\smallmatrix #1 \endsmallmatrix \big)}

\def\O{{\cal O}}
\def\N{{\cal N}}
\def\I{{\cal I}}
\def\A{{\cal A}}
\def\Ap{{\cal A}_+}
\def\K{{\cal K}}
\def\id{{\rm id}}
\def\DG{\overline{\nabla}}

\begin{document}

\title{Six Lectures on the Geometric Integration of ODEs}
\author{Robert McLachlan and Reinout Quispel}
\date{July 1998}

\maketitle

\tableofcontents

\chapter{General tools}

{\narrower\small\em \noindent``A major task of mathematics is to harmonize the
continuous and the discrete, to include them in one
comprehensive mathematics, and to eliminate obscurity from both.''
\hfil (E.T. Bell, Men of Mathematics) \break

}

\rsection{Introduction}

Motion is described by differential equations, which are
derived from the laws of physics. In the simplest case,
they read $m {d^2 x\over dt^2} = F(t,x,{dx\over dt})$---Newton's
second law. These equations contain
within them not just a statement of the current acceleration
experienced by the object(s), but {\it all} the physical laws 
relevant to the particular situation. Finding these laws
and their consequences for the motion has been a major
part of physics since the time of Newton. For example,
the equations tell us the space in which the system
evolves (its {\it phase space}, which may be ordinary
Euclidean space or a curved space such as a sphere); any
symmetries of the motion, such as the left--right or forwards--backwards
symmetries of a pendulum; and any special quantities such as
energy, which for a pendulum is either conserved (if there
is no friction) or decreases (if there is friction). 
Finally and most importantly, the laws describe how 
all motions starting close to the actual one are
constrained in relation to each other. These laws are
known as {\it symplecticity} and {\it volume preservation}.

\medskip{\narrower\small\em
\noindent
``A gyroscope is an emissary from a six-dimensional symplectic 
world to our three-dimensional one; in its home world
its behavior looks simple and natural.'' \hfil (Yuri Manin) \break

}
\medskip

Standard methods for simulating motion, called
{\it numerical integrators}, take an initial condition
and move the objects in the direction specified
by the differential equations. They completely ignore
all of the above hidden physical laws contained within
the equations. Since about 1990, new methods have been
developed, called {\it geometric integrators}, which
obey these extra laws. Since this is physically natural,
we can hope that the results will be extremely reliable,
especially for long-time simulations. 

Before I tell you all the advantages, two caveats:
\begin{itemize}
\item The hidden physical law usually has to be known
if the integrator is going to obey it. For example,
to preserve energy, the energy must be known.
\item Because we're asking something more of our method,
it may turn out to be computationally more expensive 
than a standard method. Amazingly (because the laws are
so natural?) sometimes it's actually much cheaper.
\item Many systems have multiple hidden laws, for which
methods are currently known which preserve any one law but not
all simultaneously.
\end{itemize}

Now the advantages:
\begin{itemize}
\item Simulations can be run for enormously long times, because
there are no spurious non-physical effects, such as
dissipation of energy in a conservative system;
\item By studying the structure of the equations, very simple,
fast, and reliable geometric integrators can often be found;
\item In some situations, results can be guaranteed to
be {\it qualitatively} correct, even when the motion is chaotic. 
This allows one to 
study systems in a ``quick and dirty'' mode and explore
the system thoroughly, while retaining reliability;
\item For some systems, even the actual quantitative errors
are much smaller for short, medium, and long times than
in standard methods.
\end{itemize}
Chapter 2 discusses a case where all of these nice
features are realized: the solar system.

The first lecture is about general tools which will
be useful later on, the second is about ``why bother?'',
and the third to sixth lectures are about how to
preserve various specific properties. Naturally, these
lectures are tailored to our own research interests, but
at the end of each lecture is a short list of references
to take you to the wider literature.

These lectures were delivered at ANODE, the Auckland
Numerical ODEs workshop, in July 1998. We are very grateful
to John Butcher for inviting us to speak, to all the organizers
of ANODE, and especially to Nicolas Robidoux for transcribing 
the lectures. ANODE and the authors are supported by the
Marsden Fund of the Royal Society of New Zealand.
The written form was prepared at the MSRI, Berkeley, 
supported in part by NSF grant DMS--9701755.

\rsectionshort{Flows}{The exact flow of an ODE, and general properties
of integrators}
We first define the exact flow (or solution) of an ordinary
differential equation (ODE) and discuss what properties 
one would like an integrator to have.
Let $x(t)$ be the exact solution of the system of
ordinary equations (ODEs)%
\footnote{Nonautonomous
ODEs $dx/dt = f(x,t)$ can be formulated autonomously
as $dx/dt = f(x,x_{m+1})$, $dx_{m+1}/dt = 1$. The geometric
integrator is applied to this ``extended'' system, and then
$t=x_{m+1}$ substituted.}
\begin{equation}
{dx\over dt} = f(x),\quad x(0)=x_0,\quad x\in\R^m.
\end{equation}
The exact flow $\phi_\tau$ is defined by
$$ x(t+\tau) = \phi_\tau(x(t))\quad \forall\ t,\ \tau$$
For each fixed time step $\tau$, $\phi$ is a map from
phase space to itself, i.e. $\phi_\tau:\R^m\to \R^m$.
\rsubsection{Three properties of exact flows}
\begin{enumerate}
\item {\bf (Self-adjointness)}
The flow has the continuous group property
\begin{equation}
\phi_{\tau_1}\circ \phi_{\tau_2} = \phi_{\tau_1 + \tau_2}
\quad\forall \tau_1,\tau_2\in\R.
\end{equation}
In particular,
\begin{equation}
\phi_{\tau}\circ \phi_{-\tau} = Id
\end{equation}
Hence the exact flow is {\em self-adjoint}:
\begin{equation}
\phi_{\tau}= \phi_{-\tau}^{-1}.
\end{equation}
\item {\bf (Taylor expansion)}
$$
x(\tau) = x(0) + \tau {dx\over dt}(0) + \tau^2
{1\over2}{d^2 x\over dt^2}(0) +\dots
$$
Substitute
$$\eqalign{
{dx\over dt} &= f(x) \cr
{d^2 x\over dt^2} &= (df){dx\over dt} = (df)f
}$$
Hence
\begin{equation}
\phi_\tau(x_0) = x_0 + \tau f(x_0) + {1\over2} \tau^2 (df(x_0))
f(x_0) +\dots
\end{equation}
\item {\bf (Formal exact solution)}
\begin{equation}
\eqalign{
\phi_\tau(x) &= e^{\tau\sum_{i=1}^n f_i(x){\partial\over\partial x_i}}(x) \cr
& := \exp(\tau f)(x)\cr
}
\end{equation}
\end{enumerate}

It's impossible to construct integrators 
with the Lie group property
for any reasonably general class of ODEs. The closest
one can come is to preserve self-adjointness.

\rsubsection{Properties of integrators}
In general we don't know the flow $\phi_\tau$, so we seek maps
$\psi_\tau$ that approximate $\phi_\tau$. We call such 
$\psi_\tau$ integrators. Some properties of integrators:
\begin{enumerate}
\item {\bf (Self-adjointness)}
It is useful for $\psi_\tau$ to be self-adjoint, i.e.,
$$\psi_\tau = \psi_{-\tau}^{-1}$$
\item {\bf (Order of an integrator)}
The order of accuracy of $\psi_\tau$ is $p$, if the Taylor
series of $\psi_\tau$ and the exact flow $\phi_\tau$ agree
to order $p$:
$$\psi_\tau(x)-\phi_\tau(x) = {\cal O}(\tau^{p+1})$$
\item {\bf (Consistency)}
A necessary property of $\psi_\tau$ is that it be consistent,
i.e., first order accurate, i.e.,
$$\psi_\tau(x) = x + \tau f(x) + {\cal O}(\tau^2).$$
\end{enumerate}
{\bf Note:} It is not difficult to show that every self-adjoint
integrator is of even order.

There are three types of integrators:
\begin{enumerate}
\item Integrators that form a group
\item Integrators that form a sandwich
\item Integrators that form a semigroup
\end{enumerate}

\rsection{Integrators that form a group}
Suppose we have a set $G$ of integrators which may or
may not be consistent.
If, for all integrators $\psi_\tau$ and $\chi_\tau$ in $G$, we
have 
$$ \psi_\tau\circ \chi_\tau\in G$$
and 
$$ \psi_\tau^{-1}\in G,$$
we say the integrators form a group. That is, they are a group
where the group operation is composition of maps.

Examples of integrators that can form a group:
\begin{enumerate}
\item symplectic integrators (Lecture 3)
\item symmetry-preserving integrators (Lecture 4)
\item volume-preserving integrators (Lecture 5)
\item integral-preserving integrators (Lecture 6)
\end{enumerate}

For example, for the group of integral-preserving integrators
there is a real function $I(x)$ (the {\em integral}) such
that $I(x) = I(\phi_\tau(x))$ for all $x$: the value of
the integral $I$ is preserved by the integrator. Therefore
it is also preserved by $\phi_\tau\circ \chi_\tau$ and
by $\phi_\tau^{-1}$: the integrators form a group.%
\footnote{If $\phi_\tau^{-1}$ exists, which it does for
the methods of Lecture 6, but not necessarily for projection
methods.}

\rsubsection{How to construct integrators that form a group}
The main way to construct integrators that form a group
is through {\em splitting methods}.
Splitting methods work for all cases (1)--(4) above, and
are discussed further in Lecture 3.

We illustrate splitting for integral-preserving integrators.
Assume we don't know an integral-preserving integrator
for the vector field $f$, but $f$ can be split into
two vector fields $f_1$ and $f_2$, each with the same
integral as $f$:
$$ f(x) = f_1(x) + f_2(x)$$
and assume  that we {\em do} know integral preserving
integrators $\psi_1$ (resp. $\psi_2$) for $f_1$ (resp.
$f_2$ separately.) 

Then we obtain an integral-preserving integrator $\psi$
for $f$ by composition:
$$\psi_\tau = \psi_{2,\tau} \circ \psi_{1,\tau}$$
This is a consistent method for $f$, because it is
the map $\psi_\tau:x\mapsto x''$ given by 
$$\eqalign{
x' &= x + \tau f_1(x) + {\cal O}(\tau^2) \cr
x'' &= x' + \tau f_2(x') + {\cal O}(\tau^2) \cr
& = x + \tau(f_1(x) + f_2(x)) + {\cal O}(\tau^2) \cr
}$$
Splitting methods are very easy to program---one merely calls
routines for $\psi_1$ and $\psi_2$ in turn.

Thus the problem becomes:
\begin{enumerate}
\item How to split vector fields while staying in the appropriate class;
\item How to construct integrators in the appropriate group;
\item How to compose those integrators so as to get an integrator
of the original vector field of the desired order.
\end{enumerate}
Each of these will be considered in these lectures.

\rsectionshort{Sandwiches}{Integrators that form a sandwich}
A set $G$ of integrators is called a {\em sandwich} 
if, for all integrators $\psi_\tau$ and $\chi_\tau$ in $G$, we
have 
$$ \psi_\tau\circ \chi_\tau \circ \psi_\tau\in G.$$
Notice that every group also forms a sandwich, but not
vice versa.

Examples of integrators that form a sandwich:
\begin{enumerate}
\item Self-adjoint integrators.
\item Integrators that possess time-reversal symmetry (Lecture 4).
\end{enumerate}
Proof of (1):
Let $\theta_\tau = \psi_\tau \circ \chi_\tau \circ \psi_\tau$. Then
$$\eqalign{
\theta_{-\tau}^{-1} &= \left(\psi_{-\tau}\circ \chi_{-\tau}
\circ \psi_{-\tau}\right)^{-1} \cr
& = \psi_{-\tau}^{-1} \circ \chi_{-\tau}^{-1} \circ \psi_{-\tau}^{-1} \cr
&= \psi_\tau \circ \chi_\tau \circ \psi_\tau \cr
&= \theta_\tau.
}
$$
\rsubsection{How to construct integrators 
that form a sandwich}
There are two main ways:
\begin{enumerate}
\item {\bf (Projection methods)}
If $\psi_\tau$ is any integrator,
then the ``projection'' 
$$\chi_\tau := \psi_{\tau/2}\circ \psi_{-\tau/2}^{-1}$$
is self-adjoint.
\item {\bf (Splitting methods)}
If $f$ can be split into two vector fields
$$ f(x) = f_1(x) + f_2(x) $$
such that we have self-adjoint integrators $\psi_1$ and $\psi_2$
for $f_1$ and $f_2$ separately, then we obtain a self-adjoint
integrator $\psi$ for $f$ from the symmetric composition
$$ \psi_\tau := \psi_{1,\tau/2} \circ \psi_{2,\tau} \circ
\psi_{1,\tau/2}.$$
\end{enumerate}
These are generalized to other sandwiches in Lecture 4.
The projection is almost miraculous, because it starts with
{\it any} integrator. There is no analogous projection for
groups.
\rsectionshort{Semigroups}{Integrators that form a semigroup}
A set $G$ of integrators forms a semigroup if
for all integrators $\psi_\tau$ and $\chi_\tau$ in $G$, we
have $\psi_\tau\circ \chi_\tau\in G$, but not necessarily
$\psi_\tau^{-1}\in G$. 

These arise from properties that only hold for forwards time:
\begin{enumerate}
\item systems with a Lyapunov function (Lecture 6)
\item systems which contract phase space volume
\end{enumerate}
For example, if the Lyapunov function is decreasing as
$t$ increases, it is {\em increasing} as $t$ decreases,
and even the flow $\phi_\tau^{-1}=\phi_{-\tau}$ does
not have the Lyapunov property.
This means one cannot use backwards time steps when composing
these integrators, which can be proved to limit the order of composition
methods to 2.

\rsectionshort{Composition methods}{Creating higher order 
integrators: composition methods}

Having obtained a geometric integrator $\psi_\tau$, a higher
order method can be obtained from the composition
$$\eqalign{\chi_\tau = 
\psi_{\alpha_n \tau}^{}
\psi_{-\alpha_{n-1} \tau}^{-1}
&\psi_{\alpha_{n-2} \tau}^{}
\psi_{-\alpha_{n-3} \tau}^{-1}
\dots \cr
&\dots \psi_{\alpha_{n-3} \tau}^{}
\psi_{-\alpha_{n-2} \tau}^{-1}
\psi_{\alpha_{n-1} \tau}^{}
\psi_{-\alpha_n \tau}^{-1}
}$$
which has been chosen to be self-adjoint, i.e.
$\chi_\tau = \chi_{-\tau}^{-1}$.
Here the number of integrators $n$ and the coefficients
$\alpha_n$ can be adjusted to obtain the desired order.

\begin{example}\rm
If $\psi$ is self-adjoint, then a fourth-order integrator
is obtained as follows:
$$ \chi_\tau = \psi_{\gamma\tau}\circ
\psi_{(1-2\gamma)\tau}\circ \psi_{\gamma\tau}$$
where $\gamma := (2-2^{1/3})^{-1}$.
\end{example}

This example is generalized in Theorem \ref{thm2} below.
Note that $1-2\gamma<0$. 

\rsubsection{An example of composition methods: the generalized
Yoshida method}

\begin{theorem}[Yoshida, Qin, and Zhu]\label{thm2}Let $\psi$ be
a self-adjoint integrator of order $2n$. Then
$$ \chi_\tau = \psi_{\gamma\tau}\circ
\psi_{(1-2\gamma)\tau}\circ \psi_{\gamma\tau},\quad
\gamma = (2-2^{1\over 2n+1})^{-1}$$
is a self-adjoint integrator of order $2n+2$.
\end{theorem}
\begin{proof}
Let
$$\psi_\tau(f) = \phi_\tau(f) + \delta \tau^{2n+1}+\dots.$$
Then, using the flow property of the exact flow $\phi_\tau$,
$$\eqalign{
 \psi_{\gamma\tau}&\circ 
\psi_{(1-2\gamma)\tau}\circ \psi_{\gamma\tau}
= \phi_\tau(f) + \cr
& \left(\gamma^{2n+1} + 
(1-2\gamma)^{2n+1} + \gamma^{2n+1}\right)
\delta \tau^{2n+1} + \dots\cr
}$$
which has order $2n+1$ if $\gamma$ is as given in the theorem.
However, it is self-adjoint by construction, so it has even
order, hence the order is $2n+2$.
\end{proof}

\references

{\small
\rsubsection{Books}
\begin{enumerate}
\item[1.] A.M. Stuart and A.R. Humphries, {\em Dynamical
Systems and Numerical Analysis}, CUP, 1996.
\item[2.] J.-M. Sanz-Serna and M.-P. Calvo, {\em Numerical
Hamiltonian Problems}, Chapman Hall, 1994.
\end{enumerate}
\rsubsection{Survey articles}
\begin{enumerate}
\item[3.] J.-M. Sanz-Serna, Geometric Integration, in 
{\em The State of the Art in Numerical Analysis}, I.S. Duff
and G.A. Watson, eds., Clarendon Press, Oxford, 1997, pp. 121--143.
\item[4.] G.R.W. Quispel and C. Dyt, Solving ODE's numerically 
while preserving symmetries, Hamiltonian structure, phase space
volume, or first integrals, in {\em Proc. 15th IMAC World Congress,
vol. 2}, A. Sydow, ed., Wissenschaft \& Technik, Berlin, 1997, 
pp. 601--607.
\end{enumerate}
\rsubsection{Composition methods}
The best place to start is
\be
\item[5.] H. Yoshida, Construction of higher order symplectic integrators,
{\em Phys. Lett. \bf 150A} (1990), 262--268
\ee
which was generalized in
\be
\item[6.] M. Qin and W.J. Zhu, Construction of higher order symplectic
schemes by composition, {\em Computing \bf 27} (1992) 309--321
\ee
and further in 
\be
\item[7.] R.I. McLachlan, On the numerical integration of ODE's by
symmetric composition methods, \em SIAM J. Numer. Anal. \bf 16
\rm (1995) 151--168.
\item[8.] A. Murua and J.M. Sanz-Serna, Order conditions
for numerical integrators obtained by composing simpler
integrators, {\em Phil. Trans. Roy. Soc. A}, 1999.
\ee
}

\chapter{Why preserve structure?}

\rsection{Introduction}

I'll start with an example of a simulation of the outer
solar system by Jack Wisdom and coworkers.
I like this example because there is such a long history of modeling
the solar system. The people who do this are not from the numerical
analysis community but they have their own history 
of methods which they have
developed and tweaked.

In the 1980's, a special-purpose supercomputer, the ``Digital
Orrery'', simulated the outer planets for 845 million years.
With a lot of tweaking, an energy error of about $10^{-9}$
was achieved with a time step of 45 days (a six month
calculation!).
A calculation with a very high order symmetric multistep method
achieved an energy error of about $10^{-10}$ in a 3 million
year simulation, with a time step of 0.75 days. In a completely
different approach, Laskar (1990) used classical perturbation
theory (expanding in mass ratios and eccentricities about 
circular orbits) to eliminate the fast (annual) frequencies. This
required 250,000 terms, but a time step of 500 years could be
taken.

All of these attempts were roundly routed by the
calculation of Jack Wisdom et al., using a very simple, elegant
symplectic integrator. Their billion year simulation
with a time step of 7.5 days gave an energy error
of only $2\times 10^{-11}$. Moreover, only one force evaluation
was used per time step, making the method very fast.

\plot{wisdom1.ps}{Energy error of leapfrog applied to the whole
solar system over $10^8$ years (Wisdom et al.)}

\plot{wisdom2.ps}{Energy error after application of corrector $\chi_\tau$.}

Roughly speaking, they wrote the ODE 
as a sum of uncoupled Kepler 2-body problems and the
potential which couples the planets:
$f = f_1 + f_2 = f_{\rm Kepler} + f_{\rm coupling}$.
Each $f_i$ is a Hamiltonian system, and the flow
$\phi_\tau(f_i)$ of each can be found exactly and quickly
(the 2-body problems using an elegant method of Gauss).
The time stepping is simply the simplest composition
$\psi_\tau = \phi_\tau(f_2)\circ\phi_\tau(f_1)$---a form
of the ``leapfrog'' method. Since the flow of Hamiltonian 
ODEs is symplectic,
and symplectic maps form a group, $\psi_\tau$ is symplectic.
Moreover, they found a  ``corrector'' $\chi_\tau$ such that
$$\chi_\tau\circ \psi_\tau\circ \chi_\tau^{-1} = \phi_\tau + {\cal O}
(m^2 \tau^3)$$
where $m\approx |f_2/f_1|\approx 10^{-3}$ is the mass ratio
between Jupiter and the sun. 
(The result after $n$ time steps is
$ \chi_\tau \circ \psi_\tau^n \circ\chi_\tau^{-1}$, so that 
$\chi_\tau$ only needs to
be evaluated once, no matter how long the simulation.)
This method:
\begin{itemize}
\item is symplectic;
\item is one-step;
\item is explicit;
\item is second order;
\item uses one force evaluation per time step;
\item exploits classical analysis, namely the exact solution of the
2-body problem;
\item preserves total linear and angular momentum;
\item is self-adjoint and reversible;
\item has an extra factor of $m^2=10^{-6}$ in its local truncation
error, compared to classical methods;
\item for moderate times ($\approx 2\times 10^7$ years), 
has linear growth of global errors, 
compared to quadratic growth for classical methods;
\item has bounded energy errors for long times.
\end{itemize}

This is almost a dream situation, where geometric integration has
lead to a simple method with vastly improved local (time $\tau$), 
global (time $T$), and structural (time $\infty$) errors.
This calculation discovered chaos in the outer solar system with
a Lyapunov time, the time for the separation between nearby orbits 
to grow by
a factor $e$, of 20 million years. Over the billion year
calculation, they would separate by $e^{50}\approx 10^{22}$, and
integration errors would be magnified by this amount also. 
Thus, the final angular positions of the planets are not expected
to be accurate. However, we can be confident that the qualitative
or statistical properties of the solution are correct.

\plot{wisdom3.ps}{Inclination of Pluto over $10^9$ years, showing chaos.
Even after $10^9$ years the inclination has reached a new maximum.}

\rsection{Phase space and phase flow}
The fundamental idea to keep in mind is to think in phase space.
It's a simple idea but one which you have to keep reminding yourself
of. I certainly remember how long it took me to get the idea of phase
space. But, considering that differential equations were studied
for 200 years before Poincar\'e adopted this point of view, perhaps
I shouldn't be too hard on myself.

\medskip
{\narrower\small
\noindent
\em ``Consider the fluid molecules which initially form a certain figure
$F_0$; when these molecules are displaced, their ensemble forms a new
figure which will be deformed in a continuous manner, and at the instant
$t$ the envisaged ensemble of molecules will form a new figure $F$.''

{\hfill (Poincar\'e, {\em Celestial Mechanics,} 1899)}

}

\medskip

In a {\em trajectory} $\phi_t(x_0)$, one thinks of the initial
condition $x_0$ as fixed, and the time $t$ increasing;
in the {\em flow map} $\phi_\tau(x)$, one thinks of 
all initial conditions $x$ flowing forward
for some fixed time $\tau$.
We'll only consider one-step methods, so that
the numerical approximation for one time-step $h$ is a map
$$\psi_\tau: \R^m \to \R^m.$$

Now 
classical approximation theory, e.g. for Runge-Kutta methods, shows
that chaos always wins: the best bound that can be obtained in 
general for a method of order $p$ is 
$$\left| \psi_\tau^{T/\tau}(x) - \phi_T(x) \right|
\le (\Delta t)^p C {e^{\Lambda T}-1 \over \Lambda}$$
$$\Lambda = f(\hbox{\rm Lipschitz constant, method})$$
But dynamical systems theory teaches that 
$\psi$ can be ``close'' to $\phi$ in other ways.

\rsubsection{The pendulum: theory}
Systems can have many geometric or structural properties.
Before we get into definitions, let's look at the planar
pendulum. It is a two-dimensional system with phase space
$\R^2$, and dynamics
\begin{equation}
\label{pendulum}
\dot q = p,\quad \dot p = -\sin q
\end{equation}
where $q$ is the angle of the pendulum, and $p$ its angular
momentum. (Here we are taking $q\in \R$, the covering
space of the actual angle.)
Here are some of the properties of the pendulum:
\begin{itemize}
\item It conserves the total energy $\dot H = {1\over2}p^2 -\cos q$.
That is, its flow stays on the level sets of this function. Because
this is a two-dimensional system, these level sets are curves in
the plane.
\item Being a Hamiltonian system, its flow is symplectic. For
two-dimensional systems, this is equivalent to being area-preserving.
\item It has one discrete symmetry and one discrete reversing
symmetry (see Lecture 4). 
The symmetry, $q\mapsto -q$, maps the vector field
into itself; the reversing symmetry, $p\mapsto -p$, maps
the vector field into minus itself. Imagining flowing along
one of the solution curves, you can see that the motion of
the reflected points is constrained.
\end{itemize}

\wideplot{cats.ps}{Phase portrait and flow of the pendulum (from 
Hairer and Wanner). The area of each cat is preserved in
time, the manifestation of symplecticity. Energy, whose levels
sets are the curves shown, is preserved. Flipping left-right
($q\mapsto -q$) is a symmetry, while flipping up-down
($p\mapsto -p$) is a reversing symmetry.}

Because this is such a simple system, preserving any of
these three properties gives a geometric integrator with
good long-time behavior for almost all initial conditions.
A picture of its phase portrait will look very similar to
the true phase portrait. 
By contrast, standard methods (e.g. Euler's method) destroy
the qualitative phase portrait completely.


\rsection{Philosophy of geometric integration}
\label{sec:philosophy}

In any numerical study, one should
\begin{itemize}
\item examine any geometric or structural properties of the ODE or its
  flow;
\item design numerical methods which also have these structural
  properties; and
\item examine the consequences, hopefully over and above the immediate
  ones.
\end{itemize}

This encourages us to
\begin{itemize}
\item confront questions of phase space and degrees of freedom;
\item think about the significance of local, global, and qualitative
  errors; and
\item think about the kinds of tools and functions allowed in
  numerical analysis.
\end{itemize}

For example, multistep methods do not define a map on phase
space, for more than one initial condition is required. They
can have geometric properties, but in a different (product) phase
space, which can alter the effects of the properties. (See Fig. 
\ref{pend5.ps}.)
This puts geometric integration firmly into the ``single step'' camp.
If a system is defined on a sphere, one should stay on that
sphere: anything else introduces spurious, non-physical degrees
of freedom.

The {\em direct} consequences of geometric integration are that we are
\begin{itemize}
\item studying a dynamical system which is close to the
  true one, and in the right class; and
\item this class may have restricted orbit types,
  stability, and long-time behavior.
\end{itemize}

In addition,
because the structural properties are so natural, some
{\em indirect} consequences have been observed. For example,
\begin{itemize}
\item symplectic integrators have good energy behavior;
\item symplectic integrators can conserve angular momentum
  and other conserved quantities;
\item geometric integrators can have smaller local
  truncation errors for special problems, and smaller global
  truncation errors for special problems/initial conditions
  (even though they're larger in the ``generic'' case);
\item some problems (particle scattering, isospectral
  problems) can have errors tending to zero at long times.
\end{itemize}

\plot{ch-sc2.ps}{Phase portrait of a symplectic integration,
from Channell and Scovel.
$10^5$ time steps for 10 different initial conditions are shown.
Smooth curves (``KAM tori'') correspond to regular, quasiperiodic
motion; clouds correspond to chaotic motion.}

Here's a pictorial survey showing what you can expect from
geometric integration.
Fig. \ref{ch-sc2.ps} appears in Channell and Scovel [3], 
one of the first symplectic integration papers. Orbits starting
on the smooth curves (``invariant circles'') stay on them 
forever. 
When I saw these pictures, I just could not believe this; the
idea that errors grow and, once committed, cannot be undone, was
deeply ingrained. Of course, the orbit may be going around the
circle at the wrong speed. But the ``orbital error'' does not
grow in time.

Other orbits in Fig. \ref{ch-sc2.ps} are chaotic, 
and their position errors grow
exponentially. But, they can never jump across the invariant circles,
and because it's the {\em right kind} of chaos (namely, the
solution of some nearby Hamiltonian system), statistical
observations of this chaos will have small errors.

\rsection{Types of geometric properties}
Study the list in the Table.
The left hand column gives properties
of vector fields, and the right hand column gives the corresponding
properties of their flow. It's the right hand property that
must be preserved by the integrator. 
Usually the flow properties are named the same as the ODE property.

\begin{table*}[t]
\begin{center}
\begin{tabular}{|ll|l|}
  \hline && \\
  ODE $\dot x = f(x)$ && flow $\phi_t$, derivative $d\phi_t$ \\
  && \\
  \hline
  && \\
  Hamiltonian & $f = J \nabla H(x)$, $J=\left({0\ I\atop -I\ 0}\right)$
  & $d\varphi^T J d\varphi = J$ (symplectic) \\
  Poisson & $f = J(x)  \nabla H(x)$ 
  & $d\varphi^T J d\varphi = J\circ \varphi$ \\
  source-free & $\nabla\cdot f = 0$ 
  &  $\det d\varphi = 1$ (volume preserving)\\
  symmetric & $dS.f = f \circ S$ 
  & $S\circ\varphi  = \varphi\circ S $ \\
  reversible & $-dR.f = f \circ R$ 
  & $R\circ\varphi^{-1} = \varphi\circ R$ \\
  Lie group & $f = a(x)x$,
  $x\in G$, $a\in {\mathfrak g}$
  & $\phi\in G$ \\
  isospectral & $f =  [b(x),x]$, $x$, $b\in {\mathfrak g}$
  & eigenvalues $\lambda(x)$ constant \\
  integral & $f\cdot \nabla I = 0$ 
  & $I(x(t))=I(x(0))$ \\
  dissipative & $f\cdot \nabla V\le 0$ 
  & $V(x(t))\le V(x(0))$ \\
  && \\
  \hline
\end{tabular}
\label{table1}
\caption{Special classes of ODEs, and the corresponding
properties of their flows.}
\end{center}
\end{table*}

(The standard example of a {\em Lie group} $G$ is the set of
orthogonal $3\times3$ matrices, $A A^T=I$, which represent
rotations. Its Lie algebra $\mathfrak g$ is the set of antisymmetric 
$3\times3$ matrices, which has the great advantage of
being a linear space.)


To bring some order to this table, consider the following features.
\begin{itemize}
\item Is the structure linear in some sense? \hfil\break
All of the ODE properties are linear in $f$, but all of
the flow properties are nonlinear in $\phi$, except for
symmetries. Symplecticity, Poisson, and reversibility 
are quadratic; volume preservation and isospectrality
are degree $m$ when $x\in \R^m$.
\item Does the structure appear explicitly or implicitly in the ODE?
\hfil\break
Hamiltonian, Poisson, Lie group, and isospectral ODEs are 
explicit (e.g. $f=J\nabla H$
generates {\em all} Hamiltonian ODEs); the rest are implicit---there
are side conditions which $f$ has to satisfy.
\item Does the flow property depend on $\varphi$ or $d\varphi$?
\hfil\break
Symplecticity, Poisson, and volume preservation depend on 
the Jacobian $d\phi$. This makes them hard to preserve.
\end{itemize}
These will be explored further in the other lectures.
Briefly, it is easier to work on {\em linear} and 
{\em explicit} properties, so we concentrate on bringing
them into this form.

A major justification for geometric integration comes from
{\em backward error analysis}. This theoretical tool
writes the integrator $\psi_\tau$ as the time-$\tau$ 
flow of {\em some} vector field $\tilde f$, 
i.e. $\psi_\tau(f) = \phi_\tau(\tilde f)$.
If the method is of order $p$, we have $\tilde f = f + {\cal O}(\tau^p)$.
Then, in many cases one can argue that since $\psi_\tau$ is in 
some class (e.g. symplectic), the perturbed vector field must
be in the appropriate class too (e.g. Hamiltonian). So we know
that by studying the dynamics of the method, we are at least
studying dynamics in the right class. The reliability of the results
then depends on the ``structural stability'' of the original system:
a difficult problem, but a standard one in dynamical systems.

In the Hamiltonian case, $\tilde f = J \nabla \widetilde H$ for
some Hamiltonian $\widetilde H$, which is conserved by the method.
Since we don't know $\widetilde H$ and can only measure the original
energy $H$, it will be seen to oscillate, but (if the levels sets
of $H$ and $\widetilde H$ are bounded) not drift away from its
original level.

Technically, one suspends the map $\psi_\tau$ to a time-dependent
flow $\phi_\tau(\tilde g(x,t))$, from which, when $\psi_\tau$
is analytic, nearly all the time dependence can be removed
by a change of variables, giving $\tilde f(x) + {\cal O}(e^{-1/\tau},t)$.
This is inevitable, because most maps, even those close to the identity,
are not actually flows. If the time step is too large these exponentially
small terms can actually pollute the calculation, and one observes,
for example, the energy drifting.

\rsection{Miscellaneous topics}

Some other branches of geometric integration are
\begin{itemize}
\item {\em ODEs on manifolds}, such as homogeneous spaces. 
Although ultimately one can only compute in a linear space, it's
best to formulate the method on the manifold and transfer to
coordinates as late as possible.
\item {\em Mapping methods} approximate the equations in $x$ as
well as in $t$, for example, by Taylor series. Maps defined by
series can then be manipulated analytically.
\item When evaluating {\em Lyapunov exponents} one should
try to preserve their structure, e.g., that the Jacobians
used are symplectic or volume-preserving.
\item For {\em partial differential equations} one can either
discretize in space first, seeking a finite-dimensional version
of, e.g., the Hamiltonian structure, or discretize space-time
directly.
\item One can discretize phase space itself and study
{\em lattice maps}, a form of cellular automata. This
has been used in studies of the effect of roundoff error.
\item Instead of trying to construct special methods that preserve
particular properties, one can see how well standard methods
do. Usually the property has to be fairly robust, e.g., dissipation
of the type $d|x|^2/dt<0$ for $|x|>R$ is studied, 
instead of $dV/dt\le 0$ for all $x$. This approach is thoroughly
treated in Stuart and Humphries, {\em Dynamical Systems and 
Numerical Analysis}.
\end{itemize}

\rsection{Growth of global errors}

The global error is $\psi_\tau^{T/\tau}(x)-\phi_\tau(x)$
where $T$ is a large, but fixed, time.
Geometric  integrators are not expressly designed to control
the global error. Nevertheless,
sometimes it grows linearly in a symplectic integrator and
quadratically in a standard integrator. This will make
the symplectic integrator superior if $T$ is large enough.

This property has been observed in many systems of different 
types. It is associated with preservation of {\em invariant tori}
by the method. An invariant torus is a subset of initial
conditions, topologically a torus, which orbits starting on
stay on for all forwards and backwards time.  A torus
is {\em preserved} if the integrator has an invariant torus
of its own, which tends to the torus of the ODE as $\tau\to0$.

\plot{arnold1.ps}{Flow on a family of invariant tori. From V.I. Arnol'd,
Small denominators and problems of stability of motion in
classical and celestial mechanics, 
{\em Uspehi Mat. Nauk (Russ. Math. Surv.) \bf 18} (1963) no. 6 (114) 91--192.}

\rsubsection{Invariant tori}
Invariant tori are ubiquitous in dynamics. They're found in:
\begin{itemize}
\item Hamiltonian systems (dimension $n/2$);
\item reversible systems (when orbits intersect the symmetry plane; 
dimension often $n/2$);
\item volume-preserving systems (any dimension $<n$).
\end{itemize}
They are important because they
\begin{itemize}
\item form positive-measure families of neutrally stable orbits, which
\item mostly persist under small perturbations of the system;
\item form ``sticky sets,'' dominating behavior of nearby orbits on
  intermediate time scales
\end{itemize}
Nearby orbits diverge like
\begin{itemize}
\item ${\cal O}(1)$ on same torus
\item ${\cal O}(T)$ on a nearby or perturbed torus
\item ${\cal O}(T^2)$ if ${\cal O}(T)$ drift across tori
\item ${\cal O}(T,e^{\lambda T})$ on nearby chaotic orbits;
$\lambda$ depends on order of resonance, can be very small.
\end{itemize}
Therefore, in an integrator we should try to
preserve tori {\em of the correct dimension}.
In a standard method, they are not preserved, and orbits
drift transversely, leading to ${\cal O}(T^2)$ growth
of global errors. If the torus is preserved, orbits only
move around the torus at a slightly wrong angle or speed,
leading to ${\cal O}(T)$ errors.

It turns out to be an extraordinarily subtle question to
determine when which tori persist under which perturbations. Finally,
in the 1960's, conditions were found by Kolmogorov, Arnol'd, and
Moser under which most tori do persist under appropriate
perturbations, although some are destroyed. This forms the
subject of KAM theory.

\plot{arnold2.ps}{Cross-section of the tori in Fig 2.6 after 
perturbation (Arnol'd). Some are destroyed and replaced by
chaos, some persist.}

For Hamiltonian systems, an appropriate perturbation is
Hamiltonian, so the results apply to symplectic integrators.

In between invariant tori, or if tori were destroyed by
taking too large a time step, orbits can be chaotic.
But, because of the nearby tori, exponential separation
can be very slow, and the linear error growth can
dominate for long times. 

\rsection{The pendulum: numerical experiments}

\plot{pend1.ps}{1000 times steps of symplectic leapfrog
applied to the pendulum, time step $\tau=0.1$.}
\plot{pend2.ps}{As in Fig. \ref{pend1.ps}, but $\tau=1$.}
\plot{pend3.ps}{A nonsymmetric symplectic integration
of the pendulum, $\tau=0.1$.}
\plot{pend4.ps}{$10^6$ time steps of leapfrog at
$\tau=1$, showing a chaotic orbit.}
\plot{pend5.ps}{A symplectic multistep method: the tori
have dimension 2 instead of 1 as in Figs. \ref{pend1.ps}--%
\ref{pend3.ps}.}

We illustrate the above points on the simplest meaningful
example, the pendulum (Eq. (\ref{pendulum})). The simplest
symmetric, reversible, self-adjoint symplectic method is leapfrog:
$$\eqalign{
q' &= q + {1\over2}\tau p \cr
p' &= p - \tau \sin q' \cr
q'' &= q' + {1\over2}\tau p' \cr
}$$
The results of this method are shown in Fig. \ref{pend1.ps}
for a small time step ($\tau = 0.1$) and in Fig. \ref{pend2.ps}
for a much larger time step ($\tau=1$). Even for the larger
time step, the left-right and up-down symmetries are preserved,
as are most of the invariant circles, as promised by KAM theory
for symplectic integrators. Chaos is significant only in a small
neighborhood of the homoclinic orbit connecting $(\pi,0)$ and
$(-\pi,0)$.

A symplectic method which is {\it not} symmetric or
self-adjoint is shown in Fig. \ref{pend3.ps}; the lack
of symmetry is plain to see. In this case,
invariant circles are still preserved. In higher-dimensional
systems, there is a more complicated interaction between
symplecticity and reversibility.

What is the effect of the chaos created by the numerical
integrator? Fig. \ref{pend4.ps} shows one chaotic orbit
of leapfrog at the large time step $\tau=1$, obtained
with initial condition $(q,p)=(0,1.8)$. It was found
to have a large Lyapunov exponent of $10^{-2}$. By
$T\sim100$, the chaos would dominate the numerical errors.
by contrast, with initial condition $(q,p)=(0,1.6)$, 
the Lyapunov exponent is already reduced to $10^{-7}$, and
phase errors (moving around the circle at the wrong speed)
would dominate until $T\sim 10^7$. Thus, even when KAM theory
does not strictly apply, preservation of {\em some} invariant
tori helps a great deal.

In Section \ref{sec:philosophy}, we talked about the importance
of staying in the right phase space. The multistep
method $x_{n+1} = x_{n-1} + 2\tau f(x_n)$ is a map on the
{\it product} phase space $\R^2\times \R^2$. It can be
shown to be symplectic in this larger space, but its KAM
tori have dimension 2, instead of 1 as in the real system.
When projected to the original phase space, they fill
out a solid region, instead of a curve---a disaster for
long-time simulations. This effect
is illustrated in Fig. \ref{pend5.ps}.

\rsection{Summary}

Systems may have many geometric or structural features.
Integrators must balance costs, local, global, 
and long-time errors, stability, 
and	structural preservation. You can't expect
to do well at all of these simultaneously! Also, numerical
studies can have different goals. Demanding very small local
errors for a large class of ODEs tilts the balance in favor
of highly-developed standard methods; 
seeking reliability over long
times with simple, fast methods tilts in favor of geometric
integrators.

The remaining lectures look at preserving different properties.
Here we sum up what is known about preserving several properties
at once.
\begin{enumerate}
\item Symplecticity and energy: If, by ``integrator'', we mean
that the method is defined for {\it all} Hamiltonian ODEs,
then by a theorem of Ge, this is impossible. For exceptional
(``completely integrable'') problems, such as the free rigid body,
this can be done.
\item Symplecticity and integrals apart from energy: Not known,
although doable in principle.
\item Symplecticity and linear symmetries: Achieved by, e.g.,
the implicit midpoint rule.
\item Poisson and linear symmetries: Not known.
\item Volume preservation and linear symmetries: Not known. 
\item Integrals and linear symmetries: Sometimes possible using
the Harten, Lax and Van Leer discrete gradient (see Lecture 6).
\item Volume and an integral: For all systems with some integrals,
or some systems with any integrals, can be done by splitting. Not known
in general.
\end{enumerate}

Extending the concept of geometric integration to PDEs is 
much less developed. Work has been done, e.g.,  on integral
preservation [7], symmetry preservation [8], and Lagrangian
(variational) structure [9].

\references
{\small
\rsubsection{Background on numerical ODEs}
\be\item[1.] E. Hairer, S. N\o rsett, and G. Wanner, {\em Solving Ordinary 
Differential Equations I: Nonstiff Problems}, 2nd ed., Springer, 1993.
\ee
\rsubsection{Historical illustrations}
\be\item[2.] J. Wisdom and M. Holman, Symplectic maps for the $N$-body problem,
{\em  Astron. J. \bf 102} (1991), 1528--1538;
J. Wisdom, M. Holman, and J. Touma,
Symplectic correctors, in {\em Integration Algorithms for Classical
Mechanics}, Fields Institute Communications {\bf 10}
(1996), 217--244.  
\item[3.] P.J. Channell and J.C. Scovel, 
Symplectic integration of Hamiltonian systems,
{\em Nonlinearity \bf 3} (1990), 231--259.
\ee
\rsubsection{Equations on manifolds}
\be \item[4.] C.J. Budd and A. Iserles, Geometric integration:
Numerical solution of differential equations on manifolds,
{\em Phil. Trans. Roy. Soc.}, to appear.
\ee
\rsubsection{Backward error analysis, invariant tori, and error growth}
\be \item[5.] E. Hairer and Ch. Lubich, 
The life-span of backward error analysis for numerical integrators,
{\em Numer. Math. \bf 76} (1997), 441--462. 
\item[6.] B. Cano and J.M. Sanz-Serna, Error growth in the numerical
integration of periodic orbits, with application to Hamiltonian
and reversible systems, {\em SIAM J. Numer. Anal. 34} (1997)
1391--1417.
\ee
\rsubsection{PDEs}
\be
\item[7.] R.I. McLachlan, Spatial discretization of partial
differential equations with integrals, {\em SIAM J. Numer. Anal.},
submitted.
\item[8.] V. Dorodnitsyn, Finite difference methods entirely
inheriting the symmetry of the original equations, in N. 
Ibragimov, ed., {\em Modern Group Analysis: Advanced
Analytical and Computational Methods in Mathematical Physics},
Kluwer, 1993, pp. 191--201; C.J. Budd, G.J. Collins, W.Z.
Huang, and R.D. Russell, Self-similar numerical solutions of
the porous medium equation using moving mesh methods, {\em
Phil. Trans. Roy. Soc. A}, 1999.
\item[9.] J.E. Marsden, G.W. Patrick, and S. Shkoller,
Multisymplectic geometry, variational integrators, and
nonlinear PDEs, {\em Comm. Math. Phys.}, to appear.
\ee
}

\chapter[Symplectic integrators]{Symplectic integrators: A 
		case study of the molecular dynamics of water.}
{\narrower\small
\noindent
\it ``Chemistry is a science, but not Science; for the criterion of true science
lies in its relation to mathematics''\hfil (Kant)\break

\noindent 
``Chemistry will only reach the rank of science when it shall be found possible
to explain chemical reactions in the light of their causal relations to the
velocities, tensions and conditions of equilibrium of the constituent molecules;
that the chemistry of the future must deal with molecular mechanics by the
methods and in the strict language of mathematics, as the astronomy of Newton
and Laplace deals with the stars in their courses'' \hfil (Du Bois Reymond)\break

}

This quote (from D'Arcy Thompson's {\em
On Growth and Form}) could not be more apt: symplectic
integrators, developed to deal with the stars in their courses,
are now applied to the velocities of molecules. 

There are many fine surveys of symplectic integration, so here
I'll talk about {\em Poisson}, or noncanonical Hamiltonian
systems, and how they arose in a study of water. Water, 
the ``king of polar fluids,'' has many strange phases and
anomalous properties, which statistical mechanics has a hard
time explaining. Therefore people turn to numerical simulations.

\rsection{Splitting}
Recall the problem of splitting---how can we write 
$f=f_1+f_2$ so that the $f_i$ retain some
properties of $f$? This can be done for Hamiltonian
systems by splitting the Hamiltonian. Look at
Table 2.1: Hamiltonian systems are
expressed explicitly. Therefore, the idea is to represent
all $f$ in the given class explicitly, by 
a ``generating function.'' Then we split the generating function.

\begin{example} Hamiltonian systems. \rm 
The generating function is the Hamiltonian $H$.
$$ f = J\nabla H = J\nabla\Big(\sum_i H_i\Big) = J\nabla H_1 +\dots + 
J\nabla H_n.$$
Properties due to $J$, which is not split, are retained---symplecticity.
Properties due to $H$, which is split, are lost---conservation of $H$.
\end{example}

\begin{example} Systems with an integral. \rm
The generating function is the skew-symmetric matrix function $J$.
$$ f = J\nabla H = \Big(\sum_i J_i\Big)\nabla H = J_1\nabla H + \dots + J_n \nabla H$$
Properties due to $J$, which is split, are lost---symplecticity.
Properties due to $H$, which is not split, are retained---conservation of $H$.
\end{example}
We'll return to systems with an integral in Lecture 6, and see
how to apply splitting to volume-preserving systems in Lecture 4.

\rsection{Poisson systems}
Consider a standard, canonical Hamiltonian system.
$$ \dot x = J \nabla_x H(x),\quad J = \left(\matrix{0 & I \cr -I & 0 }\right)$$
In new variables $y=g(x)$, this becomes
$$\eqalign{
\dot y & = dg\cdot \dot x \cr
&= dg\cdot J\nabla_x H(x) \cr
& = dg\cdot J\cdot dg^T \nabla_y H(x) \cr
& = \widetilde J(y)\nabla_y \widetilde H(y),\cr
}$$
a ``Poisson system,''
where 
$$\eqalign{ \widetilde J & = dg \cdot J \cdot (dg)^T \cr
\widetilde H(y) & = H(x). \cr}$$
But the class of Poisson systems in invariant under changes of
variables. Since the history of mathematics and of physics is
a history of requiring invariance under more operations, 
it seems we should
study Poisson systems in their own right.

(There are many other motivations, from PDEs, systems
on Lie groups and other manifolds, and symmetry reduction.) 

An important special case are the ``Lie''-Poisson systems.
Let $x\in \R^n$ be an element of a Lie algebra.
Let $[x_i,x_j] = \sum_{k=1}^n c_{ij}^k x_k$ be the Lie bracket.
Let $J_{ij} = [x_i,x_j]$, so that the entries of $J$ are linear functions of $x$.
Then 
$$\dot x = J(x) \nabla H(x)$$
or
$$ \dot x_i = 
\sum_{j,k} c_{ij}^k x_k {\partial H\over \partial x_j}$$
is called a Lie-Poisson system.

\begin{example} The free rigid body in $\R^3$. \rm The variables are $\pi_1$, $\pi_2$,
$\pi_3$, the angular momenta of the body in body-fitted coordinates.
$$ J = \left(\matrix{ 0 & \pi_3 & -\pi_2 \cr -\pi_3 & 0 & \pi_1 \cr
\pi_2 & -\pi_1 & 0 } \right)$$
$$ H = {1\over2}\left( {\pi_1^2\over I_1} + {\pi_2^2\over I_2} +{\pi_3^2\over I_3} \right)
$$
Here the Lie algebra is $so(3)$, the antisymmetric $3\times 3$ matrices.

\end{example}

\rsection{Splitting into solvable pieces}
Earlier we showed how to split a vector field into appropriate pieces, and
how to compose their flows. 
But, it is still important to be able to apply a geometric
integrator to each piece. Here we achieve this by requiring
the pieces to be (easily) integrable.

\medskip\noindent
{\bf Observation I} If $J_{ij}=0$ for $1\le i,j\le k<n$ and $H = H(x_1,\dots,x_k)$,
then the ODEs are
$$ \dot x = \left(\matrix{ 0 & * \cr * & * } \right) 
\left ( \matrix{ * \cr 0 }\right)$$
or
$$ \dot x_i = \left\{\matrix{0 & i=1,2,\dots,k \cr
c_{ij}^l x_l f_j(x_1,\dots,x_k) & i=k+1,\dots,n } \right.
$$
which are linear with constant coefficients, hence easily solved (although the
coefficients depend parametrically on the other variables $x_1,\dots,x_k$).

\medskip\noindent
{\bf Observation II} Systems with $H=\sum_i H_i(x_i)$ can be split into easily
solved parts. The rigid body has this form.

\medskip\noindent
{\bf Observation III} Quadratic Hamiltonians can be diagonalized, i.e., put
in the form of Observation II, and hence split.

\rsection{Molecular dynamics}

The basic steps in a molecular dynamics simulation are the following.
\begin{enumerate}
\item Take a large sea of particles.
\item Impose boundary conditions (e.g. 3D periodic), and three of pressure, volume,
temperature, and number of particles; the fourth is determined.
\item Find a classical model of the interparticle forces.
\item Move the particles for a long time.
\item Collect statistics of the motion.
\end{enumerate}

Applications are to 
exploring states of matter (phase transitions, liquid, liquid
crystal, colloidal), protein folding, the design of large 
organic molecules and drugs, nanotechnology. It's a big field.

Current limits are about 
$ 10^8$ simple atoms on a supercomputer,
$10^5$ simple atoms on a workstation, and
$10^{2}$--$10^3$ water molecules on a workstation.
For water, even $27=3\times3\times3$ molecules with periodic
boundary conditions are enough to see solid, liquid,
and gas phases.

What's the best way to move the particles? The method should

\begin{itemize}
\item obey the physical laws
\item exhibit the correct statistical equilibrium in the face of chaos
\item be fast and cheap, since forces are expensive to evaluate
\end{itemize}

In fact, the forces are {\em so} expensive that users don't
want to evaluate them more than once per time step. For decades
they've been using the Verlet method for point masses:
$$ H = {\rm kinetic\ } + {\rm potential} = {1\over2} p^2 + V(q),$$
$$\eqalign{q_{n+1} &= q_n + \tau p_n \cr
p_{n+1} &= p_n - \tau \nabla V(q_{n+1})}$$
We now know that it's so good because it's the simplest
symplectic integrator, and comes from splitting the Hamiltonian.

How can we extend the Verlet algorithm to non-symmetric molecules 
like water? Many approaches have been considered.

\begin{itemize}
\item Move each atom separately?
	No, the vibrational time scales within each atom are too short
\item
Model as a rigid body?
\begin{itemize}
\item	as a free particle + constraints?
Constraints lead to expensive, implicit methods. They are needed for
problems involving flexible chains, e.g. proteins.

\item	represent orientation by Euler angles or unit quaternions?
The Hamiltonian in these variables is complicated and nonseparable; 
this makes symplectic methods expensive.
In practice, unit quaternions are used and the ODEs are passed to
a black-box solver
\item   represent orientation by a $3\times3$ orthogonal 
matrix, and update this by rotations only.
\end{itemize}
\end{itemize}

This last is what we now do. First, let's look at the free rigid body.

\plot{rigidbody.ps}{The phase portrait of the free rigid body
(from Bender \& Orszag, {\em Advanced Mathematical Methods for 
Scientists and Engineers}.)
Orbits on a sphere of constant angular momentum $|\pi|$ are shown.
There are three pairs of fixed points, corresponding to rotation about
each of the three principal axes; two are stable and one is unstable.
To observe passage along the ``homoclinic orbit'' joining the
unstable pair, hold a hammer with the head horizontal and toss
so it rotates once about a left-right axis.}

\rsection{The free rigid body}
Let the angular momentum vector be $\bf\pi$, 
the orientation be $Q\in SO(3)$, i.e.
$Q^T Q = Id$; the Hamiltonian for the free rigid body is
$$ H = {1\over2}\left( {\pi_1^2\over I_1} + 
{\pi_2^2\over I_2} +{\pi_3^2\over I_3} \right).
$$
The phase portrait for this system is shown in the famous Figure 
\ref{rigidbody.ps}.

As noted above, splitting methods work excellently for Lie-Poisson
systems with Hamiltonians of this type.
The flow of $H={\pi_1^2\over 2 I_1}$ is 
$$\eqalign{
R & = \left(\matrix{ 1 & 0 & 0 \cr 0 & \cos\theta & -\sin\theta \cr
0 & \sin\theta &\cos\theta }\right) \cr
\theta &= t {\pi_1\over I_1} \cr
\pi(t) &= R \pi(0) \cr
Q(t) &= Q(0) R^T\cr
}
$$
This decomposes the motion into three elementary rotations. The method is fast,
accurate, reversible and symplectic. $Q$ is always orthogonal, because it
is updated only by rotations.

How does this fit into a full simulation of water? For each molecule the
variables are the $q$, the position of the center of mass;
$p$, the linear momentum; $Q$, and $\pi$. The total energy
has the form
$$ H = T^{\rm rotation}(\pi) + T^{\rm translation}(p) + V(q,Q) $$
together with the constraints $Q^T Q=Id$.
We apply a Verlet-like splitting into kinetic and potential parts. For each
molecule, we have

Potential:
$$\eqalign{
\dot q &= 0 \cr
\dot Q &= 0 \cr
\dot p &= -{\partial V\over\partial q} =: 
f\quad \hbox{\rm (force)} \cr
\dot\pi &= (Q^T f)\times x\quad \hbox{\rm (torque)}
}$$
where $x$ is the point at which the force acts. Since the positions and
orientations are here held constant, these equations are easy to solve.

Kinetic:
$$\eqalign{ \dot q &= p \cr
\dot Q &= Q \left( \hbox{\rm skew}(I^{-1}\pi)\right) \cr
\dot p &= 0 \cr
\dot \pi &= \pi\times I^{-1}\pi 
}
$$
where $I=\mathop{\rm diag}(I_1,I_2,I_3)$ is the inertia tensor.
The centers of mass undergo free, straight-line motion, while the
orientations move as a free rigid body. The latter could be solved
explicitly, although this has never been implemented in a production code;
in practice, we approximate its flow by the previously-given splitting method.

Composing these pieces gives an analogue of the Verlet method for this
non-canonical Hamiltonian system. The final method uses only one force
evaluation per time step, but is still explicit, symplectic, reversible,
and conserves total linear and angular momentum 
(because each piece does).
As expected for such a method, energy errors are bounded in time.
When implemented in the existing research code ORIENT using existing
error criteria, this method was about ten times faster than the
old method (2-level leapfrog with Bulirsch-Stoer extrapolation).

\medskip

{\narrower\small\noindent{\bf symplectic}\em, adj.,
from {\em symplegades} (myth. a pair of islands in the Euxine that twisted 
round and round, crashing into each other 'til the Argo passed between 
them), thence derive various scientific interpretations of the term 
{\em symplectic} (1. geol. a pair of minerals touching one another or 2. zoo. 
the jaw bone of certain fish), however the use in mathematics is actually 
directly from {\em symplegma} (a group of persons embracing or wrestling), 
the true precursor of {\em symplectic} (3. of or pertaining to an orgy of 
mathematicians). \hfil (Leimkuhler)\break

}

\references
{\small
\rsubsection{Hamiltonian systems}
For an introduction to Hamiltonian and Poisson systems, see
\be \item Peter J. Olver, {\em Applications of Lie groups to differential
equations}, Springer, New York, 1986
\ee
and to their dynamics,
\be	\item[2.] D.K. Arrowsmith and C.M. Place,
{\em Dynamical systems: differential equations, maps, and chaotic behavior},
Chapman \& Hall, New York, 1992.
\ee
\rsubsection{Splitting}
\be\item[3.] R.I. McLachlan and G.R.W. Quispel, 
Generating functions for dynamical systems with
symmetries, integrals, and differential invariants,
{\em Physica D \bf 112} (1997), 298--309.
\ee
\rsubsection{Molecular dynamics}
\be\item[4.]  M.P. Allen and D.J. Tildesley, {\em Computer Simulation
of Liquids}, Oxford Science, Oxford, 1987.
\item[5.] 
A. Dullweber, B. Leimkuhler, and R. McLachlan,
Split-Hamiltonian methods for rigid body
molecular dynamics,
{\em J. Chem. Phys. 107} (15), 5840 (1997). 
\ee
}

\chapter{Symmetries and reversing symmetries}

\rsection{Symmetries of ODEs}

A symmetry is a map $h:\R^n\to\R^n$ from phase space to itself, 
such as $x\mapsto -x$. 
In a system with symmetries, the vector field at the
two points $x$ and $h(x)$ are related to each other.
This is shown for the pendulum in Fig. \ref{cats.ps}. Under $q\to -q$,
arrows (the vector field) map to arrows: a symmetry. 
Under $p\to -p$, arrows map to arrows if we also reverse 
their direction: a reversing symmetry. The analogous properties
for flows can be seen by tracing along the flow lines.

Symmetries and reversing symmetries both reduce the possible
complexity of the phase portrait, and should be preserved.

In reversible Hamiltonian systems,
reversing symmetries are a bit easier to preserve than symplecticity
(although one can have both, if desired). For example, for simple
mechanical systems there are 
explicit, variable-step-size reversible methods.

Consider the ODE $\dot x = f(x)$ under the change of variables
$y = h(x)$. The new system is 
$\dot y = ((dh\cdot f)h^{-1})(y):=\widetilde f$.

\begin{definition}
$f$ has symmetry $h$ if $f=\widetilde f$, i.e., if $dh\cdot f = fh$.
$f$ has reversing symmetry $h$ if 
$f = -\widetilde f$, i.e., $dh\cdot f = -fh$, and
$f$ is called reversible.
\end{definition}
The notation $fh$ indicates composition, i.e., $(fh)(x) = f(h(x))$.

\begin{example} The pendulum.\rm
For the vector field
$$ f:\ \dot q = p,\ \dot p = -\sin q $$
we have
$$h_1:\ \widetilde q = q,\ \widetilde p = -p \Rightarrow  
\dot{\widetilde q} = -\widetilde p,\ \dot{\widetilde p} = \sin \widetilde q,
$$
---a reversing symmetry;
$$
h_2:\ \widetilde q = -q,\ \widetilde p = p \Rightarrow  
\dot{\widetilde q} = -\widetilde p,\ \dot{\widetilde p} = \sin \widetilde q,
$$---a reversing symmetry; and so
$$h_1\circ h_2:\ \widetilde q = -q,\ \widetilde p = -p \Rightarrow  
\dot{\widetilde q} = \widetilde p,\ \dot{\widetilde p} = -\sin \widetilde q
$$ is a symmetry.
So the pendulum has ``reversing symmetry group'' (the group of all
symmetries and reversing symmetries)
$$\Gamma = \left\{ \id, h_1, h_2, h_1\circ h_2\right\}.$$
\end{example}

In general, half of the elements of $\Gamma$ are symmetries, and the 
composition of two reversing symmetries is a symmetry.

We will use $S$ for a symmetry and $R$ for a reversing symmetry.

\begin{definition}
The fixed set of $S$ is $\fix(S) = \{ x: x = S(x)\}$.
\end{definition}
The fixed set is invariant under the flow of $f$.
So preserving symmetries is one way of staying on a submanifold.

\begin{example}\label{toda} A nonlinear symmetry. \rm
For the pendulum, the elements of $\Gamma$ were all linear maps. Here is
an example of a matrix ODE with a nonlinear symmetry. It is related to
the famous Toda lattice.
Let
$$\eqalign{ X,L_0 & \in \R^{n\times n}, \cr
\dot X & = B(X L_0 X^{-1})X, \cr
B(L) & = L_+ - L_-, \cr
}$$
where $L_+$ ($L_-$) is the upper (lower) triangular part of $L$.
This system has $h(X) = X^{-T}$ as a symmetry. The fixed set is 
$X=h(X) = X^{-T}$ or $XX^T=I$, i.e., $X\in O(n)$, the orthogonal group.
A symmetry-preserving integrator for this system would also have $O(n)$
as an invariant set.
\end{example}


\rsection{Symmetries of maps}

\begin{definition}
 A map $\psi$ has $h$ as a symmetry if $h\psi = \psi h$, i.e.,
$$ \psi = \N_h \psi := h^{-1}\psi h.$$
A map $\psi$ has $h$ as a reversing symmetry if $h\psi = \psi^{-1} h$, i.e.
$$ \psi = \N_h\I \psi := h^{-1}\psi^{-1} h.$$
\end{definition}

The important property of the operators ${\cal N}_h$, 
$\cal I$ is how they act on 
compositions of maps. $\N_h$ acts as an {\it automorphism}, i.e.
$$ \N_h(\psi_1\psi_2) = (\N_h\psi_1)(\N_h\psi_2),$$
while $\I$ acts as an {\it antiautomorphism}, i.e.
$$ \I(\psi_1\psi_2) = (\psi_1\psi_2)^{-1} = \psi_2^{-1} \psi_1^{-1} = (\I\psi_2)
(\I\psi_1) $$
For a map, having  a 
(reversing) symmetry is equivalent to being in the fixed set of
an (anti)automorphism. Therefore, we study how to construct maps
in such fixed sets. We shall see that for 
antiautomorphisms this is relatively simple, while for automorphisms it is
unsolved.

Thus, paradoxically, we know how to construct reversible integrators,
(which is good, because reversibility brings good long-time behavior,
e.g., through invariant tori), but not symmetric integrators,
which looks at first sight simpler.

\rsection{Covariance}
Why is Runge-Kutta called a linear method? One explanation is
that they are linearly {\em covariant}.
Consider methods $\psi$ which associate to each ODE $f$ 
a map $\psi_\tau(f)$, where $t$ is the timestep.

\begin{definition}
A method $\psi$ is $h$-covariant if the following diagram commutes.
$$ \matrix{ 
\dot x = f(x) & {x=h(y) \atop \longrightarrow} & \dot y = \widetilde f(y) \cr
\downarrow & & \downarrow \cr
\widetilde x = \psi_\tau(f)(x) & {x=h(y) \atop \longrightarrow} & 
\widetilde y = \psi_\tau(\widetilde f)\cr
}
$$
That is, if 
$$ \psi = \K_h\psi := h^{-1} \psi((dh\cdot f)h^{-1})h $$
where $\K_h$ is an automorphism.
\end{definition}

In words, we get the ``same'' integrator whether we take the
ODE in variables $x$ or $y$.
Notice that if $h$ is a symmetry of $f$, then 
$f=\widetilde f$, and hence $h$ is
a symmetry of $\psi$. So an $h$-covariant method is automatically 
$h$-symmetric, even if we don't know what the symmetry is!

So, we should classify methods by their covariance group.

\begin{example} Euler's method
$$\psi_\tau(f): x\mapsto x+\tau f(x)$$
(or, more generally, any Runge-Kutta method), is covariant under any affine
map $x = A y + b$.
\end{example}

\begin{example} The exact solution $\phi_\tau(f)$
is covariant under all maps $x=h(y)$.
\end{example}

\begin{example}
\label{sym_sep}
The splitting method for $\dot q = f(p)$, $\dot p=g(q)$,
$$ \psi_\tau(f): q' = q + \tau f(p),\ 
p' = p + \tau g(q')$$
is covariant under all maps $\widetilde q = h_1(q)$, $\widetilde p=h_2(p)$. It
is not covariant under even linear maps which couple the $q$, $p$ variables.
\end{example}

When composing symmetric or reversible methods, we can use the properties

\begin{enumerate}
\item The fixed sets of automorphisms form a group.
\item The fixed sets of antiautomorphisms form a sandwich.
\end{enumerate}

Property (1) is immediate, while property (2) follows from the following

\begin{lemma}\label{scovel} {\bf (The Generalized Scovel Projection.) }
Let $\A_-$ be an antiautomorphism with order 2, i.e. $\A_-^2=\id$. Let
$\chi = \A_-\chi$, i.e. $\chi\in\fix\A_-$. Then
$$ \psi\chi \A_-\psi\in\fix\A_-\quad\forall\psi.$$
\end{lemma}
\begin{proof}
$\A_-(\psi\chi\A_-\psi)=\A_-^2\psi\A_-\chi\A_-\psi = \psi \chi \A_-\psi$.
\end{proof}

\begin{example} If $\psi=\A_-\psi$, we get the sandwich property.
\end{example}

\begin{example} $\chi=\id\Rightarrow \psi\A_-\psi\in\fix\A_-$.  This gives
a way of constructing elements fixed under any antiautomorphism, starting from
any element.
\end{example}

\begin{example} With $\A_-\psi_\tau:=\psi^{-1}_{-\tau}$ and $\chi=\id$, 
this builds self-adjoint methods 
of the form $\psi_\tau\psi^{-1}_{-\tau}$.
\end{example}

\begin{example} With $\A_-\psi:=h^{-1} \psi^{-1}h$, $h^2=\id$, 
and $\chi=\id$,
this builds reversible methods of the form $\psi h^{-1} \psi^{-1} h$.
\end{example}

\rsection{Building symmetric methods}

This is unsolved except in two simple cases.

\begin{enumerate}
\item If the method is $h$-covariant and $h$ is a symmetry, then the method
is $h$-symmetric.
\item If the symmetry group $H$ is linear and the map $\psi$ belongs to a linear 
space, then we can average over $H$:
$$\overline\psi := {1\over |H|}\sum_{h\in H}\psi h$$
is $H$-symmetric.
\end{enumerate}

Therefore, we should try not to destroy symmetries in the first place,
by doing non-symmetric splittings, for example.

\rsubsection{Hamiltonian systems}
A symplectic integrator preserves an integral $I$ if it preserves
the symmetry associated with that integral, namely, the flow
of $J\nabla I$. 

\begin{example}Angular momentum \rm $I=q\times p$. Here $J\nabla I$
has the form $\dot q = a(q)$, $\dot p=b(p)$, whose flow has the
form of Example \ref{sym_sep}. Therefore, splitting methods for
$H={1\over2}p^2 + V(q)$ preserve angular momentum.
\end{example}

\begin{example}Quadratic integrals \rm are associated with
linear symmetries, so are preserved by any linearly covariant symplectic
integrator, such as the midpoint rule (by Noether's theorem).
\end{example}

\rsection{Building reversible methods}

Here the situation is much nicer.

\begin{theorem}
Let $\Gamma$ be a group of automorphisms and antiautomorphisms. 
Let $\phi$ be fixed under the automorphisms. Then
$$ \psi = \phi\A_-\phi$$
is fixed under $\A_g$ for all $g\in\Gamma$, where $\A_-$ is any antiautomorphism
in $\Gamma$.
\end{theorem}

For example, if all symmetries are
linear, then we can use this theorem to construct integrators having the
full reversing symmetry group.

\begin{figure}[tb]
  \begin{center}
    \fbox{
    \includegraphics[width=\picturewidth,keepaspectratio,
		clip]{timeseries.ps}
    \begin{picture}(0,10)
        \put (-202,130) {\footnotesize $\log_{10}\left|X X^T-I\right|$}
        \put (-80,15) {\footnotesize time step number}
    \end{picture}
	}
    \caption{The discrepancy for $\psi_3 =$ `01101001' applied to 
	Example \ref{toda}, where the base method $\psi_0$ is Euler.
	Notice how the symmetry error is drastically reduced every 2,
	then every 4, then every 8 time steps.}
    \label{timeseries.ps}
  \end{center}
\end{figure}

\rsection{Approximately preserving symmetries}

The composition used in Lemma \ref{scovel} is so nice that it would be nice
to use it for symmetries as well as reversing symmetries. Although it
doesn't eliminate the symmetry error, it does reduce it by one power
of the time step.

\begin{theorem}
Let $\Ap$ be an automorphism of order 2. 
Let $\psi_\tau$ be a method with $\psi = \Ap\psi + \O(\delta)$.
Let $\psi_1 := \psi \Ap\psi$. Then $\psi_1 = \Ap\psi_1 + \O(\tau\delta)$, 
where $\delta = \O(\tau)$.
\end{theorem}
\begin{proof} The proof is an illustration of backward error analysis
and manipulation of flows considered as exponentials.
We write the map $\psi$ as the flow of a vector field consisting
of a part $S$ which has the symmetry and a part $M$ which does not:
$$ \psi = \exp(\tau S + \delta M).$$
Therefore,
$$ \Ap\psi = \exp(\tau S + \delta N)$$
for some vector field $N$.
Now
$$\eqalign{ \psi_1(\Ap\psi_1)^{-1} 
& = (\psi\, \Ap\psi)(\Ap\psi\,\psi)^{-1} \cr
& = \psi\, \Ap\psi \, \psi^{-1}\, (\Ap\psi)^{-1} \cr
& = \exp([\tau S+\delta M,\tau S+\delta N]+\dots) \cr
& = \exp(\tau\delta[S,N-M]+\dots)\cr
}
$$
\end{proof}

Usually, the initial symmetry error $\delta$ will be $\O(\tau^{p+1})$ for a 
method of order $p$, and this composition reduces it to $\O(\tau^{p+2})$. 
The idea can be applied iteratively: if 
$$ \psi_{n+1} = \psi_n\Ap\psi_n,$$
then $\psi_n$ has symmetry error $\O(\tau^n\delta)$. This gives methods of
the form
$$\psi_2 = \psi \, \Ap\psi\, \Ap\psi\, \psi = \hbox{`0110'},$$
$$\psi_3 = \hbox{`01101001'},$$
and so on, given by the initial elements of the famous `Thue-Morse' sequence.

In the matrix example given previously, it is desired to leave the 
fixed set $XX^T=I$ invariant. This could be done by, e.g., the midpoint
rule, but this is implicit and, given the form of the ODE, very expensive.
Instead, one can use a simple explicit method for $\psi$, and
reduce the symmetry error to any desired order using $\psi_n$.
This leaves $X$ orthogonal to any desired order.


\references
{\small
\rsubsection{Background on symmetries}
\be\item  J.A.G. Roberts and G.R.W. Quispel,
Chaos and time-reversal symmetry:
order and chaos in reversible dynamical systems, {\em Phys. Rep. \bf 216}
(1992), 63--177.                                               
\ee
\rsubsection{Symmetric integration}
\be\item [2.]
R.I. McLachlan, G.R.W. Quispel and G.S. Turner,
Numerical integrators that preserve symmetries and reversing symmetries,
{\em SIAM J. Numer. Anal. \bf 35} (1998), no. 2, 586--599.
\item [3.]
A. Iserles, R. McLachlan, and A. Zanna,
Approximately preserving symmetries in the numerical integration
       of ordinary differential equations,
{\em  Eur. J. Appl. Math.}, to appear.
\item[4.] W. Huang and B. Leimkuhler, The adaptive Verlet method,
{\em SIAM J. Sci. Comput. \bf 18} (1997), 239--256. 
\ee
}

\chapter{Volume-preserving integrators}
Remember that the ODE
$$ { dx \over dt} = f(x) $$
is source-free (or divergence-free) if
$$ \nabla\cdot f = \sum_{i=1}^m {\partial f_i\over \partial x_i} = 0$$
for all $x$. Let $df=(\partial f_i/\partial x_j)$ be the derivative
of $f$ and $A=\partial \phi_\tau/\partial x$ be the Jacobian
of its flow. $A$ evolves according to
$$ {dA\over dt} = df A,\quad A(0)=Id$$
and one can show that
$$ {d\over dt}\det A = \tr(df)\det(A).$$
Consequently, if $\nabla\cdot f = \tr(df) = 0$, then $\det A=1$ for
all time; the flow is volume preserving.

The integrator $\psi_\tau$ is volume-preserving
(VP) if
$$ \det\left(\partial \psi_{\tau,i}\over\partial x_j\right)
= 1
$$ for all $x$.

\plot{vp1.ps}{Orbit of the 3D ``Arter flow'' computed with a
volume-preserving integrator. The orbit lies on a torus.}
\plot{vp2.ps}{As in Fig. \ref{vp1.ps}, 
but computed with a non-volume-preserving
Runge-Kutta method. The orbit topology is completely wrong.}

There are two general ways to construct VP integrators:
\begin{enumerate}
\item the splitting method, and
\item the correction method.
\end{enumerate}

\rsectionshort{VP splitting}{Volume-preserving splitting method}
Start with the system of ODEs
$$\eqalign{
{dx_1\over dt} &= f_1(x) \cr
& \vdots \cr
{dx_m\over dt} &= f_m(x) \cr
}$$
Now substitute
$$\eqalign{
f_m(x) & = \int {\partial f_m \over \partial x_m} \, dx_m \cr
& = -\int \sum_{i=1}^{m-1} {\partial f_i\over\partial x_i} \,
dx_m}$$
with appropriately chosen constants of integration, 
to get the equivalent form
$$\eqalign{
{dx_1\over dt} &= f_1(x) \cr
& \vdots \cr
{dx_{m-1}\over dt} &= f_{m-1}(x) \cr
{dx_m \over dt} &= -\sum_{i=1}^{m-1} \int {\partial f_i\over \partial
x_i}\, dx_m. \cr
}$$
Now we split $f$, writing $f$ as the sum of the $m-1$
vector fields
$$\eqalign{
{dx_i\over dt} &= 0\quad i \ne j,m \cr
{dx_j\over dt} &= f_j(x) \cr
{dx_m \over dt} &= -\int {\partial f_j\over \partial
x_j}\, dx_m \cr
}$$
for $j=1,\dots,m-1$.
Remarks:
\begin{enumerate}
\item Each of these $m-1$ vector fields is source-free.
\item We have split one big problem into $m-1$ small problems.
But we know the solution to each small problem! They each
correspond to a two-dimensional Hamiltonian system
$$\eqalign{
{dx_j \over dt}&= {\partial H_j\over \partial x_m} \cr
{dx_m \over dt} &= -{\partial H_j \over \partial x_j}\cr
}$$
with Hamiltonian $H_j:= \int f_j(x) dx_m$, treating
$x_i$ for $i\ne j,m$ as fixed parameters. Each of these
2D problems can either be solved exactly, or approximated
with any symplectic integrator $\psi_j$. Even though
$\psi_j$ is not symplectic in the whole space $\R^m$, it
{\em is} volume-preserving.
\end{enumerate}
A volume-preserving integrator for $f$ is then given by
$$ \psi = \psi_1 \circ \psi_2 \circ \dots \psi_{m-1}.$$

\begin{example} (for illustration only) \rm
The 3D Volterra system
$$\eqalign{
{dx_1\over dt} &= x_1(x_2-x_3) \cr
{dx_2\over dt} &= x_2(x_3-x_1) \cr
{dx_3\over dt} &= x_3(x_1-x_2) \cr
}$$
is source-free, and splits as
$$\left.
\eqalign{
{dx_1 \over dt} &= x_1 x_2 - x_1 x_3 \cr
{dx_2 \over dt} &= 0 \cr
{dx_3 \over dt} &= -x_2 x_3 + {1\over 2} x_3^2\cr
}\right\}
\tilde f_1,
$$
$$
\left.
\eqalign{
{dx_1 \over dt} &= 0 \cr
{dx_2 \over dt} &= x_2 x_3 - x_2 x_1 \cr
{dx_3 \over dt} &= x_1 x_3 - {1\over 2} x_3^2\cr
}\right\}
\tilde f_2,
$$
where volume-preserving integrators for $\tilde f_1$ and $\tilde f_2$
are given by the implicit midpoint rule,
$$ x' = x + \tau \tilde f_1\left({x + x'\over 2}\right),
\quad
x'' = x' + \tau \tilde f_2\left({x'+ x''\over 2}\right).
$$
Note that the $x_3^2$ terms were not in the original system, but
on combining the two steps they cancel.
\end{example}
The splitting is an example of a generating function method: we
construct source-free $f$'s without any side conditions.

\rsectionshort{VP correction}{The volume-preserving correction method}
The simplest case is the semi-implicit method
$$\eqalign{
x_1' & = x_1 + \tau f_1(\tilde x)\cr
& \vdots\cr
x'_{m-1} &= x_{m-1} + \tau f_{m-1}(\tilde x) \cr
x_m & =\int^{x_m'} J(\tilde x)\, dx_m' \cr
}$$
where
$$ J := \det\left({\partial x_i' \over \partial x_j}
\right)_{i,j=1,\dots,m-1}$$
and
$$ \tilde x = (x_1,\dots,x_{m-1},x'_m).$$
For a proof of consistency and volume-preservation, see [3].

\begin{example}(for illustration only) \rm
For the 3D Volterra system 
$$\eqalign{
{dx_1\over dt} &= x_1(x_2-x_3) \cr
{dx_2\over dt} &= x_2(x_3-x_1) \cr
{dx_3\over dt} &= x_3(x_1-x_2) \cr
}
$$
we get
$$\eqalign{
x_1' &= x_1 + \tau x_1 ( x_2 - x_3') \cr
x_2' &= x_1 + \tau x_2 ( x_3' - x_1) \cr
}$$
and
$$\eqalign{
J & = \left|\matrix{
{\partial x_1' \over \partial x_1}  &
{\partial x_1' \over \partial x_2}  \cr
{\partial x_2' \over \partial x_1} &
{\partial x_2' \over \partial x_2} \cr
}\right| \cr
& = 
\left| \matrix{ 1 + \tau (x_2 - x_3') & \tau x_1 \cr
-\tau x_2 & 1+\tau(x_3' - x_1) \cr
}\right| \cr
& = 1 + \tau(x_2 - x_1) + \tau^2(x_2 x_3' + x_1 x_3' - x_3'^2)
}
$$
and the last component of the method is
$x_3 = \int^{x_3'} J dx_3'$ or
$$
x_3 = x_3' + \tau x_3'(x_2-x_1) + {\tau^2\over2} \left(
x_2 x_3'^2 
+ x_1 x_3'^2 
- {2\over3}x_3'^3 \right) .
$$
\end{example}

\references
{\small
\rsubsection{The splitting method}
\be\item[1.] Feng Kang and Shang Zai-jiu, Volume-preserving
algorithms for source-free dynamical systems, {\em Numer.
Math. \bf 71} (1995), 451. 
\item[2.] R.I. McLachlan and G.R.W. Quispel, 
Generating functions for dynamical systems with
symmetries, integrals, and differential invariants,
{\em Physica D \bf 112} (1997), 298--309.
\ee

\rsubsection{The correction method}
\be\item[3.] G.R.W. Quispel, Volume-preserving integrators,
{\em Phys. Lett. \bf 206A} (1995), 26--30.
\ee

\rsubsection{Error growth}
\be\item[4.] G.R.W. Quispel and C.P. Dyt, Volume-preserving
integrators have linear error growth, {\em Phys. Lett. \bf 242A} 
(1998) 25--30.
\ee
}

\chapter[Integrals \& Lyapunov functions]{Integrators 
that preserve integrals and/or Lyapunov functions}

\begin{definition}
$I(x)$ is a (first) integral or a conserved quantity of an ODE if
$$ \frac{d}{dt}I(x(t))= 0$$
{\em for solutions $x(t)$} of the ODE ${dx\over dt}=f(x)$,
$x\in\R^m$.
\end{definition}

By the chain rule, this requires
$ \sum \frac{dI}{dx_i} \frac{dx_i}{dt} = 0$ 
for all solutions $x(t)$, or
\[
\sum \frac{dI}{dx_i} f_i(x) = f\cdot\nabla I = 0 
\]
for all $x$.

\begin{definition}
$V(x)$ is a Lyapunov function if
$$ \frac{d}{dt}V(x) \leq 0.$$
Equivalently,
$$ f \cdot \nabla V \leq 0  \hbox{\rm\ for all\ }x\in \R^m.$$
\end{definition}

ODEs with one or more first integrals occur frequently in physics.
Many examples come from two main classes:
\begin{enumerate}
\item {\bf Hamiltonian systems.} For example, the pendulum
$$\eqalign{
{dx_1 \over dt} & = x_2 \cr
{dx_2 \over dt} &= -\sin(x_1) \cr
}
$$
where $x_1$ is the angular position of the pendulum and
$x_2$ its angular momentum,
has the first integral
$$ I(x_1,x_2) = {1\over2}x_2^2-\cos(x_1).$$
\item {\bf Poisson systems.} For example, the free rigid
body with moments of inertia $I_1$, $I_2$, and $I_3$,
and angular momentum $\pi_1$, $\pi_2$, $\pi_3$ in body-fixed
coordinates,
$$\eqalign{
{d\pi_1\over dt} &= \left( {1\over I_2} - {1\over I_3}\right) \pi_2 \pi_3 \cr
{d\pi_2\over dt} &= \left( {1\over I_3} - {1\over I_1}\right) \pi_3 \pi_1 \cr
{d\pi_3\over dt} &= \left( {1\over I_1} - {1\over I_2}\right) \pi_1 \pi_2 \cr
}$$
has the first integral
$$ I(\pi_1,\pi_2,\pi_3) = \pi_1^2 + \pi_2^2 + \pi_3^2,$$
which is the body's total angular momentum.
\end{enumerate}

\wideplot{kepler.ps}{An orbit of the 2-body Kepler problem with
the eccentricity of the Hale--Bopp comet, computed
with an integral-preserving method (left) and Runge-Kutta
(right).}

\rsection{Preserving a first integral}
Before I present the theory, let me start with a picture.
Fig. \ref{kepler.ps} shows an orbit in the Kepler problem; it's an ellipse.
This remains true if you use an integral preserving method.
With a standard method such as Runge-Kutta, the orbit
spirals down to the origin.

The general method we present is as follows:
\begin{enumerate}
\item For every ODE with a first integral $I(x)$, we construct
an equivalent ``skew-gradient system;''
\item we discretize the skew-gradient system to a ``skew
discrete-gradient system;''
\item we show that the skew discrete-gradient system has
the same integral $I(x)$.
\end{enumerate}
More specifically,
\newpage
\begin{enumerate}
\item Given the system ${dx\over dt}=f(x)$ and first integral
$I(x)$ such that ${d\over dt}I(x(t))=0$, we construct
the equivalent skew gradient system 
$${dx\over dt} = S\nabla I,\quad S^T = -S;$$
\item we take the skew discrete gradient system
$$ {x'-x\over\tau} = S\DG I(x,x'),$$
where $\DG$ is a ``discrete gradient;''
\item we show that $I(x') = I(x)$.
\end{enumerate}

\rsubsection{Constructing an equivalent skew gradient system}
We want to solve $S\nabla I = f$ for the antisymmetric
matrix $S$, where $f$ and the
integral $I$ are given. Because $I$ is an integral,
$$ {d\over dt}I = {dx\over dt}\cdot \nabla I = f^T \nabla I = 0.$$
One solution is
$$ S = {f(\nabla I)^T - (\nabla I)f^T\over |\nabla I|^2}$$
but $S$ is not unique. In particular, if the critical
points of $I$ (points where $\nabla I(x)=0$ are nondegenerate,
then there is an $S$ which is as smooth as $f$ and $\nabla I$.
Sometimes, as in Poisson systems, $S$ is already known
and does not need to be constructed.

\rsection{Discrete gradients}
\rsubsection{Discretizing the skew-gradient system to
a skew discrete-gradient system}
A discrete gradient $\DG I$ is defined by the two axioms
$$ \eqalign{
I(x') - I(x) &= (\DG I)\cdot (x'-x) \cr
\DG I(x,x') & = \nabla I(x) + {\cal O}(x'-x). \cr
}
$$
For any such discrete gradient we can construct the
skew discrete-gradient system
$$ {x'-x\over\tau} = \widetilde S \DG I$$
where $\widetilde S$ is any consistent antisymmetric matrix, such
as $\widetilde S(x,x') = S((x+x')/2)$.

This discretization has the same integral $I$:

$$ \eqalign{
I(x') - I(x) &= (\DG I)\cdot(x'-x) \cr
&= \tau (\DG I)^T S (\DG T) \cr
&= 0 \cr
}
$$

\rsubsection{Examples of discrete gradients}
The problem is reduced to finding discrete gradients $\DG$ satisfying
the axioms. The general solution is known. Here are two particular
solutions:
\begin{enumerate}
\item {\bf (Itoh and Abe)}
$$ \DG I(x,x')_i := 
\big( I(x^{(i)}) - I(x^{(i-1)})\big)
/\big( x_i' - x_i\big),
$$
where
$$x^{(i)}:=\left(x'_1,\dots,x'_{i},x_{i+1},\dots,x_m\right).$$
\item {\bf (Harten, Lax and Van Leer)}
$$ \DG I(x,x') := \int_0^1 \nabla I\left(x+\xi(x'-x)\right)\, d\xi$$
\end{enumerate}

\plot{integral.ps}{Evolution of an integral $I(x)$ in a
3D system with three fourth-order methods. QT4: Integral preserving;
RK4: Classical Runge-Kutta; LM4: linear multistep.}

\rsubsection{Numerical example preserving one integral}
Consider the system
$$\eqalign{
{dx\over dt}&= yz + x(1+0.1 y^2 + y^3 + 0.1 y^5) \cr
{dy\over dt}&= -x^2 + z^2 - 0.1 x^2 y^2 \cr
{dz \over dt} &= -z - xy - y^3 z \cr
}
$$
which has first integral
$$I = {x^2\over2}+{y^4\over4}+y+{z^2\over2},$$
but is {\em not} Hamiltonian or Poisson. $I(x)$
has compact level surfaces, so staying on them means that
the numerical integration is stable.
This system is written as a skew-gradient system as
$$ 
{d\over dt}
\left( \matrix{ x \cr y \cr z }
\right) = 
\left( \matrix {
0 & x(0.1+y^2) & y \cr
-x(0.1 + y^2) & 0 & z \cr
-y & -z & 0 \cr
}\right)
\nabla I
$$
and a comparison between a skew-discrete gradient integrator
and two classical methods is shown in Fig. \ref{integral.ps}.

\rsection{Preserving a Lyapunov function}
An integral is a function that is preserved in time,
$dI/dt=0$. A Lyapunov function decreases in time, $dV/dt\le 0.$
These also arise frequently:
\begin{enumerate}
\item {\bf Gradient systems} (dynamical systems theory)
$$ {dx\over dt} = -\nabla V(x) $$
Here ${dV\over dt} = -(\nabla V)^T \nabla V = -|\nabla V|^2 \le 0$,
so $V$ is a Lyapunov function.
\item {\bf Systems with dissipation} For example, the
damped pendulum with friction $\alpha\ge 0$,
$$\eqalign{
{dx_1 \over dt} & = x_2 \cr
{dx_2 \over dt} &= -\sin(x_1)-\alpha x_2 \cr
}$$
has Lyapunov function $V(x_1,x_2) = {1\over 2}x_2^2 - \cos(x_1)$,
because
$$ {dV\over dt} = -\alpha x_2^2 \le 0.$$
\item {\bf Systems with an asymptotically stable fixed
point} Here the construction of the Lyapunov function
is a standard problem in dynamical systems.
\end{enumerate}
These systems can be discretized very similarly to 
systems with an integral. Namely,
\begin{enumerate}
\item Given the system ${dx\over dt} = f(x)$ with
Lyapunov function $V$, we construct the equivalent
``linear-gradient system'' 
$$ {dx\over dt} = L\nabla V$$
where $L$ is negative semidefinite, i.e. $v^T L v\le 0$ for
all vectors $v$;
\item we take the linear-discrete gradient system
$$ {x'-x\over\tau} = L\DG V(x,x'),$$
\item we show that $V(x')\le V(x)$.
\end{enumerate}
Note that $L$ is {\em not} necessarily symmetric. This is
important, because as the dissipation tends to zero,
we want $L$ to smoothly tend to an antisymmetric matrix,
to recover the integral-preserving case.

\rsubsection{Constructing an equivalent linear-gradient system}
We want to solve $L\nabla V=f$ where $f$ and $V$ are
given, ${dV\over dt}=f\cdot\nabla V\le 0$, and $L$ is
a negative semidefinite matrix. One solution is
$$ L = {f(\nabla V)^T - (\nabla V)f^T + (f\cdot \nabla V)I
\over |\nabla V|^2}.$$
One can check that $v^T L v = |v|^2 (f\cdot\nabla V)/|\nabla V|^2 
\le 0$, so that $L$ is negative semi-definite, and that
$L\nabla V = f$.
As with skew-gradient systems, the matrix $L$ is not unique,
and special care is required near critical points of $V$.
\rsubsection{The linear-discrete gradient system}
By analogy with the integral-preserving case, we check
that the linear-discrete gradient system has the same
Lyapunov function as the original system:
$$\eqalign{
V(x')-V(x) &= (\DG V) \cdot (x'-x) \cr
&= \tau (\DG V)^T L (\DG V) \cr
&\le 0.
}$$

\plot{lyapunov.ps}{The damped pendulum, computed with
a linear-discrete gradient method. Orbits spiral
in correctly even if the dissipation rate tends to zero.}

\rsubsection{Numerical example}
The damped pendulum
$$\eqalign{
{dx_1 \over dt} & = x_2 \cr
{dx_2 \over dt} &= -\sin(x_1)-\alpha x_2 \cr
}$$
with Lyapunov function $V(x_1,x_2) = {1\over2}x_2^2 - \cos(x_1)$,
can be written in the linear-gradient form
$$ {d\over dt} \left( \matrix{ x_1 \cr x_2 \cr } \right)
= \left( \matrix{0 & 1 \cr -1 & -\alpha \cr} \right)
\left( \matrix{\sin(x_1) \cr x_2 } \right)
= L\nabla V.$$
(Note that $L$ is negative semi-definite, because the
eigenvalues of its symmetric part are $0$ and $-\alpha$.)
Using the Itoh--Abe discrete gradient we get the discretization
$$\eqalign{
{1\over\tau} \left( \matrix{ x_1'-x_1 \cr x_2'-x_2 \cr } \right)
 & = L \DG V \cr
 & = \left(
 \matrix{ 0 & 1 \cr -1 & -\alpha} \right)
 \left(
 \matrix{ {-\cos(x_1')+\cos(x_1) \over x_1' - x_1 } \cr
 {x_2' + x_2 \over 2 } \cr
 }\right),
 }$$
whose phase portrait is sketched in Fig. \ref{lyapunov.ps}.
The behaviour of the non-dissipative Euler's method is
quite different. It increases energy near $p=0$ for
all time steps $\tau$. Globally, for $\tau>\sim 2\alpha$ orbits
move out across the separatrix; and for $\alpha<\sim
\tau<\sim 2\alpha$ there are spurious asymptotically stable 
periodic orbits inside the separatrix.

\rsubsection{Extensions and generalizations}
\begin{enumerate}
\item The above methods can be generalized to ODEs with
any number of integrals and/or Lyapunov functions.
\item There are discrete gradient methods of order 2,
but higher order is desirable. For systems with an integral,
this can be done by composition. For Lyapunov functions,
the maps form only a semigroup, so the order cannot be
increased beyond 2 by composition.
\item Given a system in skew- or linear-gradient form, the
matrix can be split, leading to 2D systems with
the same integral or Lyapunov function.
\item Satisfactory treatment of {\em nonautonomous} ODEs
is still an open problem.
\end{enumerate}

\references
{\small
Discrete gradients were introduced for Hamiltonian ODEs in
\be\item[1.] O. Gonzalez, Time integration and discrete
Hamiltonian systems, {\em J. Nonlinear Sci. \bf 6} (1996), 449--467.
\ee
and generalized as presented here in 
\be\item[2.] R.I. McLachlan, G.R.W. Quispel, and N. Robidoux,
Geometric integration using discrete gradients,
{\em Phil. Trans. Roy. Soc.} (to appear.)
\ee
of which a shorter version appears in
\be\item[3.] R.I. McLachlan, G.R.W. Quispel, and N. Robidoux,
A unified approach to Hamiltonian systems, Poisson systems,
gradient systems, and systems with Lyapunov functions and/or
first integrals, {\em Phys. Rev. Lett. \bf 81 } (1998) 2399--2403.
\ee
}

\end{document}