An old proof that the Morse Lemma loses only one order of regularity

calculus
Author

Nicolas Boumal

Published

April 3, 2026

Abstract

The Morse Lemma states that a smooth function can be locally deformed to a pure quadratic near a nondegenerate critical point. The common proofs to that effect lose two orders of regularity, but an old proof shows how to lose only one.

Marston Morse showed in his 1934 book (p. 172) that if \(f \colon \Rn \to \reals\) is real-analytic and if \(x^*\) is a nondegenerate critical point of \(f\) (that is, \(\nabla f(x^*) = 0\) and \(\nabla^2 f(x^*)\) is invertible) then \(\Rn\) can be locally deformed in such a way that \(f\) becomes a quadratic.

More precisely, he showed there exists a local diffeomorphism \(\xi\) from a neighborhood of \(x^*\) to a neighborhood of the origin in \(\Rn\) such that \[ (f \circ \xi^{-1})(y) = f(x^*) + \sum_{i=1}^n s_i y_i^2, \] where \(s_1, \ldots, s_n \in \{\pm 1\}\) are the signs of the eigenvalues of the Hessian of \(f\) at \(x^*\). Moreover, \(\xi\) inherits the real analyticity of \(f\).
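As a concrete one-dimensional illustration (a toy example chosen here, not taken from Morse's book), consider \(f(x) = x^2 + x^3\), which has a nondegenerate critical point at \(x^* = 0\) with \(f(0) = 0\) and \(f''(0) = 2 > 0\). The map \(\xi(x) = x \sqrt{1 + x}\) is real-analytic near \(0\) and satisfies \(f(x) = \xi(x)^2\). The sketch below checks this numerically; the names `f`, `xi` and `samples` are purely illustrative.

```python
import math

def f(x):
    # Toy real-analytic function with a nondegenerate critical point at 0.
    return x**2 + x**3

def xi(x):
    # Candidate Morse chart, valid for x > -1: xi(x)^2 = x^2 (1 + x) = f(x).
    return x * math.sqrt(1.0 + x)

samples = [-0.5 + 0.01 * k for k in range(101)]  # grid on [-0.5, 0.5]

# In the coordinate y = xi(x), f is the pure quadratic y^2.
max_gap = max(abs(f(x) - xi(x)**2) for x in samples)

# xi is strictly increasing on the grid, consistent with local invertibility.
increasing = all(xi(a) < xi(b) for a, b in zip(samples, samples[1:]))

print(max_gap, increasing)
```

Here \(s_1 = +1\) because the only eigenvalue of the Hessian, \(f''(0) = 2\), is positive.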

Palais (1963) improved the statement to show that if \(f\) is merely \(C^p\) with \(p \geq 3\) then the conclusion above still holds, though \(\xi\) itself would then only be \(C^{p-2}\). That proof and a few others are the most common in recent literature.

It is natural to ask:

Do we really have to lose two orders of regularity?

Kuiper showed in 1966 that, in fact, there exists a \(\xi\) that is \(C^{p-1}\): it only loses one order of regularity. That has the extra benefit that the lemma applies if \(f\) is \(C^2\) as well. That proof is fairly sophisticated, in part because it also applies to certain infinite dimensional settings (as does the proof of Palais).

Fortunately, Ostrowski (1968) wrote a remarkably simple proof that obtains the same regularity improvement for the finite dimensional case.

This blog post sketches that proof, with hopefully enough details to be convincing.

Ostrowski’s improved Morse Lemma

Without loss of generality, we can place the critical point \(x^*\) at the origin, so \(x^* = 0\), and we may as well assume \(f(0) = 0\). The statement of interest is:

Lemma 1 (Morse Lemma à la Ostrowski/Kuiper) Let \(f \colon \Rn \to \reals\) be \(C^p\) with \(p \geq 2\). Assume \(f(0) = 0\) and \(\nabla f(0) = 0\). Further assume \(\nabla^2 f(0)\) is invertible.

Then, there exist neighborhoods \(U\) and \(V\) of \(0\) in \(\Rn\) and a \(C^{p-1}\) diffeomorphism \(\xi \colon U \to V\) such that \[ (f \circ \xi^{-1})(y) = s_1 y_1^2 + \cdots + s_n y_n^2, \] where \(s_1, \ldots, s_n \in \{\pm 1\}\) are the signs of the \(n\) eigenvalues of \(\nabla^2 f(0)\). Moreover, \(\xi\) is \(C^p\) on \(U \backslash \{0\}\).

The proof “bootstraps” from Morse’s original proof that did the job for the case where \(f\) is real analytic.

A technical insight

As Ostrowski notes, a key insight comes from the following simple technical fact:

Lemma 2 Assume \(f, g\) are two maps from \(\Rn\) to linear spaces such that the product \(fg \colon x \mapsto f(x)g(x)\) is well defined (for example, \(f\) is real valued and \(g\) is matrix valued). If \(f(0) = 0\) and \(f\) is differentiable at \(0\), then \(fg\) is differentiable at \(0\) if (and only if) \(g\) is continuous at \(0\).

Proof. Let’s write the proof for the case \(f, g \colon \reals \to \reals\).

Notice the subtlety: if we could assume that \(g\) is differentiable at \(0\), then we could simply write \[ (fg)'(0) = f'(0)g(0) + f(0)g'(0) = f'(0)g(0). \] The end result does not involve \(g'(0)\), but the computation relies on the existence of \(g'(0)\), so this doesn’t work as a proof.

Rather, let’s go back to the definition: \((fg)'(0)\) exists if and only if the following limit exists, in which case the two are equal: \[ \lim_{t \to 0} \frac{(fg)(t) - (fg)(0)}{t} = \lim_{t \to 0} \frac{f(t)g(t)}{t} = \lim_{t \to 0} \frac{f(t) - f(0)}{t} g(t). \] The first factor converges to \(f'(0)\). Thus, if \(g\) is continuous at \(0\), the limit exists and equals \(f'(0) g(0)\), as announced. (For the converse, provided \(f'(0) \neq 0\), existence of the limit forces \(g(t)\) to converge as \(t \to 0\).)
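Here is a quick numerical sanity check of Lemma 2, under illustrative choices that are not from Ostrowski's paper: \(f(t) = t\) and \(g(t) = 1 + \sqrt{|t|}\), which is continuous but not differentiable at \(0\). The difference quotients of \(fg\) at \(0\) should approach \(f'(0) g(0) = 1\), while those of \(g\) alone should blow up.

```python
import math

def f(t):
    # Differentiable at 0 with f(0) = 0 and f'(0) = 1.
    return t

def g(t):
    # Continuous at 0 (g(0) = 1) but not differentiable there.
    return 1.0 + math.sqrt(abs(t))

def quotient(fun, t):
    # Difference quotient of fun at 0 with step t.
    return (fun(t) - fun(0.0)) / t

def fg(t):
    return f(t) * g(t)

steps = [10.0**(-k) for k in range(2, 9)]

# Quotients of the product approach f'(0) * g(0) = 1 from both sides.
product_right = [quotient(fg, t) for t in steps]
product_left = [quotient(fg, -t) for t in steps]

# Quotients of g alone grow like 1 / sqrt(t): g has no derivative at 0.
g_right = [quotient(g, t) for t in steps]

print(product_right[-1], product_left[-1], g_right[-1])
```

The product is differentiable at \(0\) even though one factor is not, which is exactly the mechanism used later to recover a derivative.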

That lemma is how Ostrowski “recoups” a lost derivative in the argument below: even if \(g\) is not differentiable at \(0\), the product \(fg\) will be differentiable at \(0\) owing to \(f(0) = 0\).

Before we get to that, we will use the lemma a first time to confirm that the regularity loss in Kuiper and Ostrowski’s result cannot be improved.

Some loss of regularity is inevitable

In his paper, Ostrowski also mentions in passing an argument of Kuiper to confirm that we must lose at least one order of regularity in general.

To see it, consider the 1-D function \(f \colon \reals \to \reals\) defined as: \[ f(x) = \begin{cases} x^2 & \textrm{if $x \leq 0$,} \\ (x + x^2)^2 & \textrm{if $x \geq 0$.} \end{cases} \] The positive branch expands to \((x + x^2)^2 = x^2 + 2x^3 + x^4\). By studying the function at \(x = 0\), you can confirm that \(f\) is \(C^2\) but not \(C^3\). (Notice that if we choose \((x + x^r)^2\) instead with \(r > 2\), then the regularity goes up.) Also, \(f(0) = 0\), \(f'(0) = 0\) and \(f''(0) = 2 > 0\).
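The regularity claim can be checked numerically. The sketch below uses the second derivative computed by hand on each branch (the helper names `f2`, `left_slope` and `right_slope` are illustrative) and compares the one-sided slopes of \(f''\) at \(0\).

```python
def f(x):
    # The example function from the text above.
    return x**2 if x <= 0 else (x + x**2)**2

def f2(x):
    # Second derivative, branch by branch:
    # (x^2)'' = 2 and (x^2 + 2x^3 + x^4)'' = 2 + 12x + 12x^2.
    return 2.0 if x <= 0 else 2.0 + 12.0 * x + 12.0 * x**2

# Sanity check of f2 against a central finite difference of f, away from 0.
x0, d = 0.3, 1e-4
fd = (f(x0 + d) - 2.0 * f(x0) + f(x0 - d)) / d**2
fd_error = abs(fd - f2(x0))

h = 1e-6

# f'' is continuous at 0: the values just left and right of 0 nearly agree.
jump_f2 = abs(f2(h) - f2(-h))

# The one-sided slopes of f'' at 0 disagree (0 on the left, 12 on the right),
# so f'' is not differentiable at 0 and f is not C^3.
left_slope = (f2(-h) - f2(0.0)) / (-h)
right_slope = (f2(h) - f2(0.0)) / h

print(fd_error, jump_f2, left_slope, right_slope)
```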

Consider any invertible function \(\xi\) on a neighborhood of \(0\) such that \((f \circ \xi^{-1})(y) = y^2\), as the Morse Lemma would provide. Equivalently, \(f(x) = \xi(x)^2\); in particular, notice \(\xi(0) = 0\). How regular can this \(\xi\) be?

We shall confirm in a moment (by proving Lemma 1) that \(\xi\) can be \(C^1\). In that case, we can differentiate \(f(x) = \xi(x)^2\) and get: \[ f'(x) = 2 \xi(x) \xi'(x). \] For contradiction, assume \(\xi\) is \(C^2\). Then, we may differentiate once more and obtain \[ f''(x) = 2 (\xi'(x))^2 + 2 \xi(x) \xi''(x). \] The first term is differentiable at \(0\). Importantly, the second term is too! That follows from the technical Lemma 2 because \(\xi(0) = 0\), \(\xi'(0)\) is defined, and \(\xi''\) is continuous at \(0\). We deduce that \(f''\) is differentiable at \(0\): a contradiction, since \(f''(x) = 2\) for \(x \leq 0\) while \(f''(x) = 2 + 12x + 12x^2\) for \(x \geq 0\), so the one-sided derivatives of \(f''\) at \(0\) are \(0\) and \(12\).
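For this particular \(f\), one such \(\xi\) can be written in closed form: \(\xi(x) = x\) for \(x \leq 0\) and \(\xi(x) = x + x^2\) for \(x \geq 0\). This explicit formula is not spelled out in the text above; it is read off from \(f(x) = \xi(x)^2\). The sketch below checks that it squares to \(f\), that \(\xi'\) is continuous at \(0\), and that \(\xi'\) has mismatched one-sided slopes there, in line with the argument that \(\xi\) is \(C^1\) but not \(C^2\).

```python
def f(x):
    return x**2 if x <= 0 else (x + x**2)**2

def xi(x):
    # Explicit chart for this example: f(x) = xi(x)^2 on both branches.
    return x if x <= 0 else x + x**2

def dxi(x):
    # First derivative of xi, computed by hand on each branch.
    return 1.0 if x <= 0 else 1.0 + 2.0 * x

samples = [-0.5 + 0.01 * k for k in range(101)]
max_gap = max(abs(f(x) - xi(x)**2) for x in samples)

h = 1e-6

# xi' is continuous at 0 (both sides are close to 1), so xi is C^1 ...
jump_dxi = abs(dxi(h) - dxi(-h))

# ... but the one-sided slopes of xi' at 0 are 0 and 2, so xi is not C^2.
left_slope = (dxi(-h) - dxi(0.0)) / (-h)
right_slope = (dxi(h) - dxi(0.0)) / h

print(max_gap, jump_dxi, left_slope, right_slope)
```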

Proof sketch

To streamline notation, let us assume all eigenvalues of \(\nabla^2 f(0)\) are positive. (It really does not change anything meaningful.)

Bootstrapping from Morse’s original result

The first step in Ostrowski’s argument is to apply Taylor’s theorem to \(f\): \[ f(x) = q(x) + r(x) \] for all \(x\) in some neighborhood of \(0\), where \(q\) is a polynomial of degree at most \(p\) and \(r\) satisfies \(r(x) = o(|x_1|^p + \cdots + |x_n|^p)\). Of course, \(r\) itself is \(C^p\). Also, \(\nabla^2 f(0) = \nabla^2 q(0)\) because \(p \geq 2\).

The second step is to observe that, since \(q\) is a polynomial, it is real analytic and so we may apply Morse’s original result to \(q\) (not to \(f\)!). Thus, there exists a (real-analytic) diffeomorphism \(\psi\) on a neighborhood of \(0\) such that \[ (q \circ \psi^{-1})(y) = \|y\|^2. \] Plugging this back into the expression for \(f\) reveals \[ (f \circ \psi^{-1})(y) = \|y\|^2 + (r \circ \psi^{-1})(y). \]

Focusing on the remainder term

Let us work on that second term. Define \[ h(y) = 1 + \frac{(r \circ \psi^{-1})(y)}{\|y\|^2} \] for \(y \neq 0\), and \(h(0) = 1\). We will worry about its exact regularity later. For now, notice that \(h\) is \(C^p\) away from zero, and that it is continuous at \(0\) because \(\psi\) is a diffeomorphism with \(\psi(0) = 0\) and \(r \circ \psi^{-1}\) goes to zero strictly faster than \(\|y\|^2\) (again, because \(p \geq 2\)). Since \(h(0) = 1\), we also see that \(h\) is strictly positive in a neighborhood of \(0\).

With that notation in hand, we can write \[ (f \circ \psi^{-1})(y) = h(y) \|y\|^2. \]
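To make the decomposition tangible, here is a minimal sketch on a toy case chosen for illustration (it is not from Ostrowski's paper): \(f(x) = x^2 + |x|^{3.5}\) in one dimension, which is \(C^3\) but not \(C^4\) at \(0\), so \(p = 3\). Its Taylor polynomial is \(q(x) = x^2\), for which the Morse chart \(\psi\) is simply the identity, and the remainder is \(r(x) = |x|^{3.5}\).

```python
def q(x):
    # Taylor polynomial of f at 0 (degree at most p = 3); here psi = identity.
    return x**2

def r(x):
    # Remainder term, which is o(|x|^3).
    return abs(x)**3.5

def f(x):
    return q(x) + r(x)

def h(y):
    # h(y) = 1 + r(psi^{-1}(y)) / y^2 with psi = identity, and h(0) = 1.
    return 1.0 if y == 0 else 1.0 + r(y) / y**2

steps = [10.0**(-k) for k in range(1, 7)]

# r vanishes faster than |x|^3 ...
remainder_ratios = [r(t) / t**3 for t in steps]

# ... so h(y) tends to 1 = h(0) as y tends to 0: h is continuous at 0.
h_gaps = [abs(h(t) - 1.0) for t in steps]

# The factorization f(psi^{-1}(y)) = h(y) * y^2 holds on the samples.
points = steps + [-t for t in steps]
max_gap = max(abs(f(y) - h(y) * y**2) for y in points)

print(remainder_ratios[-1], h_gaps[-1], max_gap)
```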

Introducing the new diffeomorphism \(\xi\)

Since \(h\) is strictly positive close to zero, we can define the map \(\xi\) we’ve been searching for all along: \[ \xi(x) = g(x) \psi(x) \quad \textrm{ with } \quad g(x) = \sqrt{h(\psi(x))}. \] It is just a rescaling of the initial diffeomorphism \(\psi\).

Since \(\psi(0) = 0\) and \(g(0) = 1\), we see that \(\xi\) inherits some of its regularity from \(g\). Assume for now that \(g\) is differentiable (as we shall argue in a moment). Then, \[ \D\xi(0) = \D g(0) \cdot \psi(0) + g(0) \cdot \D\psi(0) = \D\psi(0), \] which is of course invertible. It follows that \(\xi\) is a local diffeomorphism around \(0\). And it meets our needs because \[ \|\xi(x)\|^2 = h(\psi(x)) \|\psi(x)\|^2 = (f \circ \psi^{-1})(\psi(x)) = f(x), \] that is, \((f \circ \xi^{-1})(z) = \|z\|^2\).
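The construction can be tried out on a toy case with a nontrivial \(\psi\), again chosen here purely for illustration: \(f(x) = x^2 + x^3 + |x|^{3.5}\), with \(q(x) = x^2 + x^3\), \(r(x) = |x|^{3.5}\) and \(\psi(x) = x \sqrt{1 + x}\), so that \(q = \psi^2\) near \(0\). The sketch builds \(g\) and \(\xi\) as above and checks numerically that \(\xi(x)^2 = f(x)\) and that the slope of \(\xi\) at \(0\) matches \(\psi'(0) = 1\).

```python
import math

def q(x):
    return x**2 + x**3

def r(x):
    return abs(x)**3.5

def f(x):
    return q(x) + r(x)

def psi(x):
    # Real-analytic chart for q near 0 (valid for x > -1): psi(x)^2 = q(x).
    return x * math.sqrt(1.0 + x)

def g(x):
    # g(x) = sqrt(h(psi(x))) with h(psi(x)) = 1 + r(x) / psi(x)^2, and g(0) = 1.
    return 1.0 if x == 0 else math.sqrt(1.0 + r(x) / psi(x)**2)

def xi(x):
    # Rescaling of psi by g.
    return g(x) * psi(x)

samples = [-0.3 + 0.01 * k for k in range(61)]  # grid on [-0.3, 0.3]

# xi turns f into a pure square: f(x) = xi(x)^2.
max_gap = max(abs(f(x) - xi(x)**2) for x in samples)

# Central difference quotient of xi at 0, expected to be close to psi'(0) = 1.
t = 1e-7
slope_at_0 = (xi(t) - xi(-t)) / (2.0 * t)

# xi is strictly increasing on the grid, consistent with local invertibility.
increasing = all(xi(a) < xi(b) for a, b in zip(samples, samples[1:]))

print(max_gap, slope_at_0, increasing)
```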

Checking the regularity of \(\xi\)

How do we assess the regularity of \(\xi\)? Let us start by looking at the regularity of \(g\). It’s all about this fraction: \[ \frac{(r \circ \psi^{-1})(y)}{\|y\|^2} = \frac{r(x)}{\|\psi(x)\|^2}. \] Away from (but near) \(0\), since \(\psi\) is real analytic and \(r\) is \(C^p\) and \(h\) is strictly positive, it is clear that \(g\) (and hence \(\xi\)) is \(C^p\). At \(0\), we could reason that \(g\) loses at most two orders of regularity compared to \(r\) due to the division by \(\|y\|^2\), because the gradient and Hessian of \(r\) at \(0\) are zero. This readily provides that \(\xi\) is \(C^{p-2}\) in a neighborhood of \(0\).

The final step: recouping a lost derivative

The beauty of Ostrowski’s argument is that we can continue from here. For all we know, \(g\) really does lose two orders of regularity at \(0\). However, \(\xi\) does not! Indeed, remember \[ \xi(x) = g(x) \psi(x). \] Differentiate this \(p-2\) times (as we know is legitimate): \[\begin{align*} \D\xi(x) & = \D g(x) \cdot \psi(x) + g(x) \D\psi(x), \\ \D^2 \xi(x) & = \D^2 g(x) \cdot \psi(x) + 2 \D g(x) \cdot \D\psi(x) + g(x) \D^2 \psi(x), \\ \vdots & = \vdots \\ \D^{p-2} \xi(x) & = \D^{p-2} g(x) \cdot \psi(x) + \cdots \end{align*}\] There is only one term that involves \(\D^{p-2} g(x)\) (as noted explicitly). All other terms (hidden behind the ellipsis) involve lower order derivatives of \(g\). Those can be differentiated at least once more at \(0\): no problem there. Thus, the whole question reduces to:

Can we differentiate \(\D^{p-2} g(x) \cdot \psi(x)\) once at \(x = 0\)?

This is where Lemma 2 kicks in. Yes we can: \(\psi(0) = 0\) and \(\D\psi(0)\) is well defined, and \(\D^{p-2} g\) is continuous at \(0\), hence \(\D^{p-2} g(x) \cdot \psi(x)\) is differentiable at \(x = 0\). And we already knew it is differentiable away from (but near) \(0\). Hence, \(\xi\) is \(C^{p-1}\) around \(0\), as announced.
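The recovered derivative can be observed numerically on a toy case chosen for illustration (not from Ostrowski's paper): \(f(x) = x^2 + |x|^{3.5}\), so \(p = 3\), \(\psi\) is the identity, \(g(x) = \sqrt{1 + |x|^{1.5}}\) and \(\xi(x) = x g(x)\). The derivatives below are computed by hand. The sketch suggests that \(\D g\) is not differentiable at \(0\) (so \(g\) is only \(C^{p-2}\)), whereas \(\D\xi\) is (so \(\xi\) reaches \(C^{p-1}\)).

```python
import math

def g(x):
    # g(x) = sqrt(h(x)) for f(x) = x^2 + |x|^3.5 (p = 3, psi = identity).
    return math.sqrt(1.0 + abs(x)**1.5)

def dg(x):
    # First derivative of g by the chain rule; dg(0) = 0.
    return 0.75 * math.copysign(1.0, x) * math.sqrt(abs(x)) / g(x)

def dxi(x):
    # First derivative of xi(x) = g(x) * x by the product rule.
    return g(x) + x * dg(x)

steps = [10.0**(-k) for k in range(2, 7)]

# The difference quotients of dg at 0 blow up like 1 / sqrt(t):
# g is C^1 but not twice differentiable at 0.
dg_quotients = [(dg(t) - dg(0.0)) / t for t in steps]

# The difference quotients of dxi at 0 converge (to 0) from both sides:
# the factor psi(x) = x vanishes at 0 and absorbs the singularity, as in Lemma 2.
dxi_right = [(dxi(t) - dxi(0.0)) / t for t in steps]
dxi_left = [(dxi(-t) - dxi(0.0)) / (-t) for t in steps]

print(dg_quotients[-1], dxi_right[-1], dxi_left[-1])
```

In this example the critical term \(\D^{p-2} g(x) \cdot \psi(x)\) is just \(\D g(x) \cdot x\), whose difference quotient at \(0\) is \(\D g(t)\) itself and therefore converges by continuity of \(\D g\).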

That’s it.

The end

Somehow, a student and I had a hard time even confirming that the improved lemma is true after a fair bit of googling and asking the usual LLMs etc. It is often stated without proof.

Ostrowski’s paper attracted fewer than 20 citations in nearly 60 years, and the only way we even became aware of it is because it was mentioned as “to appear” in a footnote of a paper by Kuiper that my student was only able to access by going in person to the university library (like in the old days!).

Thank you, Ostrowski, for taking the time to write that paper in 1968!

References

Ostrowski, Alexander. 1968. “On the Morse–Kuiper Theorem.” Aequationes Mathematicae 1 (1–2): 66–76. https://doi.org/10.1007/bf01817558.
Palais, Richard S. 1963. “Morse Theory on Hilbert Manifolds.” Topology 2 (4): 299–340. https://doi.org/10.1016/0040-9383(63)90013-2.

Citation

BibTeX citation:
@online{boumal2026,
  author = {Boumal, Nicolas},
  title = {An Old Proof That the {Morse} {Lemma} Loses Only One Order of
    Regularity},
  date = {2026-04-03},
  url = {www.racetothebottom.xyz/posts/Morse-lemma-regularity/},
  langid = {en},
  abstract = {The Morse Lemma states that a smooth function can be
    locally deformed to a pure quadratic near a nondegenerate critical
    point. The common proofs to that effect lose two orders of
    regularity, but an old proof shows how to lose only one.}
}