A different way to define embedded submanifolds: Part I

Smooth retracts are smooth manifolds

Author

Nicolas Boumal

Published

February 10, 2026

Modified

February 21, 2026

Abstract
There are several ways to define manifolds, and the definition we choose conditions all the work that follows. One little known definition may just be the best entry point yet: an embedded submanifold of \(\Rd\) is nothing but the image of a smooth idempotent map.

Intuitively, a (smooth) embedded submanifold of \(\Rd\) is a subset \(M \subseteq \Rd\) which is “smooth” in the sense that, locally, “it can be smoothly deformed to a linear subspace, and back.” Think of a sphere or the shell of a doughnut, but not a cube or a cross.

We can make this concrete in several equivalent ways. The definition we choose will affect everything else we do (from defining subordinate concepts to proving theorems about them), thus we should choose wisely.

This blog post is about a classical yet little known definition. Notice how little prior background it requires: anyone who took calculus can take this in.

Definition 1 (as a retract) A connected subset \(M \subseteq \Rd\) is an embedded submanifold of \(\Rd\) if there exist an open set \(U \subseteq \Rd\) and a smooth map \(\pi \colon U \to U\) such that:

  1. \(M = \pi(U)\), and
  2. \(\pi(x) = x\) for all \(x \in M\).

For example, this definition deems (correctly) that the unit sphere is an embedded submanifold of \(\Rd\) because \(\pi(x) := x/\|x\|\) is an appropriate choice with \(U = \Rd \backslash \{0\}\).
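The sphere example can be sanity-checked numerically. The following is only a minimal sketch, assuming \(d = 3\) and random test points (the names `pi`, `dist` and `samples` are illustrative choices, not part of the definition): it checks that \(\pi\) maps points of \(U\) to the unit sphere, and that points of the sphere are fixed by \(\pi\).

```python
import math
import random

def pi(x):
    """Candidate defining retraction for the unit sphere: x -> x / ||x||."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm == 0.0:
        raise ValueError("pi is only defined on U = R^d minus the origin")
    return [xi / norm for xi in x]

def dist(a, b):
    """Euclidean distance between two points given as lists."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

random.seed(0)
d = 3  # assumption: any d >= 2 would do
samples = [[random.uniform(-2.0, 2.0) for _ in range(d)] for _ in range(1000)]

# Condition 1 (one inclusion): pi(y) has unit norm for every sampled y in U.
max_norm_error = max(abs(dist(pi(y), [0.0] * d) - 1.0) for y in samples)

# Condition 2: the sphere points pi(y) are fixed by pi.
max_fixed_error = max(dist(pi(pi(y)), pi(y)) for y in samples)

print(max_norm_error, max_fixed_error)
```

Both printed numbers should be at the level of floating-point rounding. This is a numerical illustration on samples, not a proof.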

Definition 1 is not standard, yet it is equivalent to all the ones you may have seen (see below): we prove as much in this post.

I became aware of this fact via a book of Michor (2008), where it is stated as Theorem 1.15. It also appears in a book by Hirsch (1976) as a starred exercise (namely, Ex. 2 in Sec. 1.2). This topic is further discussed in a post on Math Overflow, which cites a 1987 paper by Lawvere (p. 267) and links to a proof on nLab (which I also expanded here).

My motivation for writing this is that, if you need to teach someone what a manifold is from scratch, or if you need to build up some definitions as brief background for a paper, this is a credible option: see Part II.

Updates Feb. 21, 2026. Many thanks to Marc Troyanov for several useful remarks on the first version of this post!

Quick comments

Throughout, “smooth” means \(C^\infty\). It all generalizes to \(C^p\).

In topology, a map \(\pi\) as in Definition 1 is called a (topological) retraction, and \(M\) is a retract of \(U\) through \(\pi\). Given that the word “retraction” refers to a different concept in optimization on manifolds, we might call \(\pi\) a defining retraction for example.

One subtlety is that the definition requires \(M\) to be connected. We may replace that requirement with the condition that the rank of \(\D\pi(x)\) be constant for all \(x \in M\): this is automatically the case on each connected component separately, so the requirement is that it should be the same for all components. (Later, we reason that this rank is the dimension of \(M\).)

The map \(\pi\) is idempotent and \(M\) is its fixed points

Consider a map \(\pi \colon U \to U\) as in Definition 1. In particular, if \(x\) is in \(M\), then \(\pi(x) = x\). The other way around, if \(\pi(y) = y\) (for any \(y \in U\)) then \(y\) is in the image of \(\pi\), which is \(M := \pi(U)\). Thus, \[ M = \{ y \in U : \pi(y) = y \}. \] In words: \(M\) is the set of fixed points of \(\pi\).

Moreover, \(\pi\) is idempotent because for all \(y \in U\) it holds that \(\pi(y)\) is in \(M\) so \(\pi(\pi(y)) = \pi(y)\). Succinctly: \[ \pi \circ \pi = \pi. \] In fact, we could even have this instead of the second condition in Definition 1. Indeed, if \(\pi \circ \pi = \pi\) and \(M := \pi(U)\), then each \(x \in M\) is of the form \(x = \pi(y)\) for some \(y \in U\) so \(\pi(x) = \pi(\pi(y)) = \pi(y) = x\).

Thus, the main message of this post can be summarized as:

The connected embedded submanifolds of \(\Rd\) are the images of smooth idempotent maps.

Contrast to more classical definitions

The intrinsic way to define submanifolds is to define first what a manifold is (through charts and atlases), then to define submanifolds as follows (see the textbook by Lee (2012), on page 98 (!)):

An embedded submanifold of \(\Rd\) is a subset \(M \subseteq \Rd\) that is a manifold in the subspace topology, endowed with a smooth structure with respect to which the inclusion map \(i \colon M \to \Rd \colon x \mapsto i(x) = x\) is a smooth embedding.

This is a mouthful; it’s not particularly illuminating; and most importantly: it requires one to first ascertain that \(M\) is a manifold in its own right (thus, with charts and atlases) before moving on to checking all the other properties that require their own definitions. This is hardly a good entry point.

The next definition more closely parallels the intuition conveyed by the first sentence of this post. I like it because it convincingly “captures” the idea of smoothness. Explicitly, it requires that, locally around each \(x\), the space \(\Rd\) can be deformed smoothly in a way that a patch of \(M\) around \(x\) (namely, \(M \cap U\)) becomes flat; and that deformation can be undone smoothly as well.

The fact that this is equivalent to the former definition is standard (Lee 2012, Prop. 5.8). It is a fairly common entry point.

A neighborhood of a point or set is an open set which contains it. A diffeomorphism is a bijective map that is smooth and whose inverse is smooth too.

Definition 2 (via diffeomorphisms) A subset \(M \subseteq \Rd\) is an embedded submanifold of \(\Rd\) if there exists an integer \(r \geq 0\) with the following property:

For all \(x \in M\) there exist:

  1. A neighborhood \(U\) of \(x\) in \(\Rd\),
  2. An open set \(V\) in \(\Rd\), and
  3. A diffeomorphism \(\psi \colon U \to V\)

such that \(\psi(M \cap U) = \{ z \in V : z_{r+1} = \cdots = z_d = 0 \}\).

Yet another approach is to require that \(M\) is locally the zero-set of a smooth function together with a (truly necessary) rank condition. Here too, it is well known that this is equivalent to the definition above (Lee 2012, Prop. 5.16) (via the Inverse Function Theorem), hence we can also use this one as the entry point to differential geometry.

This is what I chose for my book.

Definition 3 (as level sets) A subset \(M \subseteq \Rd\) is an embedded submanifold of \(\Rd\) if either (a) \(M\) is open in \(\Rd\), or (b) there exists an integer \(k \geq 1\) with the following property:

For all \(x \in M\) there exist a neighborhood \(U\) of \(x\) in \(\Rd\) and a smooth map \(h \colon U \to \Rk\) such that

  1. \(M \cap U = \{y \in U : h(y) = 0\}\), and
  2. \(\D h(x)\) has rank \(k\).

One advantage of this definition is that manifolds often come up naturally as level sets of functions \(h\) (called local defining functions). Think for example of a sphere, defined by \(h(x) := x^\top x - 1 = 0\).
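As a quick check of the rank condition on this example (a worked computation, taking \(U = \Rd\) and \(k = 1\) as assumptions for the illustration): for all \(v \in \Rd\), \[ \D h(x)[v] = 2 x^\top v, \] so \(\D h(x)\) is represented by the \(1 \times d\) matrix \(2x^\top\). On the sphere, \(x \neq 0\), hence that matrix has rank \(1 = k\), as Definition 3 requires.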

Both Definition 2 and Definition 3 are still a mouthful, but at least they are phrased entirely using terms that will be familiar to anyone who took a first course in calculus and linear algebra.

In spite of these qualities, a somewhat annoying aspect of both definitions is that, in general, \(M\) cannot be defined with a single diffeomorphism (because if it can be, then it must be diffeomorphic to \(\reals^r\)) nor with a single local defining function \(h\) (because if it can be, then it must be orientable (Lee 2012, Prop. 15.23)).

In contrast, Definition 1 describes the entire manifold \(M\) with a single, nice map \(\pi\). Downstream, this notably has the effect that further definitions and proofs need not be articulated as “for each \(x\), consider a \(\psi\) or an \(h\) etc.”: we can process the whole manifold in one shot. But there is more to it: see next.

There is a flip side: if we can show that \(M\) is smooth locally around each point (by building a different map \(\pi\) for a “patch” around each point), it does not easily follow from Definition 1 that \(M\) is smooth as a whole.

There is a \(\pi\) for each submanifold

If \(M\) is an embedded submanifold of \(\Rd\) as per one of the standard definitions, then we can build a map \(\pi\) that conforms to Definition 1 in two (related) ways:

  1. Via the Tubular Neighborhood Theorem (TNT): see (Lee 2012, Prop. 6.25) specifically.

  2. Via metric projection: let \(\pi(y) := \arg\min_{x \in M} \|x - y\|^2\), with sufficiently small domain \(U\) (a neighborhood of \(M\)) so that \(\pi\) is well defined and smooth.

That second construction has the benefit of being actionable (for example, \(\pi(y) = y/\|y\|\) is exactly that map for the sphere, and it is easily figured out). As it turns out, the fact that metric projection is (locally) smooth “falls out of” the proof of the TNT, so they are really one and the same (Lee 2012, Pb. 6-5). See below for more on this.

Accordingly, to confirm that Definition 1 is equivalent to the standard definitions, we only need the other direction: given a defining retraction \(\pi\), show that \(M\) is an embedded submanifold in the standard sense. The next three sections do exactly that.

Notice something here: the above means that the TNT is “baked in” Definition 1. Thus, we may expect that all things that are difficult to define or prove without the TNT in the standard approaches should be much simpler with the new definition. That is exactly what happens in Part II.

The differential of \(\pi\) is a projector

Since \(\pi \circ \pi = \pi\), for all \(y \in U\) we have by the chain rule: \[ \D \pi(\pi(y)) \circ \D \pi(y) = \D \pi(y). \] In particular, for all \(x \in M\) we have \(\pi(x) = x\) so that \[ \D \pi(x) \circ \D \pi(x) = \D \pi(x). \] Thus, \(\D \pi(x)\) is an idempotent linear map: a projector.

A linear map \(A\) is a projector if \(A^2 = A\). Then, if \(u\) is in \(\im A\), there exists \(v\) such that \(u = Av\) and so \(Au = A^2 v = Av = u\). Therefore, \(\im A = \{u : Au = u\}\).

As a side question to build intuition, we may ask: what does it project onto?

Well, if \(v\) is in the image of \(\D\pi(x)\), then \(\D\pi(x)[v] = v\). Let \(c(t) = \pi(x + t v)\): this is a smooth curve in \(M\) which passes through \(x\) at \(t = 0\) with velocity \(c'(0) = \D \pi(x)[v] = v\).

Conversely, if \(v\) is the velocity at \(t = 0\) of a smooth curve \(c\) in \(M\) with \(c(0) = x\), then \(c(t) = \pi(c(t))\) for all \(t\) (because \(c(t)\) is in \(M\)), so \[ v = c'(0) = \D \pi(c(0))[c'(0)] = \D\pi(x)[v]. \]

Combined, we found that:

The image of \(\D \pi(x)\) is the set of velocities of all smooth curves in \(M\) as they pass through \(x\).

For this reason:

In Part II, we shall call \(\im \D\pi(x)\) the tangent space to \(M\) at \(x\). Accordingly, the dimension of \(M\) at \(x\) will be defined as the rank of \(\D \pi(x)\), that is, the dimension of that tangent space.

In the next section, we see that \(M\) has the same dimension at each point \(x\), hence we shall simply call this the dimension of \(M\).
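To see these facts at work on the sphere, here is a small numerical sketch. It assumes \(d = 3\), approximates \(\D\pi(x)\) by central finite differences (the helper names `jacobian` and `matmul` and the step size are ad hoc choices), and uses the fact that the rank of a projector equals its trace.

```python
import math

def pi(x):
    """Retraction onto the unit sphere: x -> x / ||x||."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    return [xi / norm for xi in x]

def jacobian(f, x, step=1e-6):
    """Central finite-difference approximation of the matrix of D f(x)."""
    d = len(x)
    cols = []
    for j in range(d):
        xp = list(x); xp[j] += step
        xm = list(x); xm[j] -= step
        cols.append([(a - b) / (2 * step) for a, b in zip(f(xp), f(xm))])
    # cols[j][i] approximates d f_i / d x_j; transpose to get the usual layout.
    return [[cols[j][i] for j in range(d)] for i in range(d)]

def matmul(A, B):
    """Product of two small dense matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x = pi([1.0, 2.0, 2.0])   # the point (1/3, 2/3, 2/3) on the unit sphere
A = jacobian(pi, x)       # approximates D pi(x)
AA = matmul(A, A)

# Projector property: A A = A.
idempotence_error = max(abs(AA[i][j] - A[i][j]) for i in range(3) for j in range(3))

# For a projector, trace = rank: the candidate dimension of the sphere at x.
trace = sum(A[i][i] for i in range(3))

# The normal direction x is sent to zero ...
normal_error = max(abs(sum(A[i][j] * x[j] for j in range(3))) for i in range(3))

# ... whereas a vector v orthogonal to x is left unchanged.
v = [2.0, -1.0, 0.0]
tangent_error = max(abs(sum(A[i][j] * v[j] for j in range(3)) - v[i]) for i in range(3))

print(idempotence_error, trace, normal_error, tangent_error)
```

Up to finite-difference error, the trace should be close to \(2\), matching the expectation that the sphere in \(\reals^3\) has dimension two, with the image of \(\D\pi(x)\) being the vectors orthogonal to \(x\).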

The map \(\pi\) has constant rank

Any map \(\pi\) that conforms to Definition 1 has a differential of constant rank, at least in a neighborhood of \(M\). More precisely, the following is true:

Lemma 1 Let \(\pi \colon U \to U\) be smooth and idempotent (\(\pi \circ \pi = \pi\)). Then, there exists a neighborhood \(V \subseteq U\) of \(\pi(U)\) such that \(y \mapsto \rank\,\D\pi(y)\) is constant in each connected component of \(V\).

Proof. The argument here exactly tracks the proof in (Michor 2008, Thm. 1.15).

Let \(M = \pi(U)\). Then:

  1. For all \(x \in M\), we have \(\ker(I - \D\pi(x)) = \im \D\pi(x)\).

Indeed, recall \(A := \D\pi(x)\) is a projector (\(A^2 = A\)). For a projector, \(\im A = \{ u : Au = u\} = \ker(I-A)\).

  2. Therefore, for all \(x \in M\) we have from the rank-nullity theorem that \[\begin{align*} d & = \dim \ker(I - \D\pi(x)) + \dim \im(I - \D\pi(x)) \\ & = \rank\,\D\pi(x) + \rank(I - \D\pi(x)). \end{align*}\]

Neither of these ranks can suddenly drop, yet their sum is constant. Thus, they must both be constant on each connected component of \(M\). For convenience, assume \(M\) is connected (the argument easily extends). Let \(r\) be the rank of \(\D\pi(x)\) for all \(x \in M\).

  3. The rank of \(\D\pi(y)\) is at most \(r\) for all \(y \in U\).

This follows from idempotence, as then for all \(y\) we have \(\D\pi(y) = \D(\pi \circ \pi)(y) = \D\pi(\pi(y)) \circ \D\pi(y)\). Since \(\pi(y)\) is in \(M\), we see that \(\D\pi(\pi(y))\) has rank \(r\), and so \(\D\pi(y)\) must have rank at most \(r\).

  4. The rank of \(\D\pi(y)\) is at least \(r\) for all \(y\) in some neighborhood \(V\) of \(M\).

This is because \(\rank\,\D\pi(x) = r\) for all \(x \in M\) and (again) the rank cannot suddenly drop: it can only stay constant or (in general) suddenly increase.

  5. Combined, the last two claims show that \(\rank\,\D\pi(y) = r\) for all \(y \in V\).

The proof is complete.

The lemma notably implies that \(\im \D\pi(y) = \im \D\pi(\pi(y))\) for all \(y \in V\). Moreover, combined with the fact that \(M := \pi(U)\) is a smooth manifold of dimension \(r\) (which we prove in the next section), the lemma also implies that \[ \pi \colon V \to M \] is a smooth submersion, an open map and a quotient map (Lee 2012, Prop. 4.28). (But we cannot use this yet.)
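To make the lemma concrete, here is the sphere example worked out (with \(\pi(y) = y/\|y\|\) on \(U = \Rd \backslash \{0\}\)). Differentiating yields \[ \D\pi(y) = \frac{1}{\|y\|}\left(I - \frac{y y^\top}{\|y\|^2}\right). \] The matrix in parentheses is the orthogonal projector onto the orthogonal complement of \(y\), so \(\rank\,\D\pi(y) = d-1\) for all \(y \in U\): in this particular case, we can take \(V = U\). Moreover, \[ \im \D\pi(y) = \{ v \in \Rd : y^\top v = 0 \} = \im \D\pi(\pi(y)), \] since \(y\) and \(\pi(y)\) are parallel.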

The image of \(\pi\) is smooth

The argument here corresponds to the in-between-the-lines part of the proof in (Michor 2008, Thm. 1.15), as per an exchange with Peter Michor on Math Overflow.

Let \(\pi \colon U \to U\) be smooth and idempotent such that \(M := \pi(U)\) is connected. If need be, use Lemma 1 to replace \(U\) with a smaller neighborhood of \(M\) such that \(\rank\,\D\pi(y) =: r\) is the same for all \(y \in U\).

The goal is to show \(M\) is an embedded submanifold of \(\Rd\) “in the usual sense”. Specifically, we target Definition 2 (because it is so close to the intuitive notion of smoothness).

If \(M\) were a level set of the constant rank map \(\pi\), then the Constant Rank Theorem (CRT) would readily imply that \(M\) is a manifold. But here \(M\) is the set of fixed points of \(\pi\), so we need to work more.

Fix a point \(x \in M\). Notice \(\pi(x) = x\). By the CRT (Lee 2012, Thm. 4.12), there exist:

  • Two neighborhoods \(V, W\) of \(x\) in \(U\) such that \(\pi(V) \subseteq W\), and
  • Two diffeomorphisms \(\varphi \colon V \to \Rd\) and \(\psi \colon W \to \Rd\)

such that \[ \tilde\pi := \psi \circ \pi \circ \varphi^{-1} \colon \Rd \to \Rd \] is the linear projection \(\tilde\pi(z_1, \ldots, z_r, z_{r+1}, \ldots, z_d) = (z_1, \ldots, z_r, 0, \ldots, 0)\).

The image of \(\tilde \pi\) is a linear subspace of \(\Rd\) of dimension \(r\). Consider the pre-image of that subspace through \(\psi\): \[ Z := \psi^{-1}(\im \tilde\pi) = \{ y \in W : \psi(y) \in \im \tilde\pi \} = \{ y \in W : \psi(y) = (\tilde\pi \circ \psi)(y) \}. \] (It is already clear from Definition 2 that \(Z\) is an embedded submanifold of \(\Rd\), of dimension \(r\).) For convenience, restrict it to \[ Z' := Z \cap V, \] so that \(Z'\) is included in \(V \cap W\). We aim to show that this is a “patch of \(M\)” around \(x\).

For all \(y\) in \(V\), notice that \[ \psi(\pi(y)) = (\tilde\pi \circ \varphi)(y). \] Here comes the key step: since \(\tilde\pi \circ \tilde\pi = \tilde\pi\), it follows that \[ (\tilde\pi \circ \psi)(\pi(y)) = (\tilde\pi \circ \tilde\pi \circ \varphi)(y) = (\tilde\pi \circ \varphi)(y) = \psi(\pi(y)). \]

From here, we see two things:

  1. If \(y\) is in \(M \cap (V \cap W)\), then \(\pi(y) = y\) so that \((\tilde\pi \circ \psi)(y) = \psi(y)\), that is, \(y\) is in \(Z'\).

  2. If \(y\) is in \(Z'\), then \(\psi(y) = (\tilde\pi \circ \psi)(y)\) so that \(y = (\psi^{-1} \circ \tilde\pi \circ \psi)(y) = (\pi \circ \varphi^{-1} \circ \psi)(y)\). Thus, \(y\) is in the image of \(\pi\), which is \(M\).

Therefore, \(M \cap (V \cap W) = Z'\), and \(\psi|_{V \cap W}\) is a diffeomorphism that “straightens out” the patch \(M \cap (V \cap W)\). This is true around each \(x \in M\), thus \(M\) is an embedded submanifold of \(\Rd\) in the sense of Definition 2.

Could we have built a single defining function \(h\)?

We argued early on that \(M = \{y \in U : \pi(y) = y\}\). Thus, it is tempting to define \[ h(y) := y - \pi(y) \] and to try to show that this is a defining function for \(M\), connecting to Definition 3 (though in a more general sense where \(h\) need not have maximal rank, but only constant rank in a neighborhood of \(M\), as per the Constant Rank Theorem, see (Lee 2012, Thm. 5.12)).

Clearly, \(M = h^{-1}(0)\), so it is just a matter of checking that \(\D h\) has constant rank on a neighborhood of \(M\). What is more, we saw in Lemma 1 that \(\D\pi\) has constant rank on a neighborhood of \(M\), so perhaps \(\D h\) does too? Unfortunately, it does not.

To see clearly why that is, consider the example of a sphere \(\Sd \subset \Rd\):

  • \(U = \Rd \backslash \{0\}\).
  • \(\pi(y) = y/\|y\|\) for all \(y \in U\).
  • \(h(y) = y - \pi(y) = y - y/\|y\|\) for all \(y \in U\).

The differential of \(h\) at \(y\) is \[ \D h(y) = I - \D \pi(y) = I - \frac{1}{\|y\|}I + \frac{1}{\|y\|^3} y y^\top. \] As expected, if \(\|y\| = 1\) (that is, exactly on the sphere), then \(\D h(y) = yy^\top\) has rank \(1\). However, as soon as \(y\) leaves the sphere, the rank of \(\D h(y)\) jumps to \(d\).
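The rank jump can be observed numerically. Below is a minimal sketch, assuming \(d = 3\) and using the closed-form expression for \(\D\pi(y)\) implied by the formula above; the `rank` helper (Gaussian elimination with an ad hoc tolerance) is an illustrative stand-in for a proper numerical rank routine.

```python
import math

def D_pi(y):
    """Matrix of D pi(y) for pi(y) = y / ||y||: I/||y|| - y y^T / ||y||^3."""
    n = math.sqrt(sum(v * v for v in y))
    d = len(y)
    return [[(1.0 if i == j else 0.0) / n - y[i] * y[j] / n**3
             for j in range(d)] for i in range(d)]

def D_h(y):
    """Matrix of D h(y) = I - D pi(y) for h(y) = y - pi(y)."""
    P = D_pi(y)
    d = len(y)
    return [[(1.0 if i == j else 0.0) - P[i][j] for j in range(d)] for i in range(d)]

def rank(A, tol=1e-9):
    """Numerical rank of a small dense matrix via Gaussian elimination."""
    M = [row[:] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        # Partial pivoting: pick the largest entry in column c at or below row r.
        p = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) <= tol:
            continue  # no usable pivot in this column
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            for j in range(c, cols):
                M[i][j] -= factor * M[r][j]
        r += 1
    return r

on_sphere = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]   # norm 1
off_sphere = [0.5, 1.0, 1.0]                     # norm 1.5

ranks_on = (rank(D_pi(on_sphere)), rank(D_h(on_sphere)))
ranks_off = (rank(D_pi(off_sphere)), rank(D_h(off_sphere)))
print(ranks_on, ranks_off)
```

In this sketch, the rank of \(\D\pi\) is \(d - 1 = 2\) at both points, whereas the rank of \(\D h\) goes from \(1\) on the sphere to \(d = 3\) off the sphere.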

Still, one could reasonably ask: if \(M\) is an embedded submanifold of \(\Rd\), is it possible to find a single defining function \(h \colon U \to \Rk\) such that \(M = h^{-1}(0)\) and \(\D h(x)\) has constant rank \(d-r\) for all \(x \in U\)? As noted earlier, it is not always possible to do this with \(k = d-r\) (full rank), because some submanifolds are not orientable. I am not sure if it can be done with some \(k > d-r\) though:

Please e-mail me if you know.

A few more words about equivalence

What have we said so far about the equivalence of the three definitions of embedded submanifolds? With \(M\) a connected subset of \(\Rd\), this is where we stand:

  1. If \(M\) abides by Definition 1 (retract \(\pi\)), then it abides by Definition 2 (diffeomorphisms \(\psi\)): that is the main story above.

  2. If \(M\) abides by Definition 2, then it is clear that it abides by Definition 3 (local defining maps \(h\)): take each of the diffeomorphisms \(\psi \colon U \to V\) and let \(h \colon U \to \Rk\) with \(k = d - r\) be defined by \(h(y) = (\psi_{r+1}(y), \ldots, \psi_d(y))\). (If \(r = d\), then \(M\) is open in \(\Rd\), which is case (a) of Definition 3.)

  3. If \(M\) abides by Definition 3, we mentioned at a high level that one can build a map \(\pi\) that abides by Definition 1 via metric projection.

Let us be more explicit about that last point, to complete the triangle of implications.

Define (ambitiously for now) \(\pi \colon \Rd \to \Rd\) as \[ \pi(y) := \arg\min_{x \in M} \|x - y\|. \] It is clear that \(\pi(x) = x\) for all \(x \in M\), so all we have to do is reduce the domain of \(\pi\) to some neighborhood of \(M\) such that \(\pi\) is single-valued and smooth on that domain. Upon doing so, we will have verified that \(M\) abides by Definition 1.

Assume \(M\) abides by Definition 3. For an arbitrary \(\bar{x} \in M\), this allows us to summon a neighborhood \(U\) of \(\bar{x}\) in \(\Rd\) and a map \(h \colon U \to \Rk\) such that

  • \(M \cap U = h^{-1}(0) = \{ y \in U : h(y) = 0 \}\), and
  • \(\rank \D h(\bar{x}) = k\). If need be, make \(U\) smaller so that \(\rank \D h(x) = k\) for all \(x \in U\).

Then, up to some details that can be worked out, for \(y\) close enough to \(\bar{x}\), the projection \(\pi(y)\) corresponds to an equality-constrained optimization problem: \[ \min_{x \in U} \frac{1}{2} \|x - y\|^2 \quad \textrm{ s.t. } \quad h(x) = 0. \]

Argue the existence of a closed ball \(B\) centered at \(\bar{x}\) such that (a) \(3B \subset U\), (b) \(M \cap 3B\) is compact, and (c) \(y \in \frac{1}{2}B\) implies \(\pi(y)\) is non-empty and contained in \(2B\) (hence in \(U\)).

From the KKT theorem or Lagrange multipliers or any other equivalent statement, we see that if \(x\) is a minimizer of that problem then there exists \(\lambda \in \Rk\) such that \[ x - y = \D h(x)^\top \lambda. \] Therefore, given \(y\) close to \(\bar{x}\), we find that if \(x\) is in \(\pi(y)\) then there exists \(\lambda \in \Rk\) such that \[ F_y(x, \lambda) = \left( x - y - \D h(x)^\top \lambda, h(x) \right) = 0, \] where \(F_y \colon U \times \Rk \to \Rd \times \Rk\) is smooth.

To see why, pick an arbitrary smooth curve \(c\) on \(M\) passing through \(c(0) = x \in M \cap U\), observe that \(h(c(t)) \equiv 0\) so that \(0 = (h \circ c)'(0) = \D h(x)[c'(0)]\), and also that if \(x\) is optimal then, with \(f(x) = \frac{1}{2} \|x - y\|^2\), first-order optimality requires \(0 = (f \circ c)'(0) = \nabla f(x)^\top c'(0)\). It follows that \(\nabla f(x) = x - y\) is orthogonal to all vectors \(c'(0)\), all of which are in \(\ker \D h(x)\). With more work (e.g., using Definition 2), one can show they span the whole kernel (that’s LICQ in optimization), and so \(\nabla f(x)\) must be in the image of \(\D h(x)^\top\).

The zeros of \(F_y\) contain \(\pi(y)\). Notice that \(F_{\bar{x}}(\bar{x}, 0) = 0\). Moreover, it is a simple computation to check that \(\D F_{\bar{x}}(\bar{x}, 0)\) is invertible. From here, it is a matter of applying the Implicit Function Theorem to argue that, for \(y\) in some neighborhood \(U_{\bar{x}}\) of \(\bar{x}\), the equation \(F_y(x, \lambda) = 0\) has a unique solution near \((\bar{x}, 0)\), and moreover, that solution depends smoothly on \(y\).

This provides that \(\pi\) is well defined and smooth on \(U_{\bar{x}}\). Repeat for each \(\bar{x} \in M\) and take the union of all the neighborhoods \(U_{\bar{x}}\): this provides a whole neighborhood of \(M\) in which \(\pi\) is well defined and smooth.
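For intuition, the system \(F_y(x, \lambda) = 0\) can be solved numerically in the sphere example. The sketch below is only an illustration under stated assumptions: it takes \(h(x) = x^\top x - 1\) (so \(k = 1\) and \(\D h(x)^\top \lambda = 2\lambda x\)), runs a plain Newton iteration started at \((\bar{x}, 0)\), and compares the result with \(y/\|y\|\). The helpers `solve` and `newton_projection` are hypothetical names introduced here, and Newton's method is a swapped-in numerical tool, not part of the argument above.

```python
import math

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            factor = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= factor * M[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def newton_projection(y, x_bar, iterations=50, tol=1e-14):
    """Newton's method on F_y(x, lam) = (x - y - 2 lam x, x.x - 1) from (x_bar, 0)."""
    d = len(y)
    x, lam = list(x_bar), 0.0
    for _ in range(iterations):
        F = [x[i] - y[i] - 2.0 * lam * x[i] for i in range(d)]
        F.append(sum(xi * xi for xi in x) - 1.0)
        if max(abs(v) for v in F) < tol:
            break
        # Jacobian of F_y with respect to (x, lam).
        J = [[(1.0 - 2.0 * lam) if i == j else 0.0 for j in range(d)] + [-2.0 * x[i]]
             for i in range(d)]
        J.append([2.0 * xi for xi in x] + [0.0])
        step = solve(J, [-v for v in F])
        x = [x[i] + step[i] for i in range(d)]
        lam += step[d]
    return x, lam

x_bar = [1.0, 0.0, 0.0]      # a point on the unit sphere
y = [1.1, 0.2, -0.1]         # a point close to x_bar
x_star, lam_star = newton_projection(y, x_bar)

norm_y = math.sqrt(sum(v * v for v in y))
closed_form = [v / norm_y for v in y]
projection_error = max(abs(a - b) for a, b in zip(x_star, closed_form))
print(projection_error, lam_star)
```

In this example the iterates should approach \(y/\|y\|\), and the multiplier should approach \((1 - \|y\|)/2\), which follows from \(x - y = 2\lambda x\) with \(x = y/\|y\|\). For \(y\) far from \(\bar{x}\), there is no guarantee that this iteration converges, or that it converges to the minimizer.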

Building up differential geometry from here

Join me in Part II to build tangent spaces, smooth maps, differentials, local frames, Riemannian metrics, gradients, etc. based on Definition 1.

References

Hirsch, Morris W. 1976. Differential Topology. Vol. 33. Graduate Texts in Mathematics. Springer Science & Business Media.
Lee, J. M. 2012. Introduction to Smooth Manifolds. 2nd ed. Vol. 218. Graduate Texts in Mathematics. Springer-Verlag New York. https://doi.org/10.1007/978-1-4419-9982-5.
Michor, P. W. 2008. Topics in Differential Geometry. Vol. 93. American Mathematical Society.

Citation

BibTeX citation:
@online{boumal2026,
  author = {Boumal, Nicolas},
  title = {A Different Way to Define Embedded Submanifolds: {Part} {I}},
  date = {2026-02-10},
  url = {www.racetothebottom.xyz/posts/smooth-retracts-are-manifolds-part1/},
  langid = {en},
  abstract = {There are several ways to define manifolds, and the
    definition we choose conditions all the work that follows. One
    little known definition may just be the best entry point yet: an
    embedded submanifold of \$\textbackslash Rd\$ is nothing but the
    image of a smooth idempotent map.}
}