A different way to define embedded submanifolds: Part II

Building the basics of differential geometry

manifolds
Author

Nicolas Boumal

Published

February 10, 2026

Abstract
In this second part, we start from the definition “submanifolds of \(\Rd\) are images of smooth idempotent maps” and build up basic concepts from there: tangent bundles, smooth maps, differentials, metrics, gradients, etc. There is surprisingly little friction. Is this the better way to approach differential geometry?

In Part I of this post, we discussed the following definition of embedded submanifolds of \(\Rd\) as a strong contender for “best entry point to discover differential geometry”:

Definition 1 A connected subset \(M \subseteq \Rd\) is an embedded submanifold of \(\Rd\) if there exist an open set \(U \subseteq \Rd\) and a smooth map \(\pi \colon U \to U\) (here called a defining retraction) such that:

  1. \(M = \pi(U)\), and
  2. \(\pi(x) = x\) for all \(x \in M\) (equivalently, \(\pi \circ \pi = \pi\)).

(If \(M\) is not connected, read on: we’ll handle that in Definition 2.)

Say we start from scratch (no prior knowledge of manifolds) and we flat out define embedded submanifolds of \(\Rd\) via Definition 1. How do we proceed from here to build the basics of differential geometry: smooth maps, tangent spaces, vector fields, differentials, Riemannian metrics, etc.?

It’s actually fairly straightforward. I would argue it’s simpler than when using the standard alternatives as a starting point (think charts or local defining functions). Everything below is fully compatible with standard differential geometry, e.g., as in the textbook by Lee (2012).

Examples of manifolds

For good measure, let us first work out a few examples of embedded submanifolds of \(\Rd\) as per Definition 1. To this end, we should go through a list of “famous manifolds” and figure out defining retractions for them.

A big hint to this effect comes from a discussion in Part I: it is known that if \(M\) is indeed an embedded submanifold of a Euclidean space, then \(\pi\) can be chosen to be the metric projection to \(M\): \[ \pi(y) := \arg\min_{x \in M} \|x - y\|^2. \] The fact that this is smooth on a neighborhood of \(M\) could be proven using the Tubular Neighborhood Theorem for example, but of course we do not have access to that here (we’re only getting started!). Regardless, this is not a concern to us right now: we only want to look at a few examples, so we can just check smoothness “by hand”. Moreover, we do not have to use metric projection: other choices of \(\pi\) are fine too.

Throughout, “\(\Rd\)” is a placeholder for “Euclidean space of finite dimension \(d\)”: we might as well be working in \(\Rnn\) or any other such space.

Open sets: if \(U\) is open in \(\Rd\), then let \(\pi(x) = x\).

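Affine spaces: for \(M = \{x \in \Rd : Ax = b\}\) (assumed nonempty), let \(U = \Rd\) and let \(\pi\) be any affine projector onto \(M\), for example the orthogonal one, \(\pi(x) = x - A^\dagger(Ax - b)\), where \(A^\dagger\) is the Moore–Penrose pseudoinverse of \(A\).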

Spheres: for \(M = \{ x \in \Rd : \|x\| = 1 \}\), let \(U = \Rd \backslash \{0\}\) and \(\pi(x) = x/\|x\|\).

Stiefel: for \(M = \{ X \in \Rnp : X^\top X = I_p \}\), let \(U = \{Y \in \Rnp : \det(Y^\top Y) \neq 0 \}\) and either let \(\pi(Y) = Y(Y^\top Y)^{-1/2}\) (the polar factor of \(Y\)) or let \(\pi(Y)\) be the Q-factor of the QR decomposition of \(Y\), obtained by Gram–Schmidt: both are smooth on \(U\).

Orthogonal group: this is Stiefel with \(n = p\). It’s not connected, but see Definition 2.

Rotation group: this is the connected component of the orthogonal group with determinant +1.

Fixed-rank matrices: for \(M = \{ X \in \Rmn : \rank(X) = k \}\), let \(U = \{ Y \in \Rmn : \sigma_k(Y) > \sigma_{k+1}(Y) \}\), where \(\sigma_k(Y)\) is the \(k\)th largest singular value of \(Y\): these are continuous functions, hence \(U\) is indeed open. Then, let \(\pi(Y)\) be the best rank-\(k\) approximation of \(Y\): this is uniquely defined on \(U\) (by Eckart–Young–Mirsky), and it can be argued that \(\pi\) is smooth on that domain.

Grassmann: let \(M = \{ X \in \Rnn : X^2 = X, X = X^\top \textrm{ and } \trace(X) = k \}\). This is the set of orthogonal projectors onto subspaces of dimension \(k\) in \(\Rn\), so that each point \(X\) in \(M\) can be identified with its subspace \(\im X\). Let \(U = \{ Y \in \Rnn : Y = Y^\top \textrm{ and } \lambda_k(Y) > \lambda_{k+1}(Y) \}\), where \(\lambda_k(Y)\) is the \(k\)th largest eigenvalue of \(Y\) (all real since \(Y\) is symmetric). Eigenvalues are continuous hence \(U\) is open in the space of symmetric matrices. Then let \(\pi(Y)\) be the matrix obtained from \(Y\) by replacing its top-\(k\) eigenvalues with 1 and the other ones with 0, keeping the eigenvectors (that is, the orthogonal projector onto the top-\(k\) eigenspace of \(Y\)). This, too, can be shown to be smooth.

Positive definite matrices: the set \(M = \{ X \in \Rnn : X = X^\top \textrm{ and } X \succ 0 \}\) is a submanifold of the symmetric matrices because it is open in that space (see the first example).
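As a quick numerical sanity check (a sketch only, not a proof of smoothness), the NumPy snippet below implements three of the maps above and verifies that \(\pi(\pi(Y)) = \pi(Y)\) at a random point. The function names `pi_sphere`, `pi_stiefel` and `pi_fixed_rank` are made up for this illustration, and computing the polar factor from a thin SVD is just one convenient choice.

```python
import numpy as np

def pi_sphere(y):
    """Normalize y (defined for y != 0)."""
    return y / np.linalg.norm(y)

def pi_stiefel(Y):
    """Polar factor Y (Y^T Y)^{-1/2}, computed from a thin SVD Y = U S V^T as U V^T."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

def pi_fixed_rank(Y, k):
    """Best rank-k approximation (truncated SVD); unique when sigma_k > sigma_{k+1}."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
y = rng.standard_normal(5)
Y = rng.standard_normal((6, 3))

x = pi_sphere(y)          # point on the sphere
X = pi_stiefel(Y)         # point on the Stiefel manifold (n = 6, p = 3)
Z = pi_fixed_rank(Y, 2)   # matrix of rank 2

# Each map fixes the points of its image, and the images satisfy the defining constraints.
sphere_ok = np.allclose(pi_sphere(x), x) and np.isclose(np.linalg.norm(x), 1.0)
stiefel_ok = np.allclose(pi_stiefel(X), X) and np.allclose(X.T @ X, np.eye(3))
rank_ok = np.allclose(pi_fixed_rank(Z, 2), Z) and np.linalg.matrix_rank(Z, tol=1e-10) == 2
print(sphere_ok, stiefel_ok, rank_ok)
```

The explicit tolerance in `matrix_rank` is there because the third singular value of `Z` is zero only up to rounding errors.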

Many more examples can be built by composition:

Products of manifolds are manifolds.

Indeed, if \(M\) is an embedded submanifold of \(\Rd\) with \(\pi \colon U \to U\) and \(N\) is an embedded submanifold of \(\Rn\) with \(\xi \colon V \to V\), then the Cartesian product \(M \times N\) is an embedded submanifold of \(\Rd \times \Rn\) with \(\zeta \colon U \times V \to U \times V\) defined by \(\zeta(x, y) = (\pi(x), \xi(y))\).

Topology

From here on out, let \(M\) be an embedded submanifold of \(\Rd\) as per Definition 1. Accordingly, let \(U \subseteq \Rd\) be a neighborhood of \(M\) together with a smooth idempotent map \(\pi \colon U \to U\) such that \(M = \pi(U)\).

Since \(M\) is a subset of \(\Rd\), it inherits a natural topology:

We endow \(M\) with the subspace topology, that is, a set \(W \subseteq M\) is open if and only if \(W = M \cap \bar{W}\) for some open set \(\bar{W}\) in \(\Rd\).

Of course, \(\pi \colon U \to M\) is continuous with that topology, because \(\pi^{-1}(W) = \pi^{-1}(M \cap \bar{W}) = \pi^{-1}(\bar W)\) is open, due to the fact that \(\pi \colon U \to U\) is continuous.

A neighborhood of a point or subset of \(M\) is an open set of \(M\) that contains it.

Notice that:

Open subsets of manifolds are manifolds. They are called open submanifolds.

Indeed, if \(M\) is an embedded submanifold of \(\Rd\) with \(\pi \colon U \to U\) and \(N\) is an open subset of \(M\), then \(N\) is an embedded submanifold of \(\Rd\) with \(\xi = \pi|_{\pi^{-1}(N)}\).

Tangent spaces and dimension

Moving to smoother concerns, let us consider curves on manifolds:

A smooth curve on \(M\) is a curve \(c \colon \reals \to U\), smooth in the usual sense, such that \(c(t)\) is in \(M\) for all \(t\). The velocity of \(c\) at time \(t\) is \(c'(t)\) (usual derivative).

Geometrically, it makes good sense to define:

The tangent space to \(M\) at \(x \in M\) is the set \(\T_x M\) of velocities of smooth curves on \(M\) as they pass through \(x\).

As it happens:

\(\T_x M = \im \D\pi(x)\).

This is easily argued as follows. (For contrast, when using local defining functions \(h\), showing \(\T_x M = \ker \D h(x)\) requires building curves with the Inverse Function Theorem.)

  1. If \(c \colon \reals \to M\) is a smooth curve on \(M\) passing through \(x\) at \(t = 0\), then it does so with a velocity \(c'(0)\) that is in \(\im \D\pi(x)\) because \(\pi(c(t)) = c(t)\) for all \(t\) so that \[ c'(0) = (\pi \circ c)'(0) = \D\pi(c(0))[c'(0)] = \D\pi(x)[c'(0)]. \]

  2. The other way around, for each \(u \in \im \D\pi(x)\) we can build a smooth curve on \(M\) that passes through \(x\) with velocity \(u\): simply let \(c(t) = \pi(x+tu)\) and compute \(c'(0) = \D\pi(x)[u] = u\). To check that last equality, notice that for \(u \in \im \D\pi(x)\) there is some \(v\) such that \(u = \D\pi(x)[v]\), hence \[ \D\pi(x)[u] = \D\pi(x)[\D\pi(x)[v]] = \D\pi(x)[v] = u \] owing to \(\D\pi(x)\) being idempotent (seen by differentiating \(\pi \circ \pi = \pi\) at \(x \in M\)).
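As a worked instance (a sketch using the sphere example from earlier, with \(\pi(y) = y/\|y\|\) on \(U = \Rd \backslash \{0\}\)), differentiating \(\pi\) gives \[ \D\pi(y)[v] = \frac{1}{\|y\|}\left( v - \frac{y^\top v}{\|y\|^2}\, y \right). \] At a point \(x\) of the sphere we have \(\|x\| = 1\), so this reduces to \(\D\pi(x)[v] = v - (x^\top v)\, x\). Its image is \[ \T_x M = \im \D\pi(x) = \{ v \in \Rd : x^\top v = 0 \}, \] the hyperplane orthogonal to \(x\), as one would expect.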

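It is now clear that each tangent space is a linear subspace of \(\Rd\). Moreover, they all have the same dimension: for \(x \in M\), the linear map \(\D\pi(x)\) is idempotent, so its rank equals its trace, which is a continuous, integer-valued function of \(x\) and hence constant on the connected set \(M\). Accordingly,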

The dimension of \(M\) is \(\dim M = \dim \T_x M = \rank\,\D\pi(x)\) (independent of \(x \in M\)).
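To see this formula at work, here is a small numerical experiment (a sketch, assuming the polar-factor retraction for the Stiefel manifold from the examples, and a hand-rolled central-difference Jacobian named `numerical_jacobian` that is not from any library). It estimates \(\rank\,\D\pi(X)\) at a point of the Stiefel manifold with \(n = 4, p = 2\) and compares it with the known dimension \(np - p(p+1)/2\).

```python
import numpy as np

def pi_stiefel(Y):
    """Polar factor of Y via a thin SVD: Y = U S V^T  ->  U V^T."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

def numerical_jacobian(fun, Y, h=1e-6):
    """Central-difference Jacobian of fun at Y, acting on flattened matrices."""
    y0 = Y.ravel()
    J = np.zeros((y0.size, y0.size))
    for j in range(y0.size):
        e = np.zeros(y0.size)
        e[j] = h
        plus = fun((y0 + e).reshape(Y.shape)).ravel()
        minus = fun((y0 - e).reshape(Y.shape)).ravel()
        J[:, j] = (plus - minus) / (2 * h)
    return J

n, p = 4, 2
rng = np.random.default_rng(1)
X = pi_stiefel(rng.standard_normal((n, p)))    # a point on the Stiefel manifold

J = numerical_jacobian(pi_stiefel, X)          # approximates D pi(X) as an (np x np) matrix
rank = np.linalg.matrix_rank(J, tol=1e-6)      # tolerance absorbs finite-difference error
idempotent = np.allclose(J @ J, J, atol=1e-6)  # D pi(X) is idempotent for X in M
print(rank, n * p - p * (p + 1) // 2, idempotent)
```

The tolerance passed to `matrix_rank` matters: the Jacobian is only approximate, so its "zero" singular values are small rather than exactly zero.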

We can now extend Definition 1 to disconnected sets:

Definition 2 A subset \(M \subseteq \Rd\) is an embedded submanifold of \(\Rd\) if each of its connected components is so, and they all have the same dimension.

In what follows, it generally does not matter whether \(M\) is connected or not.

Smooth maps

Let’s define smooth maps. In addition to \(M\) embedded in \(\Rd\), let \(N\) also be an embedded submanifold of some Euclidean space \(\Rn\). Then, define:

A map \(F \colon M \to N\) is smooth if there exists a map \(\bar{F} \colon \Rd \to \Rn\), smooth in the usual sense on a neighborhood of \(M\), such that \(F = \bar{F}|_{M}\). We call \(\bar{F}\) a smooth extension of \(F\).

This should seem reasonable, in the sense that if \(\bar{F}\) is smooth and we restrict it to a smooth submanifold \(M\), then the resulting map \(F = \bar{F}|_{M}\) should be smooth too.

In practice, smooth maps on manifolds often arise in that form, so that a smooth extension \(\bar{F}\) is readily available. If not, then we can easily build one:

For contrast, when using local defining functions, building a smooth extension requires the Tubular Neighborhood Theorem. Definition 1 has the TNT “baked in”.

If \(F \colon M \to N\) is smooth, then we can build a smooth extension as \(\bar{F} = F \circ \pi\), defined on all of \(U\).

Clearly, for all \(x \in M\) we have \(\bar{F}(x) = F(\pi(x)) = F(x)\), as desired. Also, \(F \circ \pi\) is smooth because, since \(F\) is smooth, by definition there exists a smooth extension \(\hat{F}\) for \(F\) in a neighborhood of \(M\) and \(F \circ \pi = \hat{F} \circ \pi\) is smooth by composition (in the usual way, for maps on open subsets of linear spaces).

It is now a simple exercise to check the following all-important rule:

If \(M, M', M''\) are embedded submanifolds and \(F \colon M \to M'\), \(G \colon M' \to M''\) are smooth maps, then the composition \(G \circ F \colon M \to M''\) is smooth.

As a result:

We can have submanifolds of submanifolds in a natural way.

Indeed, if \(M\) is an embedded submanifold of \(\Rd\) with \(\pi \colon U \to U\) and there is a smooth idempotent map \(\xi \colon V \to V\) on a connected open subset \(V\) of \(M\), then \(N := \xi(V)\) is an embedded submanifold of \(\Rd\) with \(\zeta = (\xi \circ \pi)|_{\pi^{-1}(V)}\) (smooth by composition). We then also call \(N\) an embedded submanifold of \(M\).

Differentials of smooth maps

The whole point of having smooth maps is to differentiate them. The differential of \(F\) at \(x\) along a direction \(v\) tells us how \(F(x)\) varies if we push \(x\) in the direction \(v\). How do we “push” \(x\) “along” \(v\)? With a curve:

  • Let \(F \colon M \to N\) be smooth.
  • Let \(c\) be a smooth curve on \(M\) passing through \(x\) with velocity \(v \in \T_x M\) at time \(t = 0\) (for example, \(c(t) = \pi(x+tv)\)).
  • Then, \(t \mapsto F(c(t))\) is a curve on \(N\) passing through \(F(x)\) at time \(t = 0\).
  • That curve \(F \circ c\) is smooth by composition, hence it has a velocity at time \(t = 0\).
  • By definition, that velocity \((F \circ c)'(0)\) is a tangent vector to \(N\) at \(F(x)\): that is what we call the differential of \(F\) at \(x\) along \(v\).

Succinctly:

The differential of \(F \colon M \to N\) at \(x \in M\) is the map \(\D F(x) \colon \T_x M \to \T_{F(x)} N\) defined as follows: For \(v \in \T_x M\), choose a smooth curve \(c\) on \(M\) with \(c(0) = x\) and \(c'(0) = v\) and let \[\D F(x)[v] = (F \circ c)'(0).\]

Does this depend on the choice of curve \(c\)? No, it does not. In fact, we have the following formula (convenient for computations) which confirms it:

If \(\bar{F}\) is any smooth extension of \(F\), then \(\D F(x) = \D \bar{F}(x)|_{\T_x M}\).

Indeed, \(\D F(x)[v] = (F \circ c)'(0) = (\bar{F} \circ c)'(0) = \D \bar{F}(x)[v]\) because \(c\) lives on \(M\) and \(\bar{F}\) coincides with \(F\) on \(M\). That formula also confirms that \(\D F(x)\) is a linear map, and it makes it a simple exercise to verify the following all-important chain rule:

If \(F, G\) are smooth maps to and from manifolds with valid composition \(G \circ F\) (as above), then \[\D(G \circ F)(x) = \D G(F(x)) \circ \D F(x).\]

(See below for a note about the fact that smoothness is a local property.)
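The independence from the choice of extension can be observed numerically. The sketch below (assuming the unit sphere with \(\pi(y) = y/\|y\|\), and a function \(f(x) = x^\top A x\) chosen only for illustration) compares two smooth extensions of \(f\): they have the same differential along tangent directions, but differ along a direction that is not tangent. The helper `directional_derivative` is a plain central difference.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
A = rng.standard_normal((d, d))
A = (A + A.T) / 2                       # symmetric matrix defining f(x) = x^T A x

def pi(y):
    return y / np.linalg.norm(y)

def f_bar1(y):
    """One smooth extension of f."""
    return y @ A @ y

def f_bar2(y):
    """Another smooth extension: f composed with pi (defined for y != 0)."""
    return f_bar1(pi(y))

def directional_derivative(g, x, v, h=1e-6):
    """Central-difference approximation of D g(x)[v]."""
    return (g(x + h * v) - g(x - h * v)) / (2 * h)

x = pi(rng.standard_normal(d))           # a point on the sphere
w = rng.standard_normal(d)
v = w - (x @ w) * x                      # tangent vector at x: x^T v = 0

# Definition via a curve: c(t) = pi(x + t v) stays on the sphere, c(0) = x, c'(0) = v.
h = 1e-6
along_curve = (f_bar1(pi(x + h * v)) - f_bar1(pi(x - h * v))) / (2 * h)

d1 = directional_derivative(f_bar1, x, v)   # via the first extension
d2 = directional_derivative(f_bar2, x, v)   # via the second extension
# Along a non-tangent direction (here x itself), the two extensions disagree.
n1 = directional_derivative(f_bar1, x, x)
n2 = directional_derivative(f_bar2, x, x)
print(along_curve, d1, d2, n1, n2)
```

The first three printed numbers agree up to finite-difference error, while the last two are \(2 f(x)\) and (approximately) zero.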

Vector fields and tangent bundles

From here, a natural next step is to define vector fields. To this end, we must first introduce the tangent bundle which is, in a sense, “all tangent spaces together”:

The tangent bundle of \(M\) is the set \(\T M = \{ (x, v) \in U \times \Rd : x \in M \textrm{ and } v \in \T_x M \}\).

An important fact about \(\T M\) is that it is a smooth manifold in its own right:

The tangent bundle \(\T M\) is an embedded submanifold of \(\Rd \times \Rd\) with \(\dim(\T M) = 2 \dim M\).

To prove this, we merely need to exhibit a defining retraction. To this end, choose the domain \(\T U := U \times \Rd\) (open in \(\Rd \times \Rd\)) and let \(\xi \colon \T U \to \T U\) be defined as follows: \[ \xi(y, u) = (\pi(y), \D\pi(y)[u]). \] This map is smooth, its image is exactly \(\T M\), and it is indeed the case that \(\xi(y, u) = (y, u)\) if and only if \((y, u)\) is in \(\T M\). Thus, \(\xi\) is a valid defining retraction for \(\T M\), confirming that it is an embedded submanifold of \(\Rd \times \Rd\) as per Definition 1.

To assess the dimension, it suffices to check the dimension of any tangent space. For example, fix \((x, 0) \in \T M\) and compute the differential \[ \D\xi(x, 0)[\dot x, \dot v] = (\D\pi(x)[\dot x], \D\pi(x)[\dot v]). \] Its rank is indeed twice that of \(\D\pi(x)\), as announced.
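Both claims can be checked numerically on a small case. The sketch below assumes the unit sphere in \(\reals^3\) with \(\pi(y) = y/\|y\|\), stacks \((y, u)\) into one vector, and uses a hand-written central-difference Jacobian (`numerical_jacobian` is a hypothetical helper, not a library function).

```python
import numpy as np

def pi(y):
    return y / np.linalg.norm(y)

def dpi(y, u):
    """D pi(y)[u] for pi(y) = y / ||y||."""
    r = np.linalg.norm(y)
    return (u - (y @ u) / r**2 * y) / r

def xi(z):
    """xi(y, u) = (pi(y), D pi(y)[u]) acting on the stacked vector z = (y, u)."""
    d = z.size // 2
    y, u = z[:d], z[d:]
    return np.concatenate([pi(y), dpi(y, u)])

def numerical_jacobian(fun, z, h=1e-6):
    """Central-difference Jacobian of fun at z."""
    J = np.zeros((z.size, z.size))
    for j in range(z.size):
        e = np.zeros(z.size)
        e[j] = h
        J[:, j] = (fun(z + e) - fun(z - e)) / (2 * h)
    return J

d = 3
rng = np.random.default_rng(3)
z = rng.standard_normal(2 * d)                  # a generic point (y, u) of TU
idempotent = np.allclose(xi(xi(z)), xi(z))      # xi o xi = xi

x = pi(rng.standard_normal(d))
J = numerical_jacobian(xi, np.concatenate([x, np.zeros(d)]))   # approximates D xi(x, 0)
rank = np.linalg.matrix_rank(J, tol=1e-6)
print(idempotent, rank, 2 * (d - 1))            # rank should equal 2 dim M
```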

Now that we know \(\T M\) is a smooth manifold, we can entertain smooth maps to and from \(\T M\). In particular:

A smooth vector field on \(M\) is a smooth map \(V \colon M \to \T M\) such that \(V(x)\) is in \(\T_x M\) for all \(x \in M\).

(Technically, we should require \(V(x) = (x, v)\) for some \(v\), but it is customary to abuse notation in this fashion.)

Retractions to move around on \(M\)

In optimization on manifolds, algorithms require the ability to “move around” on a manifold. They do so by following smooth curves generated by a retraction:

These are not “topological” retractions.

A retraction on \(M\) is a smooth map \(\Retr \colon \T M \to M\) such that, for all \((x, v) \in \T M\), the curve \(c(t) := \Retr(x, tv) = \Retr_x(tv)\) satisfies \(c(0) = x\) and \(c'(0) = v\).

It is clear that \(M\) admits at least one retraction, because:

For contrast, proving the existence of a retraction from local defining functions is quite technical. Having this explicit retraction right now helps several developments below.

One possible retraction is \(\Retr_x(v) := \pi(x+v)\).

Indeed, for \(c(t) := \pi(x + tv)\) with \(v \in \T_x M\), we already noted near the definition of tangent spaces that \(c(0) = x\) and \(c'(0) = v\).
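Here is what this retraction looks like on the Stiefel manifold, as a sketch under two assumptions: \(\pi\) is the polar factor (computed from an SVD), and tangent vectors at \(X\) are produced with the standard formula \(V = W - X\,\mathrm{sym}(X^\top W)\), which is not derived in this post. The function name `retract` is made up for the illustration.

```python
import numpy as np

def pi_stiefel(Y):
    """Polar factor of Y via a thin SVD."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

def retract(X, V):
    """Retr_X(V) = pi(X + V)."""
    return pi_stiefel(X + V)

n, p = 5, 2
rng = np.random.default_rng(4)
X = pi_stiefel(rng.standard_normal((n, p)))    # a point on the Stiefel manifold

W = rng.standard_normal((n, p))
S = X.T @ W
V = W - X @ (S + S.T) / 2                      # tangent at X: X^T V is skew-symmetric

# The curve c(t) = Retr_X(t V) should satisfy c(0) = X and c'(0) = V.
h = 1e-6
c0 = retract(X, 0 * V)
velocity = (retract(X, h * V) - retract(X, -h * V)) / (2 * h)

Y = retract(X, V)                              # a full step stays on the manifold
on_manifold = np.allclose(Y.T @ Y, np.eye(p))
print(np.allclose(c0, X), np.allclose(velocity, V, atol=1e-6), on_manifold)
```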

Riemannian metrics

The next important ingredient for optimization is: gradients. To define the gradient of a smooth function \(f \colon M \to \reals\), we first need to endow the tangent spaces of \(M\) with inner products. These should vary smoothly on \(M\), in the following sense.

A Riemannian metric on \(M\) is a collection of inner products \(\inner{\cdot}{\cdot}_x\) (one for each tangent space \(\T_x M\)) that vary smoothly with \(x\) in the sense that, for all smooth vector fields \(V, W\) on \(M\), the function \(x \mapsto \inner{V(x)}{W(x)}_x\) is smooth.

When \(M\) is equipped with such a metric, we say \(M\) is a Riemannian manifold.

Since each tangent space is a subspace of \(\Rd\), it is tempting simply to restrict the Euclidean metric of \(\Rd\) to them. This indeed yields a Riemannian metric for \(M\):

The inner products \(\inner{u}{v}_x := u^\top v\) (for \(x \in M\) and \(u, v \in \T_x M\)) form a Riemannian metric for \(M\).

Indeed, if \(V, W\) are smooth, then they admit smooth extensions \(\bar{V}, \bar{W}\) and the function \(g(x) := \inner{V(x)}{W(x)}_x\) is smooth because it admits \(\bar g(x) := \bar{V}(x)^\top \bar{W}(x)\) as a smooth extension.

With that metric, \(M\) is called a Riemannian submanifold of \(\Rd\).

(Let me stress that a Riemannian submanifold of \(\Rd\) is not just a submanifold that has a Riemannian metric: it must have that particular metric, inherited from \(\Rd\).)

What other Riemannian metrics could there be? You may consider \(\inner{u}{v}_x := u^\top G(x) v\) for any smooth \(G \colon M \to \Rdd\) such that \(G(x)\) is symmetric and positive definite (at least, restricted to \(\T_x M\)) for all \(x\).

Gradients

Now that we have a Riemannian metric, we can define gradients as follows:

The gradient of a smooth function \(f \colon M \to \reals\) at \(x\) is the unique tangent vector \(\grad f(x)\) in \(\T_x M\) such that \[\inner{\grad f(x)}{v}_x = \D f(x)[v] \quad \textrm{for all } v \in \T_x M.\]

Existence and uniqueness of \(\grad f(x)\) are clear: just pick a basis \(v_1, \ldots, v_m\) of \(\T_x M\) (with \(m = \dim M\)) that is orthonormal with respect to \(\inner{\cdot}{\cdot}_x\) and check that \(\grad f(x) = \sum_{i=1}^{m} \D f(x)[v_i] v_i\).

How can we compute the gradient in practice?

Let \(\bar{f}\) be any smooth extension of \(f\). Then, \(\D f(x)[v] = \D \bar{f}(x)[v]\) for all \(v \in \T_x M\) by the formula for differentials above. Thus, the (Riemannian) gradient of \(f\) at \(x\) is related to the (Euclidean) gradient of \(\bar{f}\) at \(x\) as follows: \[ \grad f(x) = \sum_{i=1}^{m} \D \bar{f}(x)[v_i] v_i = \sum_{i=1}^{m} (v_i^\top \grad \bar{f}(x)) v_i. \] Be mindful that \(v_1, \ldots, v_m\) are orthonormal with respect to the inner product \(\inner{\cdot}{\cdot}_x\) on \(\T_x M\), whereas \(u^\top v\) is the Euclidean inner product in \(\Rd\). If these match, then the right-hand side is nothing but the orthogonal projection of \(\grad \bar{f}(x)\) to \(\T_x M\):

If \(M\) is a Riemannian submanifold of \(\Rd\) and \(\bar{f}\) is a smooth extension of \(f \colon M \to \reals\), then \(\grad f(x)\) is the orthogonal projection of \(\grad \bar{f}(x)\) to \(\T_x M\).

To be clear, in that scenario, the projection is orthogonal with respect to the Euclidean inner product of the embedding space. Since \(\T_x M = \im \D\pi(x)\), we simply have:

The orthogonal projector from \(\Rd\) to \(\T_x M\) is \(\Proj_x = \D\pi(x) \circ \D\pi(x)^\dagger\)

where a dagger (\(\dagger\)) denotes the Moore–Penrose pseudoinverse. That formula also makes it clear that \(\Proj_x\) varies smoothly with \(x\), because \(\D\pi(x)\) has constant rank in a neighborhood of \(M\). If \(\pi\) is metric projection to \(M\) with respect to the Euclidean metric of \(\Rd\) (as in many of the examples), then \(\Proj_x = \D\pi(x)\).
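To see the pseudoinverse formula do real work, we need a \(\pi\) that is not a metric projection. The sketch below uses an example that is not in the list above: the ellipsoid \(M = \{x : x^\top B x = 1\}\) with \(\pi(y) = y/\sqrt{y^\top B y}\), for which \(\D\pi(x) = I - x (Bx)^\top\) at points \(x\) of \(M\). The matrix \(B\), the linear cost \(f(x) = a^\top x\) and all variable names are assumptions made for the illustration.

```python
import numpy as np

B = np.diag([1.0, 4.0, 9.0])            # ellipsoid M = {x : x^T B x = 1}

def pi(y):
    """Radial rescaling onto the ellipsoid: idempotent, but not the metric projection."""
    return y / np.sqrt(y @ B @ y)

rng = np.random.default_rng(5)
x = pi(rng.standard_normal(3))           # a point on M
eye = np.eye(3)

Dpi = eye - np.outer(x, B @ x)           # D pi(x) at x in M: an oblique projector
# Explicit cutoff so the numerically-zero singular value of Dpi is treated as zero.
Proj = Dpi @ np.linalg.pinv(Dpi, rcond=1e-10)

nrm = B @ x / np.linalg.norm(B @ x)      # unit normal to the ellipsoid at x
a = rng.standard_normal(3)
grad = Proj @ a                          # Riemannian gradient of f(x) = a^T x (submanifold metric)

v = Dpi @ rng.standard_normal(3)         # a tangent vector at x
dpi_idempotent = np.allclose(Dpi @ Dpi, Dpi)
dpi_symmetric = np.allclose(Dpi, Dpi.T)
proj_ok = np.allclose(Proj, eye - np.outer(nrm, nrm))
grad_ok = np.isclose(grad @ v, a @ v)    # <grad f(x), v> = D f(x)[v]
print(dpi_idempotent, dpi_symmetric, proj_ok, grad_ok)
```

Here \(\D\pi(x)\) is idempotent but not symmetric, so it is not itself the orthogonal projector. Composing with the pseudoinverse recovers \(I - nn^\top\), where \(n\) is the unit normal.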

Local frames and smoothness of the gradient

Since \(\grad f(x)\) is in \(\T_x M\) for each \(x\), it follows that \(\grad f\) is a vector field. What is more:

If \(f \colon M \to \reals\) is smooth, then \(\grad f\) is a smooth vector field on \(M\).

To argue this properly, we need one more technical tool: local frames.

A local frame around \(x \in M\) is a collection of smooth vector fields \(W_1, \ldots, W_m\) on \(M\) such that, for all \(y\) in some neighborhood of \(x\), the vectors \(W_1(y), \ldots, W_m(y)\) form a basis of \(\T_y M\).

There exists a local frame around each point \(x\) of \(M\).

Existence of a local frame around \(x\) is trivial: using any basis \(v_1, \ldots, v_m\) for \(\T_x M\), simply let \[ W_i(y) = \left. \ddt \pi(y+tv_i) \right|_{t = 0} = \D \pi(y)[v_i]. \] It is clear that each \(W_i\) is smooth because \(y \mapsto \D \pi(y)[v_i]\) is smooth on all of \(U\) (it serves as a smooth extension of \(W_i\)). And of course \(W_i(y)\) is in \(\T_y M\) for all \(y\) because \(\T_y M = \im \D\pi(y)\). Moreover, \(W_i(x) = v_i\), hence \(W_1(x), \ldots, W_m(x)\) form a basis of \(\T_x M\). The dimension of the subspace spanned by \(W_1(y), \ldots, W_m(y)\) cannot suddenly drop as \(y\) moves away from \(x\) (the \(W_i\) are continuous and linear independence survives small perturbations), and they are always included in \(\T_y M\) whose dimension is \(m\), hence these vectors form a basis of \(\T_y M\) for all \(y\) in some neighborhood of \(x\), as desired.

Global frames may not exist: think of the Hairy Ball Theorem.
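The following sketch makes the word "local" concrete on the unit sphere in \(\reals^3\) (assuming \(\pi(y) = y/\|y\|\), with the base point \(x\) at the north pole and the test points chosen by hand). The fields \(W_i(y) = \D\pi(y)[v_i]\) span the tangent space near \(x\), but they degenerate elsewhere.

```python
import numpy as np

def dpi(y, u):
    """D pi(y)[u] for pi(y) = y / ||y|| on the sphere."""
    r = np.linalg.norm(y)
    return (u - (y @ u) / r**2 * y) / r

x = np.array([0.0, 0.0, 1.0])
v1 = np.array([1.0, 0.0, 0.0])        # basis of T_x M (both orthogonal to x)
v2 = np.array([0.0, 1.0, 0.0])

def frame(y):
    """Columns W_1(y), W_2(y) of the candidate local frame."""
    return np.column_stack([dpi(y, v1), dpi(y, v2)])

y_near = np.array([0.1, -0.2, 1.0])
y_near /= np.linalg.norm(y_near)       # a point of the sphere close to x
y_far = np.array([1.0, 0.0, 0.0])      # a point of the sphere far from x

rank_at_x = np.linalg.matrix_rank(frame(x))
rank_near = np.linalg.matrix_rank(frame(y_near))
rank_far = np.linalg.matrix_rank(frame(y_far))
print(rank_at_x, rank_near, rank_far)
```

The ranks are 2 at \(x\) and nearby, but only 1 at \(y = v_1\), where \(W_1(y) = v_1 - (y^\top v_1) y = 0\).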

If we have a Riemannian metric on \(M\), then:

We may even choose a local frame that is orthonormal in a neighborhood of \(x\).

Indeed, we can further orthonormalize the local frame above, at least locally around \(x\). If the vectors \(W_1(y), \ldots, W_m(y)\) are linearly independent, then the Gram–Schmidt algorithm provides real numbers \(c_{ij}(y)\) such that the vectors \(\hat W_i(y) := \sum_{j=1}^m c_{ij}(y) W_j(y)\) for \(i = 1, \ldots, m\) are orthonormal. Moreover, close inspection of the algorithm reveals that each \(c_{ij}\) is smooth on the neighborhood of \(x\) where the \(W_i(y)\) are linearly independent.

That’s right: Gram the Schmidt out of them.

It is an exercise to extend each \(c_{ij}\) to a smooth function on all of \(M\) (for example, by multiplying them with an appropriate transition function applied to the determinant of the Gram matrix of the \(W_i(y)\)). The resulting vector fields \(\hat W_1, \ldots, \hat W_m\) then form an orthonormal local frame around \(x\): \(\innersmall{\hat W_i(y)}{\hat W_j(y)}_y = \delta_{ij}\) for all \(i,j\) and all \(y\) in a neighborhood of \(x\).

\(\delta_{ii} = 1\) and \(\delta_{ij} = 0\) if \(i \neq j\).

Now recall that \(\grad f(x) = \sum_{i=1}^{m} \D f(x)[v_i] v_i\) if \(v_1, \ldots, v_m\) is an orthonormal basis of \(\T_x M\). Accordingly, if \(\hat W_1, \ldots, \hat W_m\) is an orthonormal local frame around \(x\) and \(f \colon M \to \reals\) is smooth, then \(V(y) := \sum_{i=1}^{m} \D f(y)[\hat W_i(y)] \hat W_i(y)\) is a smooth vector field on \(M\), and the vector fields \(V\) and \(\grad f\) coincide on a neighborhood of \(x\).

This shows that \(\grad f\) is smooth around every \(x\). Since smoothness is a local notion (see below), we find that \(\grad f\) is a smooth vector field, as claimed.
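The construction can be replayed numerically. The sketch below assumes the unit sphere in \(\reals^3\) with its Riemannian submanifold metric and an illustrative cost \(f(y) = y^\top A y\). It orthonormalizes the frame \(W_i(y) = \D\pi(y)[v_i]\) with a textbook Gram–Schmidt loop, then compares \(\sum_i \D f(y)[\hat W_i(y)] \hat W_i(y)\) with the projected Euclidean gradient.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 3))
A = (A + A.T) / 2                       # f(y) = y^T A y restricted to the sphere

def frame(y, basis):
    """Columns W_i(y) = D pi(y)[v_i] = v_i - (y^T v_i) y for y on the unit sphere."""
    return basis - np.outer(y, y @ basis)

def gram_schmidt(W):
    """Classical Gram-Schmidt on the columns of W (assumed linearly independent)."""
    Q = np.zeros_like(W)
    for i in range(W.shape[1]):
        q = W[:, i] - Q[:, :i] @ (Q[:, :i].T @ W[:, i])
        Q[:, i] = q / np.linalg.norm(q)
    return Q

x = np.array([0.0, 0.0, 1.0])
basis = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])   # columns span T_x M

y = np.array([0.2, 0.1, 1.0])
y /= np.linalg.norm(y)                  # a point of the sphere near x

W_hat = gram_schmidt(frame(y, basis))   # orthonormal basis of T_y M
egrad = 2 * A @ y                       # Euclidean gradient of the extension y^T A y

# V(y) = sum_i D f(y)[W_hat_i] W_hat_i, with D f(y)[w] = egrad^T w
V = W_hat @ (W_hat.T @ egrad)
projected = egrad - (y @ egrad) * y     # orthogonal projection of egrad to T_y M
orthonormal = np.allclose(W_hat.T @ W_hat, np.eye(2))
print(orthonormal, np.allclose(V, projected))
```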

Transporters

It is often convenient to move tangent vectors around, in the sense that if \(u\) is in \(\T_x M\) then we may want to choose a vector \(v \in \T_y M\) that “corresponds” to \(u\) in some sense. There are plenty of ways to do this: an important concept here is that of connections and parallel transports. That said, if we are happy with a loose notion of that concept, then the following is useful (especially for algorithm design); see (Boumal 2023, sec. 10.5):

A transporter is a smooth map \(T \colon \T M \times M \to \T M \colon ((x, u), y) \mapsto T_{y \leftarrow x}(u)\) such that

  1. For each \(x, y \in M\), the map \(u \mapsto T_{y \leftarrow x}(u)\) is linear from \(\T_x M\) to \(\T_y M\).
  2. For each \((x, u) \in \T M\), we have \(T_{x \leftarrow x}(u) = u\).

In our case, a trivial choice of transporter is as follows:

The map \(T_{y \leftarrow x}(u) := \D\pi(y)[u]\) is a transporter on \(M\).

In particular, we get a sometimes useful fact out of this (which we already used to build local frames above): for any \((x, u) \in \T M\), the vector field \(V(y) := \D\pi(y)[u]\) is smooth on all of \(M\) and it satisfies \(V(x) = u\).
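As a minimal sketch on the unit sphere (assuming \(\pi(y) = y/\|y\|\), so that \(\D\pi(y)[u] = u - (y^\top u) y\) for \(y\) on the sphere), the two transporter properties and tangency of the output can be checked directly. The function name `transport` is hypothetical.

```python
import numpy as np

def pi(y):
    return y / np.linalg.norm(y)

def transport(y, u):
    """T_{y <- x}(u) = D pi(y)[u] = u - (y^T u) y for y on the unit sphere."""
    return u - (y @ u) * y

rng = np.random.default_rng(7)
x = pi(rng.standard_normal(4))
y = pi(rng.standard_normal(4))
w1, w2 = rng.standard_normal(4), rng.standard_normal(4)
u1 = w1 - (x @ w1) * x                  # two tangent vectors at x
u2 = w2 - (x @ w2) * x

tangent_at_y = np.isclose(y @ transport(y, u1), 0.0)     # output lies in T_y M
identity_at_x = np.allclose(transport(x, u1), u1)        # T_{x <- x}(u) = u
linear = np.allclose(transport(y, 2 * u1 - 3 * u2),
                     2 * transport(y, u1) - 3 * transport(y, u2))
print(tangent_at_y, identity_at_x, linear)
```

Note that this particular transporter does not need to know \(x\) at all: it simply applies \(\D\pi(y)\) to the vector it is handed.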

Smoothness is a local notion

The definitions above also make it clear that smoothness is a local property, in the following sense.

First, recall from our topology that if \(W\) is open in \(M\) then \(W\) itself is an embedded submanifold of \(\Rd\). This means that we can make sense of smooth maps defined on open subsets of \(M\), as used here:

\(F \colon M \to N\) is smooth if and only if each \(x \in M\) has a neighborhood \(W \subseteq M\) such that \(F|_W \colon W \to N\) is smooth.

One direction is obvious. The other direction is not so obvious when working with local defining functions, because the smooth extensions on the various neighborhoods \(W\) might not be compatible. But here, we can use the common extension through \(\pi\) quite neatly.

Indeed, if \(F|_W\) is smooth, that means there exists a smooth extension \(\bar{F}_W\) of \(F|_W\) defined on a neighborhood of \(W\) in \(\Rd\). Notice that \(\bar{F}_W \circ \pi|_{\pi^{-1}(W)}\) is smooth by composition of smooth maps on open sets in \(\Rd\). Moreover, it coincides with \(F \circ \pi\) on \(\pi^{-1}(W)\), because for \(y \in \pi^{-1}(W)\) we have \(\pi(y) \in W\) and hence \(\bar{F}_W(\pi(y)) = F(\pi(y))\). By ranging over all \(x \in M\) and the corresponding neighborhoods \(W\), we see that \(F \circ \pi\) is smooth on all of \(U\), and hence \(F\) is smooth.

Second-order geometry

The natural next step from here would be to define connections, covariant derivatives and Riemannian connections. This will have to wait for a (potential) follow-up post.

Let me already note that a natural family of connections is \[ \nabla_u V := \D\pi(x)[A(x) \D V(x)[u]] \] for \(u \in \T_x M\) and \(V\) a smooth vector field on \(M\), where \(A \colon M \to \Rdd\) is a smooth function.

For example, with \(A(x) = \D\pi(x)^\dagger\) (Moore–Penrose pseudoinverse), this is the Riemannian connection of \(M\) provided \(M\) is equipped with the Riemannian submanifold metric inherited from \(\Rd\).
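To illustrate with a worked case (a sketch on the unit sphere with \(\pi(y) = y/\|y\|\)): at points \(x\) of the sphere, \(\D\pi(x)[v] = v - (x^\top v)x\) is a symmetric idempotent map, so \(\D\pi(x)^\dagger = \D\pi(x)\) and \(\D\pi(x)\D\pi(x)^\dagger = \D\pi(x)\). Fix \(a \in \Rd\) and consider the vector field \(V(y) = \D\pi(y)[a] = a - (y^\top a)\, y\) on the sphere. Using that same expression as a smooth extension, we get for \(u \in \T_x M\) \[ \D V(x)[u] = -(u^\top a)\, x - (x^\top a)\, u. \] Applying \(\D\pi(x)\) removes the component along \(x\) and leaves the tangent vector \(u\) untouched, so that \[ \nabla_u V = \D\pi(x)\big[\D V(x)[u]\big] = -(x^\top a)\, u. \]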

References

Boumal, N. 2023. An Introduction to Optimization on Smooth Manifolds. Cambridge University Press. https://doi.org/10.1017/9781009166164.
Lee, J. M. 2012. Introduction to Smooth Manifolds. 2nd ed. Vol. 218. Graduate Texts in Mathematics. Springer-Verlag New York. https://doi.org/10.1007/978-1-4419-9982-5.

Citation

BibTeX citation:
@online{boumal2026,
  author = {Boumal, Nicolas},
  title = {A Different Way to Define Embedded Submanifolds: {Part} {II}},
  date = {2026-02-10},
  url = {www.racetothebottom.xyz/posts/smooth-retracts-are-manifolds-part2/},
  langid = {en},
  abstract = {In this second part, we start from the definition
    “submanifolds of \$\textbackslash Rd\$ are images of smooth
    idempotent maps” and build up basic concepts from there: tangent
    bundles, smooth maps, differentials, metrics, gradients, etc. There
    is surprisingly little friction. Is this the better way to approach
    differential geometry?}
}