Consider a family of maximization problems, defined by some parameters. For example, \(\Gamma(t)\) is a set that depends on \(t \geq 0\), and we consider \[ \max_{x} f(x) \qquad \textrm{ subject to } \qquad x \in \Gamma(t). \] The Maximum Theorem (Bergé 1963, p116) (see it on google books) is a convenient set of statements about the continuity of the maximal value of that problem as a function of the parameter. In the example, the theorem requires \(\Gamma(t)\) to be compact and to vary continuously “in some sense”.

The full statement (which is purely topological) is technical and more general—see also the excellent Wikipedia article about it. It separates concerns between upper and lower semicontinuity, does not need a metric, and allows both \(f\) and \(\Gamma\) to vary with more than just a real parameter. I have found that the particular case below is often sufficient for my needs—see for example this post about retractions, where \(\Gamma(t)\) is (essentially) a product of two balls of radius \(t\). The proof below is that of Berge, only slightly simplified for this particular case.

**Definition 1** A *correspondence* from \(A\) to \(B\) maps each element of \(A\) to a subset of \(B\).

**Theorem 1 (Maximum Theorem)** Let \(f \colon \calM \to \reals\) be continuous on a metric space \(\calM\). Let \(\Gamma\) be a correspondence from \([0, \infty)\) to \(\calM\) such that

- \(\Gamma(t)\) is a compact subset of \(\calM\) for all \(t \geq 0\);
- For all \(t > 0\), \(x \in \Gamma(t)\) and \(\delta > 0\), there exists \(x'\) and \(\underline{t} < t\) such that \(\dist(x, x') < \delta\) and \(x' \in \Gamma(t')\) for all \(\underline{t} \leq t' \leq t\); and
- For all \(t \geq 0\), if \(\calU\) is a neighborhood of \(\Gamma(t)\) then there exists \(\overline{t} > t\) such that \(\Gamma(t') \subseteq \calU\) for all \(t \leq t' \leq \overline{t}\).

Then, \(F(t) = \max_{x \in \Gamma(t)} f(x)\) is continuous on \([0, \infty)\).

*Proof*. We prove upper and lower semicontinuity separately. For both, pick \(\varepsilon > 0\) and \(t \geq 0\).

If \(t = 0\), let \(\underline{t} = 0\). Now assume \(t > 0\). Choose \(x \in \Gamma(t)\) such that \(F(t) \leq f(x) + \varepsilon\) (we could also pick \(x\) such that \(f(x) = F(t)\) since \(f\) is continuous and \(\Gamma(t)\) is compact, but this is not needed for this part of the proof). Since \(f\) is continuous, there exists \(\delta > 0\) such that \[\begin{align} \dist(x, x') < \delta \qquad \implies \qquad f(x') \geq f(x) - \varepsilon \geq F(t) - 2\varepsilon. \end{align}\] Pick such a point \(x'\) and \(\underline{t} < t\) in order to have \(x' \in \Gamma(t')\) for all \(\underline{t} \leq t' \leq t\). Then, \[\begin{align} F(t') \geq f(x') \geq F(t) - 2\varepsilon, \end{align}\] and this holds for all \(t' \in [\underline{t}, t]\).

Now for the other direction, fix \(t \geq 0\). Notice that for each \(x \in \Gamma(t)\) there exists \(\delta_x > 0\) such that \[\begin{align} \dist(x, x') < \delta_x \qquad \implies \qquad f(x') \leq f(x) + \varepsilon. \end{align}\] This provides an open ball around each \(x \in \Gamma(t)\). Since \(\Gamma(t)\) is compact, we can select a finite number of points \(x_1, \ldots, x_n \in \Gamma(t)\) whose open balls cover \(\Gamma(t)\). The union of these \(n\) balls is an open neighborhood of \(\Gamma(t)\), hence there exists \(\overline{t} > t\) such that for all \(t \leq t' \leq \overline{t}\) the set \(\Gamma(t')\) is included in the union of the balls. Consequently, \[\begin{align} F(t') \leq \max_{i=1,\ldots, n} \left[ \max_{x': \dist(x_i, x') < \delta_{x_i}} f(x') \right] \leq \max_{i=1,\ldots,n} f(x_i) + \varepsilon \leq F(t) + \varepsilon. \end{align}\] This holds for all \(t' \in [t, \overline{t}]\).

Combined, we have shown that \(|F(t) - F(t')| \leq 2\varepsilon\) for all \(t' \in [\underline{t}, \overline{t}]\), i.e., \(F\) is continuous.

A couple of comments about reading Berge’s book:

- There is a blanket assumption that topological spaces are Hausdorff on page 65 (on the page before, “separated” is said to mean “Hausdorff”). I’m not sure if that applies to the rest of that chapter or to the rest of the book, but (in any case) metric spaces are Hausdorff.
- The abbreviation “u.s.c.” which appears in the Maximum Theorem is defined on page 109. Applied to \(\Gamma\), it includes the requirement that each image of \(\Gamma\) is compact.
- The Maximum Theorem on page 116 is a consequence of earlier statements that look at upper and lower semicontinuity separately.

Another reference for such results is the book by Bonnans and Shapiro (2000, Prop. 4.4).

If you are interested in the differential of the maximum value, check out Danskin’s theorem.

## References

*Topological Spaces: Including a Treatment of Multi-Valued Functions, Vector Spaces, and Convexity*. Oliver; Boyd, Ltd.

*Perturbation Analysis of Optimization Problems*. Springer New York. https://doi.org/10.1007/978-1-4612-1394-9.