
17.

Convex Set
Given an affine space (𝐴, 𝑉), where 𝑉 is a vector space over an ordered field 𝐹 (when discussing affine combinations, the orderedness of 𝐹 is not required), a subset 𝐶 ⊆ 𝐴 is a convex set if for any two points 𝑥, 𝑦 ∈ 𝐶 and any 0 < 𝜃 < 1, we have 𝜃𝑥 + (1 − 𝜃)𝑦 ∈ 𝐶. Obviously, an affine subspace of (𝐴, 𝑉) is a convex set. The dimension of a convex set 𝐶 is the dimension of the affine hull of 𝐶. Given a subset of points 𝑆 ⊆ 𝐴, the smallest convex set that contains 𝑆 is called the convex hull of 𝑆, denoted as 𝐶(𝑆). Given a set of points $S = \{p_1, \dots, p_n\}$, a (finite) convex combination of 𝑆 is a point $\sum_{i=1}^{n} \lambda_i p_i$ s.t. $\sum_{i=1}^{n} \lambda_i = 1$ and $\lambda_i \ge 0$.
Commented [TC1]: Can be 0 ≤ 𝜃 ≤ 1 but it is not necessary, since 𝑥, 𝑦 are already in 𝐶.

One implication of convex sets for optimization search is that once a search along a direction leaves a convex set, it never returns. A search along a direction might, however, re-enter a non-convex set.

See Theorem 20-5 for an application: the situation of a linear search is reflected by the level set of the current position, where a worse and bigger level set contains a better and smaller level set. If all level sets are convex, once a search along a direction leaves a better level set, it never returns to it later along that direction, and hence misses the minimum for sure.
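The "leave and never return" behaviour can be checked numerically. The following is a minimal sketch, assuming a closed unit disc as the convex set and an annulus as the non-convex set; the helper names (`in_disc`, `in_annulus`, `membership_along_ray`, `count_entries`) are illustrative and not part of the original text. It samples a straight search path and counts how many times the path enters each set.

```python
import math

def in_disc(p, radius=1.0):
    """Membership test for a closed disc centered at the origin (a convex set)."""
    return math.hypot(p[0], p[1]) <= radius

def in_annulus(p, inner=0.5, outer=1.0):
    """Membership test for an annulus (a non-convex set)."""
    return inner <= math.hypot(p[0], p[1]) <= outer

def membership_along_ray(member, start, direction, steps=400, t_max=4.0):
    """Sample start + t*direction for t in [0, t_max] and record membership."""
    flags = []
    for k in range(steps + 1):
        t = t_max * k / steps
        p = (start[0] + t * direction[0], start[1] + t * direction[1])
        flags.append(member(p))
    return flags

def count_entries(flags):
    """Count how often the sampled path switches from outside to inside."""
    return sum(1 for a, b in zip(flags, flags[1:]) if (not a) and b)

# Search path: from (-2, 0) moving in the direction (1, 0).
start, direction = (-2.0, 0.0), (1.0, 0.0)
disc_entries = count_entries(membership_along_ray(in_disc, start, direction))
annulus_entries = count_entries(membership_along_ray(in_annulus, start, direction))
print("entries into disc:", disc_entries)
print("entries into annulus:", annulus_entries)
```

The path enters the disc once and, after leaving, stays outside; it enters the annulus twice because the hole splits the intersection with the line into two pieces.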

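• We first state some basic properties of convex sets.

1) Property 17-1 Like Property 16-9, any intersection of convex sets is a convex set.

2) Property 17-2 The induced vector set $V_C = \{\overrightarrow{0x} : x \in C\}$ w.r.t. any origin 0 is a convex set, although it is origin dependent. For any $\overrightarrow{0x}, \overrightarrow{0y} \in V_C$ and any $\theta \in [0,1]$, by Property 16-4 we have

$$0 + \theta\,\overrightarrow{0x} + (1-\theta)\,\overrightarrow{0y} = \theta x + (1-\theta) y \in C \;\Rightarrow\; \theta\,\overrightarrow{0x} + (1-\theta)\,\overrightarrow{0y} \in V_C$$

Conversely, given a convex vector set $V_C$, then $0 + V_C$ is convex for any origin. For any $x, y \in 0 + V_C$, $x = 0 + \mathbf{u}$, $y = 0 + \mathbf{v}$ for some $\mathbf{u}, \mathbf{v} \in V_C$. For any $\theta \in [0,1]$,
$$\theta x + (1-\theta) y = \theta 0 + \theta\mathbf{u} + (1-\theta) 0 + (1-\theta)\mathbf{v} = 0 + \theta\mathbf{u} + (1-\theta)\mathbf{v}$$
where $\theta\mathbf{u} + (1-\theta)\mathbf{v} \in V_C$ since $V_C$ is convex, so $0 + \theta\mathbf{u} + (1-\theta)\mathbf{v} \in 0 + V_C$.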

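3) Property 17-3 Given a convex set 𝐶, recall its negative set $-C := \{0 - \overrightarrow{0x} : x \in C\} = 0 - V_C$. If 𝐶 is convex, then −𝐶 is convex. For example, given a polyhedron $P: \mathbf{a}^T[x] \le b$, it is not hard to find $-P: \mathbf{a}^T[x] \ge -b$. For any $x, y \in -C$, write $x = 0 - \overrightarrow{0x'}$ and $y = 0 - \overrightarrow{0y'}$ with $x', y' \in C$; then for any $\theta \in [0,1]$

$$\theta x + (1-\theta) y = \theta 0 + (1-\theta) 0 - \theta\,\overrightarrow{0x'} - (1-\theta)\,\overrightarrow{0y'} = 0 - \left(\theta\,\overrightarrow{0x'} + (1-\theta)\,\overrightarrow{0y'}\right)$$

where $\theta\,\overrightarrow{0x'} + (1-\theta)\,\overrightarrow{0y'} \in V_C$ since $\overrightarrow{0x'}, \overrightarrow{0y'} \in V_C$ and $V_C$ is convex, then $\theta x + (1-\theta) y \in 0 - V_C = -C$ and we see −𝐶 is convex.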
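4) Property 17-4 Given two convex sets $C_1, C_2$, recall their sum is

$$C_1 + C_2 := \{0 + \overrightarrow{0x_1} + \overrightarrow{0x_2} : x_1 \in C_1, x_2 \in C_2\} = 0 + V_{C_1} + V_{C_2}$$
$C_1 + C_2$ is origin dependent, but it is always convex. First note

$$V_{C_1 + C_2} = V_{C_1} + V_{C_2} = \{\overrightarrow{0x_1} + \overrightarrow{0x_2} : x_1 \in C_1, x_2 \in C_2\}$$
Then for any $x, y \in C_1 + C_2$, $x = 0 + \overrightarrow{0x_1} + \overrightarrow{0x_2}$ for some $x_1 \in C_1, x_2 \in C_2$, and $y = 0 + \overrightarrow{0y_1} + \overrightarrow{0y_2}$ for some $y_1 \in C_1, y_2 \in C_2$, and

$$\theta x + (1-\theta) y = \theta 0 + (1-\theta) 0 + \left(\theta\,\overrightarrow{0x_1} + (1-\theta)\,\overrightarrow{0y_1}\right) + \left(\theta\,\overrightarrow{0x_2} + (1-\theta)\,\overrightarrow{0y_2}\right)$$

$$= 0 + \left(\theta\,\overrightarrow{0x_1} + (1-\theta)\,\overrightarrow{0y_1}\right) + \left(\theta\,\overrightarrow{0x_2} + (1-\theta)\,\overrightarrow{0y_2}\right)$$
Note $V_{C_1}$ and $V_{C_2}$ are both convex by Property 17-2, thus

$$\theta\,\overrightarrow{0x_1} + (1-\theta)\,\overrightarrow{0y_1} \in V_{C_1}, \quad \theta\,\overrightarrow{0x_2} + (1-\theta)\,\overrightarrow{0y_2} \in V_{C_2}$$

$$\Rightarrow \theta x + (1-\theta) y \in 0 + V_{C_1} + V_{C_2} = C_1 + C_2$$
where we see $C_1 + C_2$ is convex. Together with the previous property, $C_1 - C_2 = C_1 + (-C_2)$ and $\overrightarrow{C_2 C_1} = \{\overrightarrow{x_2 x_1} : x_1 \in C_1, x_2 \in C_2\}$ are convex if $C_1, C_2$ are convex. Note any point $x \in C_1 - C_2$ has the form $x = 0 + \overrightarrow{0x_1} - \overrightarrow{0x_2} = 0 + \overrightarrow{x_2 x_1}$, then $\overrightarrow{C_2 C_1} = V_{C_1 - C_2}$, and it is convex by Property 17-2 since $C_1 - C_2$ is convex.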
5) Property 17-5 The closure 𝐶̅ of a convex set 𝐶 is convex w.r.t. any metric (where the metric induced topology is always assumed). For any two points 𝑥, 𝑦 ∈ 𝐶̅, there exist two sequences $\{x_k\} \subseteq C$, $\{y_k\} \subseteq C$ with $x_k \to x$, $y_k \to y$, since every point in the closure is a sequential limit of points in 𝐶. Note $\theta x_k + (1-\theta) y_k \in C$ for every 𝑘, then its limit 𝜃𝑥 + (1 − 𝜃)𝑦 ∈ 𝐶̅ since 𝐶̅ is the closure.
Commented [TC2]: Not necessarily limit points, but indeed sequential limits.
6) Property 17-6 If 𝐴 is a normed affine space, then the interior int 𝐶 of a convex set 𝐶 is convex w.r.t. the norm. For any two points 𝑥, 𝑦 ∈ int 𝐶 and any $\theta \in [0,1]$, we want to prove 𝜃𝑥 + (1 − 𝜃)𝑦 ∈ int 𝐶. We can draw two open balls $B_r(x) \subseteq C$ and $B_r(y) \subseteq C$ with the same sufficiently small radius 𝑟 centered at 𝑥, 𝑦 respectively.
Then for $z = \theta x + (1-\theta) y$, we can draw a ball $B_{r'}(z)$ for some $r' \le r$ and we claim $B_{r'}(z) \subseteq C$. This is because any point 𝑤 in $B_{r'}(z)$ can be written as 𝑤 = 𝑧 + (𝑤 − 𝑧) where 𝑑(𝑧, 𝑤) = ‖𝑤 − 𝑧‖ < 𝑟, and note
$$d(x, x + w - z) = d(y, y + w - z) = \|w - z\| < r$$
$$\Rightarrow x + w - z \in B_r(x) \subseteq C, \quad y + w - z \in B_r(y) \subseteq C$$
Then we have
$$\theta(x + w - z) + (1-\theta)(y + w - z) = \theta x + (1-\theta) y + w - z = z + w - z = w$$
$$\Rightarrow w \in C, \ \forall w \in B_{r'}(z) \;\Rightarrow\; B_{r'}(z) \subseteq C$$
since 𝐶 is convex. Thus 𝑧 = 𝜃𝑥 + (1 − 𝜃)𝑦 ∈ int 𝐶.
Commented [TC3]: Since 𝑤 = 𝜃(𝑥 + 𝑤 − 𝑧) + (1 − 𝜃)(𝑦 + 𝑤 − 𝑧) and we have proved 𝑥 + 𝑤 − 𝑧 ∈ 𝐶, 𝑦 + 𝑤 − 𝑧 ∈ 𝐶.
• We present some basic convex sets in $\mathbb{R}^n$ equipped with a norm-induced metric. It has been mentioned earlier that for any fixed 𝑏 ∈ 𝐹,
$$\{(x_1, \dots, x_n) \in \mathbb{R}^n : a_1 x_1 + \dots + a_n x_n = b\} = \{x \in \mathbb{R}^n : \mathbf{a}^T\mathbf{x} = b\}$$
where $x = (x_1, \dots, x_n)$, $\mathbf{a} = (a_1, \dots, a_n)$, is a hyperplane and hence a convex set. Note it is not necessary to let $\sum_i a_i = 1$ due to our earlier remark. A lower open halfspace is defined as
$$L_{\mathbf{a},b} = \{x \in \mathbb{R}^n : \mathbf{a}^T\mathbf{x} < b\}$$
Commented [TC4]: Again, letter 𝐱 in bold stands for a vector, while the corresponding non-bold letter 𝑥 stands for a point.
Commented [TC5]: This is because all affine sets are convex.

Note this halfspace is the preimage of (−∞, 𝑏) under $f(\mathbf{x}) = \mathbf{a}^T\mathbf{x}$. 𝑓 is a linear map and hence a continuous $\mathbb{R}^n \to \mathbb{R}$ function w.r.t. any norms defined on $\mathbb{R}^n$ and ℝ by EX 23, so the halfspace is an open set, since the preimage of an open set under a continuous function is also open. The halfspace is a convex set since $\mathbf{a}^T(\theta\mathbf{x} + (1-\theta)\mathbf{y}) < \theta b + (1-\theta) b = b$ for any $x, y \in L_{\mathbf{a},b}$, 0 ≤ 𝜃 ≤ 1. An example of such a lower open halfspace is shown below
Figure 17-1 A lower open half space 2𝑥 − 0.5𝑦 + 𝑧 < 2 in $\mathbb{R}^3$

Similarly, we can define the upper open halfspace $U_{\mathbf{a},b}$, the lower closed halfspace $\bar{L}_{\mathbf{a},b}$, and the upper closed halfspace $\bar{U}_{\mathbf{a},b}$ as follows, and they are all convex.
$$U_{\mathbf{a},b} = \{x \in \mathbb{R}^n : \mathbf{a}^T\mathbf{x} > b\}$$
$$\bar{L}_{\mathbf{a},b} = \{x \in \mathbb{R}^n : \mathbf{a}^T\mathbf{x} \le b\}$$
$$\bar{U}_{\mathbf{a},b} = \{x \in \mathbb{R}^n : \mathbf{a}^T\mathbf{x} \ge b\}$$
Define the (closed) polyhedron $P = \bigcap_{i=1}^{m} \bar{L}_{\mathbf{a}_i, b_i}$ as a non-empty intersection of finitely many lower closed halfspaces. Since any intersection of convex sets is convex, a polyhedron is convex. Halfspaces and polyhedra can be defined for general affine spaces over field ℝ by exactly the same technique as in Lemma 16-6, and the definition is affine frame independent.
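As a small numerical illustration of the polyhedron definition, the sketch below represents a polyhedron by a list of rows $\mathbf{a}_i$ and bounds $b_i$ and checks that sampled convex combinations of two member points stay inside. The unit square and the helper names `in_polyhedron` and `convex_combination` are assumptions chosen for the example, not notation from the text. Exact rational arithmetic is used so that boundary points are tested without rounding.

```python
from fractions import Fraction

def in_polyhedron(A, b, x):
    """Return True if a_i . x <= b_i holds for every row a_i of A."""
    return all(sum(a_j * x_j for a_j, x_j in zip(row, x)) <= b_i
               for row, b_i in zip(A, b))

def convex_combination(theta, x, y):
    """Return theta*x + (1 - theta)*y coordinate-wise."""
    return tuple(theta * xi + (1 - theta) * yi for xi, yi in zip(x, y))

# Unit square 0 <= x <= 1, 0 <= y <= 1 as four lower closed halfspaces.
A = [(1, 0), (-1, 0), (0, 1), (0, -1)]
b = [1, 0, 1, 0]

x = (Fraction(0), Fraction(1))        # a corner of the square
y = (Fraction(1), Fraction(1, 3))     # a point on the right edge
thetas = [Fraction(k, 10) for k in range(11)]
all_inside = all(in_polyhedron(A, b, convex_combination(t, x, y)) for t in thetas)
print("all sampled convex combinations inside:", all_inside)
print("(2, 0) inside:", in_polyhedron(A, b, (2, 0)))
```

A finite sample of course proves nothing; the check only mirrors the one-line argument that each halfspace inequality is preserved under convex combination.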

We should note that whether a halfspace is “upper” or “lower” depends on how we “view” the halfspace. For example, $\mathbf{a}^T\mathbf{x} = b$ and $-\mathbf{a}^T\mathbf{x} = -b$ stand for the same hyperplane 𝐻; however, $\{x \in \mathbb{R}^n : \mathbf{a}^T\mathbf{x} > b\}$ is an upper halfspace of 𝐻 in view of “$\mathbf{a}^T\mathbf{x} = b$” and a lower halfspace of 𝐻 in view of “$-\mathbf{a}^T\mathbf{x} = -b$”.
A cone $K_V$ in a vector space 𝑉 is a subset of 𝑉 satisfying $\mathbf{v} \in K_V \Rightarrow \lambda\mathbf{v} \in K_V, \ \forall\lambda > 0$; as a special case, a cone in the vector space $\mathbb{R}^n$ is a subset $K_V \subseteq \mathbb{R}^n$ with $\mathbf{x} \in K_V \Rightarrow \lambda\mathbf{x} \in K_V, \ \forall\lambda > 0$. A set 𝐾 in affine space (𝐴, 𝑉) with some chosen origin 0 is a cone if $K = 0 + K_V$ where $K_V$ is a cone in 𝑉, called the vector cone of 𝐾. Equivalently, 𝐾 is a cone if
$$p \in K \Rightarrow 0 + \lambda\,\overrightarrow{0p} \in K, \ \forall\lambda > 0$$
where the vector cone is $K_V = \{\overrightarrow{0p} : p \in K\}$. By this definition, a cone need not be convex, might not contain the origin, and might be neither an open nor a closed set, as shown in Figure 17-2, which is a union of 𝑦 = 𝑥 and 𝑦 = −𝑥 with the origin removed. However, the closure of a cone must contain the origin, since 0 is the limit point of the sequence $p_n := 0 + \frac{1}{n}\overrightarrow{0p}$ as 𝑛 → ∞ for any 𝑝 ∈ 𝐾. The closure of the graph in Figure 17-2 is simply a union of 𝑦 = 𝑥 and 𝑦 = −𝑥.

Figure 17-2 A simple example of cone that does not contain the origin. Its closure contains the origin.

Commented [XY6]: Let $K'$ denote the set in the equivalent definition to show. For any 𝑝 ∈ 𝐾, $p = 0 + \overrightarrow{0p}$, so clearly $\overrightarrow{0p} \in K_V \Rightarrow \lambda\,\overrightarrow{0p} \in K_V, \ \forall\lambda > 0 \Rightarrow 0 + \lambda\,\overrightarrow{0p} \in 0 + K_V = K$. Thus, any 𝑝 ∈ 𝐾 satisfies the condition in the second definition. Conversely, let $\overrightarrow{0K'}$ be the induced vector set of $K'$. Any $p \in K'$ satisfies $p \in K' \Rightarrow 0 + \lambda\,\overrightarrow{0p} \in K'$, thus $\overrightarrow{0p}$ must satisfy $\overrightarrow{0p} \in \overrightarrow{0K'} \Rightarrow \lambda\,\overrightarrow{0p} \in \overrightarrow{0K'}$, which means any induced vector in $\overrightarrow{0K'}$ satisfies the condition of a vector cone. $K' = 0 + \overrightarrow{0K'}$ where $\overrightarrow{0K'}$ is a vector cone, satisfying the first definition.

A convex cone in a vector space is defined as
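$$K_V = \{\mathbf{v} \in V : \mathbf{u}, \mathbf{v} \in K_V \Rightarrow \lambda\mathbf{u} + \mu\mathbf{v} \in K_V, \ \forall\lambda, \mu > 0\}$$
Commented [TC7]: This is not a convex sum and there is no requirement like “𝜆 + 𝜇 = 1”.
and a set 𝐾 in 𝐴 is a convex cone if $K = 0 + K_V$ where $K_V$ is a convex cone in 𝑉, or equivalently

$$K = \{p \in A : p, q \in K \Rightarrow 0 + \lambda\,\overrightarrow{0p} + \mu\,\overrightarrow{0q} \in K, \ \forall\lambda, \mu > 0\}$$

A convex cone is a cone: $\mathbf{v} \in K_V \Rightarrow \lambda\mathbf{v} = \frac{\lambda}{2}\mathbf{v} + \frac{\lambda}{2}\mathbf{v} \in K_V, \ \forall\lambda > 0$, implying $K_V$ is a cone, and so is 𝐾.
A cone is convex iff it is a convex cone. The definition of convex cone does not directly use convexity, so we need to show this. Necessity. A cone $K_V$ being convex implies
$$\mathbf{u}, \mathbf{v} \in K_V \Rightarrow \lambda\mathbf{u} + \mu\mathbf{v} = (\lambda + \mu)\left(\frac{\lambda}{\lambda+\mu}\mathbf{u} + \frac{\mu}{\lambda+\mu}\mathbf{v}\right) \in K_V$$
for any 𝜆, 𝜇 > 0, so it is a convex cone. Sufficiency. The definition of convex cone says $\mathbf{u}, \mathbf{v} \in K_V \Rightarrow \theta\mathbf{u} + (1-\theta)\mathbf{v} \in K_V$ for any 0 < 𝜃 < 1, so a convex cone is indeed convex. The same claim holds for 𝐾 since $K = 0 + K_V$ is a convex cone iff $K_V$ is: if 𝐾 is a convex cone, then by definition $K_V$ is a convex cone and hence convex, then 𝐾 is convex; if 𝐾 is convex, then $K_V$ is convex and hence a convex cone, then 𝐾 is a convex cone.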
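Examples of convex cones in $\mathbb{R}^n$ include
1) Any vector subspace of $\mathbb{R}^n$ is a convex cone, which immediately follows from the definition. That is, a line, a plane, a hyperplane through the origin are all special cases of convex cones.
2) A polyhedral cone is defined as a special type of polyhedron $K = \bigcap_{i=1}^{m} \bar{L}_{\mathbf{a}_i, 0}$. We can verify that this is a convex cone by
$$\mathbf{a}_i^T\mathbf{x} \le 0, \ \mathbf{a}_i^T\mathbf{y} \le 0 \;\Rightarrow\; \mathbf{a}_i^T(\lambda\mathbf{x} + \mu\mathbf{y}) \le 0, \quad \forall\mathbf{x}, \mathbf{y} \in K, \ \lambda, \mu > 0$$
3) A norm cone is defined as $K_{\|\cdot\|} = \{(\mathbf{x}, y) \in \mathbb{R}^{n+1} : \mathbf{x} \in \mathbb{R}^n, y \ge 0, \|\mathbf{x}\| \le y\}$. It is a convex cone since for any 𝜆, 𝜇 > 0 and $(\mathbf{x}_1, y_1), (\mathbf{x}_2, y_2) \in K_{\|\cdot\|}$, the point $\lambda(\mathbf{x}_1, y_1) + \mu(\mathbf{x}_2, y_2)$ satisfies
$$\|\lambda\mathbf{x}_1 + \mu\mathbf{x}_2\| \le \lambda\|\mathbf{x}_1\| + \mu\|\mathbf{x}_2\| \le \lambda y_1 + \mu y_2$$
and hence $\lambda(\mathbf{x}_1, y_1) + \mu(\mathbf{x}_2, y_2) \in K_{\|\cdot\|}$.

[Figure panels: 1-norm cone $|x| + |y| \le z, z \ge 0$; 2-norm cone $(|x|^2 + |y|^2)^{1/2} \le z, z \ge 0$; 3-norm cone $(|x|^3 + |y|^3)^{1/3} \le z, z \ge 0$; max-norm cone $\max(|x|, |y|) \le z, z \ge 0$]
Figure 17-3 Examples of norm cones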
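The triangle-inequality argument for norm cones can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `p_norm`, `in_norm_cone` and `conic_combination`, the two sample points, and the tolerance `tol` (needed because floating point is used) are all choices made for this example. For each of the four norms of Figure 17-3 it takes two boundary points of the cone and tests that a conic combination with positive coefficients is still in the cone.

```python
import math

def p_norm(x, p):
    """p-norm of a coordinate tuple; p may be math.inf for the max norm."""
    if p == math.inf:
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def in_norm_cone(x, y, p, tol=1e-9):
    """Membership in {(x, y): y >= 0, ||x||_p <= y}, with a small float tolerance."""
    return y >= 0 and p_norm(x, p) <= y + tol

def conic_combination(lam, mu, point1, point2):
    """Return lam*(x1, y1) + mu*(x2, y2) for lam, mu > 0."""
    (x1, y1), (x2, y2) = point1, point2
    x = tuple(lam * u + mu * v for u, v in zip(x1, x2))
    return x, lam * y1 + mu * y2

results = {}
for p in (1, 2, 3, math.inf):
    # Two boundary points of the p-norm cone: y equals the norm of x.
    pt1 = ((3.0, -4.0), p_norm((3.0, -4.0), p))
    pt2 = ((-1.0, 2.0), p_norm((-1.0, 2.0), p))
    x, y = conic_combination(2.5, 0.75, pt1, pt2)
    results[p] = in_norm_cone(x, y, p)
print(results)
```

Note that no condition such as 𝜆 + 𝜇 = 1 is imposed on the coefficients, in line with the definition of a convex cone.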

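• Lemma 17-1 A subset 𝐶 of 𝐴 is convex iff the convex combination of any finitely many points $p_1, \dots, p_n \in C$ is still in 𝐶. Proof is the same as Lemma 16-1. One direction is trivial: closedness under convex combinations immediately implies 𝜃𝑥 + (1 − 𝜃)𝑦 ∈ 𝐶 for any 𝑥, 𝑦 ∈ 𝐶. The other direction follows by showing that every convex combination can be computed as the convex combination of two points at a time: for $\lambda_1 < 1$,

$$\sum_{i=1}^{n} \lambda_i p_i = \lambda_1 p_1 + (1 - \lambda_1) \sum_{i=2}^{n} \frac{\lambda_i}{1 - \lambda_1} p_i$$

where $\sum_{i=2}^{n} \frac{\lambda_i}{1-\lambda_1} p_i$ is another convex combination since $\sum_{i=2}^{n} \frac{\lambda_i}{1-\lambda_1} = \frac{1-\lambda_1}{1-\lambda_1} = 1$ and $\frac{\lambda_i}{1-\lambda_1} \ge 0$.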

A corollary of this lemma is that for any subset 𝑆 ⊆ 𝐴, its convex hull 𝐶(𝑆) is the set of all possible finite convex combinations, i.e.

$$C(S) = \left\{\sum_{i \in I} \lambda_i p_i \text{ is a finite convex combination} : \{p_i\}_{i \in I} \subseteq S, \ |I| < +\infty\right\}$$

By the above lemma, every convex set containing 𝑆 must contain every element of the above-defined 𝐶(𝑆), so no smaller set can be a convex set containing 𝑆.
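The two-points-at-a-time reduction used in the proof of Lemma 17-1 translates directly into a short recursive routine. This is a minimal sketch, assuming points are coordinate tuples and weights are exact rationals; the function names `combine_two`, `convex_combination_pairwise` and `convex_combination_direct` are illustrative only.

```python
from fractions import Fraction

def combine_two(theta, x, y):
    """Two-point convex combination theta*x + (1 - theta)*y."""
    return tuple(theta * xi + (1 - theta) * yi for xi, yi in zip(x, y))

def convex_combination_pairwise(weights, points):
    """Evaluate sum_i w_i p_i using only two-point convex combinations."""
    if len(points) == 1 or weights[0] == 1:
        return points[0]
    w1 = weights[0]
    # Rescale the remaining weights so that they sum to one again.
    rescaled = [w / (1 - w1) for w in weights[1:]]
    tail = convex_combination_pairwise(rescaled, points[1:])
    return combine_two(w1, points[0], tail)

def convex_combination_direct(weights, points):
    """Evaluate sum_i w_i p_i coordinate by coordinate."""
    dim = len(points[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, points)) for k in range(dim))

weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
points = [(Fraction(0), Fraction(0)), (Fraction(6), Fraction(0)), (Fraction(0), Fraction(6))]
pairwise = convex_combination_pairwise(weights, points)
direct = convex_combination_direct(weights, points)
print(pairwise == direct, direct)
```

Each recursive call only ever forms a convex combination of two points, which is exactly what the definition of a convex set allows.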
• Property 17-7 With the same proof as Property 16-26, given another affine space 𝐵 and any affine map 𝒜: 𝐴 → 𝐵, the image 𝒜(𝐶) of any convex set 𝐶 of 𝐴 under 𝒜 is convex. As a special case, ℒ(𝐶) is convex for any linear map ℒ. Conversely, given any convex set 𝐷 in 𝐵, the preimage of 𝐷, denoted by $\mathcal{A}^{-1}(D)$, is also convex. Like the counterexample in Property 16-27, given an arbitrary set 𝑆 in 𝐴, 𝒜(𝑆) being a convex set does not imply 𝑆 is convex.
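• Property 17-8 Exchangeability of affine map and convex hull operator. For any 𝑆 ⊆ 𝐴, $\mathcal{A}(C(S)) = C(\mathcal{A}(S))$, analogous to Property 16-27. For any finitely many points $p_1, \dots, p_n$ in 𝑆, we have $\mathcal{A}(p_i) \in \mathcal{A}(S)$ and by Lemma 17-1 $\sum_i \lambda_i p_i \in C(S)$ for any convex combination coefficients $\{\lambda_i\}$ s.t. $\lambda_i \ge 0$, $\sum_i \lambda_i = 1$, then

$$\mathcal{A}\left(\sum_i \lambda_i p_i\right) = \sum_i \lambda_i \mathcal{A}(p_i) \in C(\mathcal{A}(S))$$
Conversely, for any $\mathcal{A}(p_1), \dots, \mathcal{A}(p_n) \in \mathcal{A}(S)$ with $p_1, \dots, p_n$ as corresponding preimages in 𝑆 (each $\mathcal{A}(p_i)$ might have multiple preimages, but just choose one), then for any $\{\lambda_i\}$ s.t. $\lambda_i \ge 0$, $\sum_i \lambda_i = 1$ we have $\sum_i \lambda_i \mathcal{A}(p_i) \in C(\mathcal{A}(S))$ and

$$\sum_i \lambda_i \mathcal{A}(p_i) = \mathcal{A}\left(\sum_i \lambda_i p_i\right) \in \mathcal{A}(C(S))$$

As a special case, for any 𝑆 ⊂ 𝐴, if 𝒜(𝑆) is a convex set in 𝐵, then $\mathcal{A}(C(S)) = \mathcal{A}(S)$, simply due to $\mathcal{A}(C(S)) = C(\mathcal{A}(S)) = \mathcal{A}(S)$.

As a corollary, for any point 𝑎 ∈ 𝑆, we have $C(\overrightarrow{aS}) = \overrightarrow{a\,C(S)}$. We define the affine function 𝒜: 𝐴 → 𝑉 by $\mathcal{A}(p) = \overrightarrow{ap}$, ∀𝑝 ∈ 𝐴. Thus, $C(\overrightarrow{aS}) = C(\mathcal{A}(S))$ and $\overrightarrow{a\,C(S)} = \mathcal{A}(C(S))$, and the claim follows immediately from the above property.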
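• Lemma 17-2 Given a set of points $S = \{p_1, \dots, p_n\}$, 𝐶(𝑆) is the set of all convex combinations of 𝑆. If 𝑆 is an affine independent set of size 𝑛, then 𝐶(𝑆) is of dimension 𝑛 − 1 and named an 𝑛-simplex, denoted as Δ. Proof is the same as Theorem 16-3. By Lemma 17-1, a convex set containing $p_1, \dots, p_n$ cannot be smaller than the set of all their convex combinations, otherwise at least one convex combination of $p_1, \dots, p_n$ would be missing from it. We only need to show that the set of all convex combinations, written 𝐶(𝑆) below, is a convex set. Let 𝑥, 𝑦 be any two points in 𝐶(𝑆), then for any 0 ≤ 𝜃 ≤ 1 we have

$$\theta x + (1-\theta) y = \theta\sum_{i=1}^{n} \lambda_i p_i + (1-\theta)\sum_{i=1}^{n} \mu_i p_i = \sum_{i=1}^{n} \left(\theta\lambda_i + (1-\theta)\mu_i\right) p_i$$

where $\sum_i \lambda_i = \sum_i \mu_i = 1$ and $\lambda_i \ge 0$, $\mu_i \ge 0$, then

$$\sum_{i=1}^{n} \left(\theta\lambda_i + (1-\theta)\mu_i\right) = \theta\sum_{i=1}^{n} \lambda_i + (1-\theta)\sum_{i=1}^{n} \mu_i = \theta + 1 - \theta = 1$$

and $\theta\lambda_i + (1-\theta)\mu_i \ge 0$. Thus 𝐶(𝑆) is a convex set.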


• Lemma 17-3 $A(C(S)) = A(S)$, where 𝐴(⋅) denotes affine hull. Note a maximum affine independent set 𝒮 in 𝑆 is also a maximum affine independent set in 𝐶(𝑆). This is first by Lemma 17-1 that every point in 𝐶(𝑆) is an affine combination of finitely many points in 𝑆, and hence an affine combination of 𝒮, and then by Property 16-14 𝒮 is a maximum affine independent set of 𝐶(𝑆). Then by Property 16-16, both $A(C(S))$ and $A(S)$ are the span of 𝒮, thus $A(C(S)) = A(S)$.
Commented [TC8]: If there exists an affine independent set $\mathcal{S}' \subseteq C(S)$ s.t. $|\mathcal{S}'| > |\mathcal{S}|$, then since every point in $\mathcal{S}'$ is an affine combination of 𝒮, $\mathcal{S}'$ is actually affine dependent by Property 50, hence the contradiction.
[Figure: plot of three points with their affine hull and their convex hull]

Figure 17-4 An example of affine hull and convex hull of three points in ℝ

• Define the convex hull of two different points 𝑥, 𝑦 as a line segment, denoted as [𝑥, 𝑦], i.e. [𝑥, 𝑦] = {𝜃𝑥 + (1 − 𝜃)𝑦: 𝜃 ∈ [0,1]}. The definition of convex set can hence be re-written as: a set 𝐶 is convex if for any two points 𝑥, 𝑦 ∈ 𝐶 we have [𝑥, 𝑦] ⊆ 𝐶. Define the relative interior (a more general concept with the same name will be defined later) of [𝑥, 𝑦] as (𝑥, 𝑦) = {𝜃𝑥 + (1 − 𝜃)𝑦: 𝜃 ∈ (0,1)}, i.e. (𝑥, 𝑦) = [𝑥, 𝑦]\{𝑥, 𝑦}. Similarly, define [𝑥, 𝑦) = [𝑥, 𝑦]\{𝑦}, (𝑥, 𝑦] = [𝑥, 𝑦]\{𝑥}. Note the line segment here is an abstract concept (a convex hull of two points, i.e. a simplex Δ on two points) without any additional machinery like a metric.
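Lemma 17-4 Given a line segment [𝑎, 𝑏] and a point 𝑝 ∈ (𝑎, 𝑏), then [𝑎, 𝑏] = (𝑎, 𝑝)⋃(𝑝, 𝑏)⋃{𝑎, 𝑏, 𝑝} = [𝑎, 𝑝]⋃[𝑝, 𝑏]. For any 𝑞 ∈ [𝑎, 𝑏]\{𝑎, 𝑏, 𝑝}, we have 𝑞 = 𝜃𝑎 + (1 − 𝜃)𝑏 for some 0 < 𝜃 < 1, also
$$p = \lambda a + (1-\lambda) b \;\Rightarrow\; \begin{cases} a = \dfrac{1}{\lambda}\left(p - (1-\lambda) b\right) = \dfrac{1}{\lambda} p + \dfrac{\lambda - 1}{\lambda} b \\[2ex] b = \dfrac{1}{1-\lambda}\left(p - \lambda a\right) = \dfrac{1}{1-\lambda} p + \dfrac{\lambda}{\lambda - 1} a \end{cases}$$
for some 0 < 𝜆 < 1, 𝜆 ≠ 𝜃. Rewrite 𝑞 as an affine combination of 𝑎, 𝑝 and of 𝑝, 𝑏 respectively and we have
$$q = \theta a + (1-\theta)\left(\frac{1}{1-\lambda} p + \frac{\lambda}{\lambda-1} a\right) = \left(\theta + \frac{(1-\theta)\lambda}{\lambda - 1}\right) a + \frac{1-\theta}{1-\lambda} p = \frac{\theta - \lambda}{1-\lambda} a + \frac{1-\theta}{1-\lambda} p$$
$$q = \theta\left(\frac{1}{\lambda} p + \frac{\lambda-1}{\lambda} b\right) + (1-\theta) b = \frac{\theta}{\lambda} p + \left(\frac{\theta(\lambda-1)}{\lambda} + (1-\theta)\right) b = \frac{\theta}{\lambda} p + \frac{\lambda - \theta}{\lambda} b$$

Note either 𝜆 > 𝜃 or 𝜆 < 𝜃. When 𝜆 > 𝜃 we find 𝑞 is a convex combination of 𝑝, 𝑏 and hence 𝑞 ∈ (𝑝, 𝑏); when 𝜆 < 𝜃, we find 𝑞 is a convex combination of 𝑎, 𝑝 and hence 𝑞 ∈ (𝑎, 𝑝). It is easy to further verify that 𝜆 < 𝜃 iff 𝑞 ∈ [𝑎, 𝑝) and 𝜆 > 𝜃 iff 𝑞 ∈ (𝑝, 𝑏] (note 𝑞 = 𝑎 ⇒ 𝜃 = 1 ⇒ 𝜆 < 𝜃, and 𝑞 = 𝑏 ⇒ 𝜃 = 0 ⇒ 𝜆 > 𝜃).
Commented [TC9]: 𝑞 is nearer to 𝑎 than 𝑝, so its “weight” 𝜃 on 𝑎 should be larger than 𝑝’s “weight” 𝜆.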
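The two rewritings of 𝑞 in Lemma 17-4 can be verified with exact arithmetic. The following sketch is an illustration only: the segment end points, the value of 𝜆, and the helper names `mix` and `q_via_subsegment` are assumptions made for the example. It rebuilds 𝑞 from 𝑝 and one end point, choosing the sub-segment by comparing 𝜃 with 𝜆, and compares the result with 𝜃𝑎 + (1 − 𝜃)𝑏.

```python
from fractions import Fraction as Fr

def mix(t, x, y):
    """Return t*x + (1 - t)*y for points given as coordinate tuples."""
    return tuple(t * xi + (1 - t) * yi for xi, yi in zip(x, y))

def q_via_subsegment(theta, lam, a, b):
    """Rebuild q = theta*a + (1-theta)*b from p = lam*a + (1-lam)*b and one end point."""
    p = mix(lam, a, b)
    if theta > lam:   # q lies in [a, p): weight (theta-lam)/(1-lam) on a, the rest on p
        return "[a,p)", mix((theta - lam) / (1 - lam), a, p)
    if theta < lam:   # q lies in (p, b]: weight theta/lam on p, the rest on b
        return "(p,b]", mix(theta / lam, p, b)
    return "{p}", p

a, b, lam = (Fr(0), Fr(0)), (Fr(4), Fr(2)), Fr(1, 2)
checks = []
for theta in (Fr(1), Fr(3, 4), Fr(1, 2), Fr(1, 5), Fr(0)):
    side, q = q_via_subsegment(theta, lam, a, b)
    checks.append(q == mix(theta, a, b))
    print(theta, side, q)
print(all(checks))
```

In both branches the two weights are non-negative and sum to one, which is why 𝑞 is a convex (not merely affine) combination of the chosen pair.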

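Lemma 17-5 If 𝑝 ∈ (𝑎, 𝑏), then for any 𝑥 ∈ [𝑎, 𝑝), 𝑦 ∈ (𝑝, 𝑏] we have 𝑝 ∈ (𝑥, 𝑦). By Property 16-5 of affine combination, we can solve the following as if it were a normal equation group,
$$\begin{cases} x = \theta_1 a + (1-\theta_1) b \\ y = \theta_2 a + (1-\theta_2) b \\ \theta_1 \ne \theta_2 \end{cases} \;\Rightarrow\; \begin{cases} a = \dfrac{(\theta_2 - 1) x + (1-\theta_1) y}{\theta_2 - \theta_1} \\[2ex] b = \dfrac{\theta_2 x - \theta_1 y}{\theta_2 - \theta_1} \end{cases}$$

in which 𝑎, 𝑏 are both expressed by 𝑥, 𝑦 as valid affine combinations, then
$$p = \lambda a + (1-\lambda) b = \lambda\,\frac{(\theta_2 - 1) x + (1-\theta_1) y}{\theta_2 - \theta_1} + (1-\lambda)\,\frac{\theta_2 x - \theta_1 y}{\theta_2 - \theta_1}$$
$$= \frac{\lambda(\theta_2 - 1) + (1-\lambda)\theta_2}{\theta_2 - \theta_1}\, x + \frac{\lambda(1-\theta_1) - (1-\lambda)\theta_1}{\theta_2 - \theta_1}\, y$$
$$= \frac{\theta_2 - \lambda}{\theta_2 - \theta_1}\, x + \frac{\lambda - \theta_1}{\theta_2 - \theta_1}\, y$$

By the previous lemma, $x \in [a, p), \ y \in (p, b] \Rightarrow \theta_2 < \lambda < \theta_1 \Rightarrow \frac{\theta_2 - \lambda}{\theta_2 - \theta_1} > 0, \ \frac{\lambda - \theta_1}{\theta_2 - \theta_1} > 0 \Rightarrow p \in (x, y)$.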

• Theorem 17-1 Carathéodory’s theorem. For any subset 𝑆 ⊆ 𝐴, if dim 𝐶(𝑆) = 𝑛, then 𝐶(𝑆) can be written as the following,

$$C(S) = \left\{\sum_{i=1}^{n+1} \lambda_i p_{x,i} \text{ is a convex combination} : \{p_{x,1}, \dots, p_{x,n+1}\} \subseteq S \text{ is some affine independent set}\right\}$$
Commented [TC10]: For affine hull, the choice of maximum affine independent set is arbitrary, but for convex hull, the choice has to be a particular set.

To be clearer, any point 𝑥 ∈ 𝐶(𝑆) can be represented by a convex
combination of 𝑛 + 1 affine independent points $\{p_{x,1}, \dots, p_{x,n+1}\} \subseteq S$,
where 𝑛 is the dimension of the convex hull, and the subscript “𝑥”
in $\{p_{x,1}, \dots, p_{x,n+1}\}$ means the choice of the affine independent set is
dependent on 𝑥 . For example, on a simplex, every point can be
expressed as a convex combination of the same set of affine
independent set; however, in a rectangle in ℝ , as shown on the
right, a point 𝑥 in the lower part of the rectangle is some convex
combination of 𝑝 , 𝑝 and 𝑝 but cannot be written as a convex
combination of 𝑝 , 𝑝 , 𝑝 ; and a point 𝑥 in the upper part of the
rectangle is some convex combination of 𝑝 , 𝑝 , 𝑝 , but cannot be
written as a convex combination of 𝑝 , 𝑝 , 𝑝 . Of course, the choice is not unique, 𝑥 can also be
written as some convex combination of 𝑝 , 𝑝 , 𝑝 , 𝑥 can also be written as some convex combination
of 𝑝 , 𝑝 , 𝑝 . Similar problem for affine hull is much easier and has been addressed in Property 16-16.
In contrast, all points in the affine hull can be written as affine combinations of one and the same set of affine independent points in 𝑆, and that set can be an arbitrary maximum affine independent set in 𝑆.
Equivalently and more concisely, every point in 𝐶(𝑆) is a convex combination of no more than 𝑛 + 1 points in 𝑆, since we can put zero coefficients for some points in the affine independent set.
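First note dim 𝐶(𝑆) = 𝑛 indicates a maximum affine independent set in 𝑆 is of size 𝑛 + 1. We prove by contradiction, assuming that there exists a point 𝑎 in the convex hull that can only be expressed as a convex combination of at least 𝑛 + 2 points, then $a = \sum_{i=1}^{m} \lambda_i p_i$ where 𝑚 ≥ 𝑛 + 2 and $\lambda_i \ne 0$ (hence $\lambda_i > 0$) for every 𝑖 = 1, … , 𝑚. We want to show that 𝑎 can actually be expressed as a convex combination of 𝑚 − 1 points of $\{p_1, \dots, p_m\}$. Notice $p_1, \dots, p_m$ must be affine dependent since dim 𝐶(𝑆) = 𝑛 and 𝑚 is at least 𝑛 + 2, then by Theorem 16-10, choosing any origin 𝑜 ∈ 𝑆, there exists another set of coefficients $\{\mu_i\}$ s.t.
$$\sum_{i=1}^{m} \mu_i\,\overrightarrow{op_i} = \mathbf{0} \quad \text{with} \quad \sum_{i=1}^{m} \mu_i = 0$$
where the $\mu_i$ are not all zero, so $\{\mu_i\}$ must contain some positive values, some negative values, and could contain zeros. Then by Property 16-4 $\sum_i \lambda_i p_i = o + \sum_i \lambda_i\,\overrightarrow{op_i}$, and we have

$$a = \sum_{i=1}^{m} \lambda_i p_i = o + \sum_{i=1}^{m} \lambda_i\,\overrightarrow{op_i} + \mathbf{0} = o + \sum_{i=1}^{m} \lambda_i\,\overrightarrow{op_i} + \alpha\sum_{i=1}^{m} \mu_i\,\overrightarrow{op_i} = o + \sum_{i=1}^{m} (\lambda_i + \alpha\mu_i)\,\overrightarrow{op_i}$$

The key of the above equation is the multiplier 𝛼. Note the coefficients sum to one regardless of 𝛼:

$$\sum_{i=1}^{m} (\lambda_i + \alpha\mu_i) = \sum_{i=1}^{m} \lambda_i + \alpha\sum_{i=1}^{m} \mu_i = 1 + \alpha \times 0 = 1$$

Then the objective is to find some 𝛼 s.t. at least one of $\lambda_i + \alpha\mu_i$ is zero but $\lambda_i + \alpha\mu_i \ge 0$ for 𝑖 = 1, … , 𝑚, so that $\sum_i (\lambda_i + \alpha\mu_i) p_i$ is still a convex combination but has only 𝑚 − 1 effective terms with non-zero coefficients. Such 𝛼 can be found by
$$\alpha = -\frac{\lambda_{i^*}}{\mu_{i^*}}, \quad i^* = \arg\max_{i = 1, \dots, m} \left\{-\frac{\lambda_i}{\mu_i} : \mu_i > 0\right\}$$

Note $\left\{-\frac{\lambda_i}{\mu_i} : \mu_i > 0\right\}$ is non-empty since there must be some positive $\mu_i$ as mentioned above. Clearly 𝛼 < 0 and $\lambda_{i^*} + \alpha\mu_{i^*} = 0$. Since 𝛼 < 0, for $\mu_i \le 0$, $\lambda_i + \alpha\mu_i \ge 0$ holds; for $\mu_i > 0$,
$$-\frac{\lambda_i}{\mu_i} \le -\frac{\lambda_{i^*}}{\mu_{i^*}} \Rightarrow \frac{\lambda_i}{\mu_i} \ge \frac{\lambda_{i^*}}{\mu_{i^*}} \Rightarrow \lambda_i \ge \frac{\lambda_{i^*}}{\mu_{i^*}}\mu_i \Rightarrow \lambda_i - \frac{\lambda_{i^*}}{\mu_{i^*}}\mu_i \ge 0$$
This implies 𝑎 can be expressed as a convex combination of $\{p_1, \dots, p_m\} \setminus \{p_{i^*}\}$, contradicting the assumption, which completes the proof.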
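The reduction step in the proof is constructive and can be turned into a small procedure. The sketch below is one possible implementation under stated assumptions, not the author's code: points are coordinate tuples in some $\mathbb{R}^d$, the affine dependence coefficients $\mu_i$ are found by Gaussian elimination on the system $\sum_i \mu_i p_i = \mathbf{0}$, $\sum_i \mu_i = 0$ (the coordinate form of the condition in the proof), and exact rationals avoid rounding. The names `null_vector` and `caratheodory_reduce` are hypothetical.

```python
from fractions import Fraction

def null_vector(rows):
    """Return one non-zero solution v of rows @ v = 0, or None if only v = 0 works."""
    rows = [[Fraction(v) for v in r] for r in rows]
    n_cols, pivots, r = len(rows[0]), [], 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        scale = rows[r][c]
        rows[r] = [v / scale for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [vi - f * vr for vi, vr in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    if not free:
        return None
    v = [Fraction(0)] * n_cols
    v[free[0]] = Fraction(1)                 # set one free variable to 1
    for row_idx, c in enumerate(pivots):     # back-substitute pivot variables
        v[c] = -rows[row_idx][free[0]]
    return v

def caratheodory_reduce(points, weights):
    """Drop points until the remaining ones are affinely independent."""
    pairs = [(tuple(Fraction(c) for c in p), Fraction(w))
             for p, w in zip(points, weights) if w != 0]
    pts, lam = [p for p, _ in pairs], [w for _, w in pairs]
    while True:
        dim = len(pts[0])
        # One row per coordinate, plus a row of ones enforcing sum(mu) = 0.
        rows = [[p[k] for p in pts] for k in range(dim)] + [[1] * len(pts)]
        mu = null_vector(rows)
        if mu is None:
            return pts, lam
        alpha = max(-l / m for l, m in zip(lam, mu) if m > 0)
        lam = [l + alpha * m for l, m in zip(lam, mu)]
        keep = [i for i, l in enumerate(lam) if l != 0]
        pts, lam = [pts[i] for i in keep], [lam[i] for i in keep]

square = [(0, 0), (1, 0), (1, 1), (0, 1), (2, 3)]
w = [Fraction(1, 4), Fraction(1, 8), Fraction(1, 4), Fraction(1, 8), Fraction(1, 4)]
target = tuple(sum(wi * Fraction(p[k]) for wi, p in zip(w, square)) for k in range(2))
red_pts, red_w = caratheodory_reduce(square, w)
rebuilt = tuple(sum(wi * p[k] for wi, p in zip(red_w, red_pts)) for k in range(2))
print(len(red_pts), rebuilt == target)
```

Each pass removes at least one point while keeping the represented point and the non-negativity of the weights, so the loop stops with at most 𝑛 + 1 points, where 𝑛 is the dimension of the convex hull (here at most 3 points in the plane).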
Given an affine space (𝐴, 𝑉), if we define a metric 𝑑 on 𝐴, we have a metric-induced topology on 𝐴. Recall that given a set of points 𝑆 ⊆ 𝐴 and a distance metric 𝑑, a point 𝑥 in 𝑆 is an interior point if there exists 𝜖 > 0 s.t. $B_\epsilon(x)$, which is an open ball centered at 𝑥 with radius 𝜖, is a subset of 𝑆.
However, this definition leads to a problem: whether a point is interior depends on the dimension of the ambient space. For example, every point in an open 2-d ball $\{(x, y) : x^2 + y^2 < 1\}$ is interior in $\mathbb{R}^2$, however none of them is interior when the ball is embedded in $\mathbb{R}^3$. Thus, we define the concept of relative interior: 𝑥 is a relative interior point of 𝑆 if there exists 𝜖 > 0 s.t. $B_\epsilon(x) \cap A(S) \subseteq S$, where 𝐴(𝑆) is the affine hull of 𝑆. Thus, for example, every point in an open 2-d ball is relative interior in $\mathbb{R}^3$. Define the relative interior of 𝑆 as the set of all relative interior points of 𝑆, denoted as ri 𝑆, and 𝑆 is called relative open if ri 𝑆 = 𝑆. Recall the closure of 𝑆 is 𝑆 union all its limit points, denoted as 𝑆̅, and 𝐴(𝑆) = 𝐴(𝑆̅) by Property 16-23. The interior of 𝑆 is denoted as int 𝑆. Relative interior and relative open sets can be viewed as extensions of the concepts of interior and open sets, because they coincide with the latter two if 𝐴(𝑆) is the whole space 𝐴. We focus our discussion on a convex set 𝐶 of 𝐴, and we assume a normed affine space, i.e. an affine space whose associated vector space is normed.
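• Property 17-9 Let 𝒜 be an affine map, then $\mathcal{A}(\bar{S}) \subseteq \overline{\mathcal{A}(S)}$ for any set 𝑆 (not just convex sets). For any $x \in \bar{S}$, $\mathcal{A}(x) \in \mathcal{A}(\bar{S})$, and suppose $x_k \to x$ where $x_k \in S$, then $\mathcal{A}(x_k) \to \mathcal{A}(x)$ by Property 16-35, thus $\mathcal{A}(x)$ is a limit point of $\{\mathcal{A}(x_k)\} \subset \mathcal{A}(S)$ and $\mathcal{A}(x) \in \overline{\mathcal{A}(S)}$.
Commented [TC11]: Affine map is continuous and bounded w.r.t. any norms defined on the affine spaces.
Further, if 𝑆̅ is compact, then $\mathcal{A}(\bar{S}) = \overline{\mathcal{A}(S)}$. Recall from real analysis that any sequence in a compact set of a metric-induced topological space has a convergent subsequence. For any $y \in \overline{\mathcal{A}(S)}$, it is the limit of a sequence $\{\mathcal{A}(x_k)\} \subset \mathcal{A}(S)$. If 𝑆̅ is compact, then $\{x_k\}$ has a subsequence converging to some limit $x \in \bar{S}$, and $\{\mathcal{A}(x_k)\}$ has a subsequence convergent to $\mathcal{A}(x)$ since an affine map is continuous by Property 16-35. By uniqueness of limit, $y = \mathcal{A}(x) \in \mathcal{A}(\bar{S})$, implying $\overline{\mathcal{A}(S)} \subseteq \mathcal{A}(\bar{S})$.
We present a counterexample of a non-compact set 𝑆 for which the equality does not hold. On $\mathbb{R}^2$, consider $\mathcal{A}: \mathbb{R}^2 \to \mathbb{R}$ defined by $\mathcal{A}(x, y) = x$, and let $S = \{(x, y) : x \in (0,1], \ y \ge \frac{1}{x}\}$, which is actually the epigraph of $f(x) = \frac{1}{x}$, 𝑥 ∈ (0,1] to be introduced later, then $\bar{S} = S$, so $\mathcal{A}(\bar{S}) = \mathcal{A}(S) = (0,1]$, but $\overline{\mathcal{A}(S)} = [0,1] \supset \mathcal{A}(\bar{S})$.
[Figure: the set $S = \bar{S} = \{(x, y) : x \in (0,1], \ y \ge \frac{1}{x}\}$ and its projection $\mathcal{A}(\bar{S}) = (0,1]$]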

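• Property 17-10 $\overline{S_1 \times S_2} = \bar{S}_1 \times \bar{S}_2$ w.r.t. any norm defined on the product space. For any $(x, y) \in \overline{S_1 \times S_2}$, let $(x_k, y_k) \to (x, y)$ where $\{(x_k, y_k)\} \subset S_1 \times S_2$, then $\{x_k\} \subset S_1$, $\{y_k\} \subset S_2$ and $x_k \to x$, $y_k \to y$, and thus $x \in \bar{S}_1, y \in \bar{S}_2 \Rightarrow (x, y) \in \bar{S}_1 \times \bar{S}_2$. Conversely, for any $(x, y) \in \bar{S}_1 \times \bar{S}_2$, let $x_k \to x$, $y_k \to y$ where $\{x_k\} \subset S_1$, $\{y_k\} \subset S_2$, and thus $(x_k, y_k) \to (x, y)$; since $(x_k, y_k) \in S_1 \times S_2$ for every 𝑘, we have $(x, y) \in \overline{S_1 \times S_2}$.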
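• Property 17-11 $\bar{S}_1 + \bar{S}_2 \subseteq \overline{S_1 + S_2}$ for any $S_1, S_2 \subseteq \mathbb{A}^n$. We use $\overline{S_1 \times S_2} = \bar{S}_1 \times \bar{S}_2$ (Property 17-10) together with Property 17-9. Define the linear map $\mathcal{L}: F^n \times F^n \to F^n$ as $\mathcal{L}(x, y) = x + y$ for any $(x, y) \in F^n \times F^n$, then it is easy to verify $\mathcal{L}(S_1 \times S_2) = S_1 + S_2$. Check that
$$\bar{S}_1 + \bar{S}_2 = \mathcal{L}(\bar{S}_1 \times \bar{S}_2) = \mathcal{L}(\overline{S_1 \times S_2}) \subseteq \overline{\mathcal{L}(S_1 \times S_2)} = \overline{S_1 + S_2}$$
Commented [TC12]: This is application of Property 73.

If at least one of $S_1, S_2$ is bounded, then $\bar{S}_1 + \bar{S}_2 = \overline{S_1 + S_2}$. WLOG suppose $S_1$ is bounded, then for any $z \in \overline{S_1 + S_2}$, there exists $x_k + y_k \to z$ where $\{x_k\} \subset S_1$, $\{y_k\} \subset S_2$. Since $S_1$ is bounded, $\{x_k\}$ is bounded, and hence $\{y_k\}$ is bounded, and $(x_k, y_k)$ has a convergent subsequence $(x_{k_j}, y_{k_j}) \to (x, y)$ for some $x \in \bar{S}_1$, $y \in \bar{S}_2$, thus $x_{k_j} \to x, \ y_{k_j} \to y \Rightarrow x_{k_j} + y_{k_j} \to x + y$, which implies 𝑧 = 𝑥 + 𝑦 (a subsequence, if it converges, must converge to the limit of the whole sequence). This means $z \in \bar{S}_1 + \bar{S}_2$ and completes the proof.
Commented [TC13]: $\{x_k + y_k\}$ is bounded since it is convergent, then $\{x_k\}$ being bounded implies $\{y_k\}$ being bounded.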
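• Property 17-12 $\overline{S_1 \cap S_2} \subseteq \bar{S}_1 \cap \bar{S}_2$. For any $x \in \overline{S_1 \cap S_2}$, let $x_k \to x$ where $\{x_k\} \subset S_1 \cap S_2$, then $\{x_k\} \subset S_1 \Rightarrow x \in \bar{S}_1$, and $\{x_k\} \subset S_2 \Rightarrow x \in \bar{S}_2$, implying $x \in \bar{S}_1 \cap \bar{S}_2$.
However, $\overline{S_1 \cap S_2} \ne \bar{S}_1 \cap \bar{S}_2$ in general. Consider $S_1 = (0,1)$ and $S_2 = (1,2)$, then $\overline{S_1 \cap S_2} = \emptyset$, but $\bar{S}_1 \cap \bar{S}_2 = \{1\}$. Property 17-22 proves the equality holds for two convex sets when their relative interiors have a non-empty intersection.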
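• Theorem 17-2 Line segment principle. Let 𝐶 be a non-empty convex set, if 𝑥 ∈ ri 𝐶 and 𝑦 ∈ 𝐶̅, then [𝑥, 𝑦) ⊆ ri 𝐶. In other words, for any line segment in 𝐶̅, if one of the end points is in ri 𝐶, then at most the other end point is not in ri 𝐶 (but on the boundary of 𝐶).
By definition, ∃𝜖 > 0 s.t. $B_\epsilon(x) \cap A(C) \subseteq C$. If 𝑦 ∈ 𝐶 as well, then for any 𝜃 ∈ (0,1) and $z_\theta = \theta x + (1-\theta) y$, draw a ball $B_{\theta\epsilon}(z_\theta)$, and $\forall p \in B_{\theta\epsilon}(z_\theta) \cap A(C)$ we have $p = z_\theta + \theta\mathbf{w}$ for some ‖𝐰‖ < 𝜖, then $x + \mathbf{w} \in B_\epsilon(x) \cap A(C)$ and
$$\theta(x + \mathbf{w}) + (1-\theta) y = p \Rightarrow p \in C \;\Rightarrow\; B_{\theta\epsilon}(z_\theta) \cap A(C) \subseteq C \;\Rightarrow\; z_\theta \in \mathrm{ri}\, C$$
[Figure: the triangle with vertices 𝑦, 𝑥, 𝑥 + 𝐰 and the points $z_\theta$, 𝑝 used in the proof]
Commented [TC14]: $\theta(x + \mathbf{w}) + (1-\theta) y = \theta x + (1-\theta) y + \theta\mathbf{w} = z_\theta + \theta\mathbf{w} = p$. This is consistent with the geometric property of a triangle: $[p, z_\theta]$ is parallel with $[x, x + \mathbf{w}]$, then $\frac{|[p, z_\theta]|}{|[x, x+\mathbf{w}]|} = \frac{|[y, z_\theta]|}{|[y, x]|} = \frac{|[y, p]|}{|[y, x+\mathbf{w}]|} = \theta$.
Commented [TC15]: Because 𝑝 is an arbitrary point in $B_{\theta\epsilon}(z_\theta) \cap A(C)$.
Now for the general case of 𝑦 ∈ 𝐶̅, let $y_k \to y$ where $\{y_k\} \subset C$. For any 𝜃 ∈ (0,1), let $z_{\theta,k} = \theta x + (1-\theta) y_k$, then $z_{\theta,k} \to z_\theta$ as $y_k \to y$. By the above discussion, $B_{\theta\epsilon}(z_{\theta,k}) \cap A(C) \subseteq C, \ \forall k$. By limit, when 𝑘 is large, we have $\|z_{\theta,k} - z_\theta\| < \frac{\theta\epsilon}{2}$, then $B_{\theta\epsilon/2}(z_\theta) \cap A(C) \subset B_{\theta\epsilon}(z_{\theta,k}) \cap A(C) \subseteq C \Rightarrow z_\theta \in \mathrm{ri}\, C$.

A direct corollary is that ri 𝐶 is convex when 𝐶 is convex. For any two points 𝑥, 𝑦 ∈ ri 𝐶, we have both (𝑥, 𝑦] ⊆ ri 𝐶 and [𝑥, 𝑦) ⊆ ri 𝐶, hence [𝑥, 𝑦] ⊆ ri 𝐶, and thus ri 𝐶 is convex.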
• Theorem 17-3 Let 𝐶 be a non-empty convex set, then ri 𝐶 ≠ ∅. If 𝐶 is a singleton, then 𝐴(𝐶) = 𝐶 and it is easy to see the single point in 𝐶 is the sole relative interior point.
If 𝐶 contains at least two points, then dim 𝐴(𝐶) = 𝑛 ≥ 1. The strategy is to construct a non-empty relative open set 𝐺 in 𝐶 s.t. every point in 𝐺 is a relative interior point of 𝐶. Let $p_0, \dots, p_n$ be a maximum affine independent set of 𝐶, then let

$$G = \left\{\sum_{i=0}^{n} \lambda_i p_i : \sum_{i=0}^{n} \lambda_i = 1, \ \lambda_i \in (0,1)\right\} = \left\{p_0 + \sum_{i=1}^{n} \lambda_i\,\overrightarrow{p_0 p_i} : \sum_{i=1}^{n} \lambda_i < 1, \ \lambda_i \in (0,1)\right\}$$

where $p_0$ is chosen as the origin and $\overrightarrow{p_0 p_i}$, 𝑖 = 1, … , 𝑛, form a basis of the induced vector space of 𝐴(𝐶). Note 𝐺 ⊆ 𝐶 by Lemma 17-1, and

$$\Lambda = \left\{(\lambda_1, \dots, \lambda_n) : \sum_{i=1}^{n} \lambda_i < 1, \ \lambda_i \in (0,1)\right\}$$

can be easily verified as an open set in $F^n$ w.r.t. any norm. Now for any 𝑥 ∈ 𝐺, where 𝑥 has a vector coordinate $\boldsymbol{\lambda}_x$ w.r.t. origin $p_0$, then for another point 𝑦 ∈ 𝐴(𝐶) sufficiently close to 𝑥, its coordinate $\boldsymbol{\lambda}_y$ would be sufficiently close to $\boldsymbol{\lambda}_x$ by Lemma 16-7 and Theorem 3-10, so that $\boldsymbol{\lambda}_y \in \Lambda$, which in turn gives 𝑦 ∈ 𝐺, then 𝐺 is relative open w.r.t. 𝐴(𝐶), which proves our original claim.
For the above proof, recall that a set 𝐺 being open means that for any point 𝑥 ∈ 𝐺, if another point 𝑦 is sufficiently close to 𝑥 w.r.t. the defined metric, then we must have 𝑦 ∈ 𝐺; this is formalized as the equivalent open ball argument. For a set 𝐺 open relative to another set 𝐻, it means for any point 𝑥 ∈ 𝐺, if another point 𝑦 ∈ 𝐻 is sufficiently close to 𝑥 w.r.t. the defined metric, then we must have 𝑦 ∈ 𝐺.
Property 17-13 As a corollary, we have 𝐴(ri 𝐶) = 𝐴(𝐶). Let $x = 0.6 p_0 + \sum_{i=1}^{n} \frac{0.4}{n} p_i \in G$ and $y = 0.2 p_0 + \sum_{i=1}^{n} \frac{0.8}{n} p_i \in G$, then $p_0 = 2x - y \in A(G)$. Similarly, $p_i \in A(G)$ for any 𝑖 = 0, … , 𝑛, and this means 𝐴(𝐺) contains the maximum affine independent set $p_0, \dots, p_n$ of 𝐶, so $A(A(G)) = A(G) = A(C)$ by Property 16-16. Since 𝐺 ⊆ ri 𝐶 ⊆ 𝐶 ⇒ 𝐴(𝐺) ⊆ 𝐴(ri 𝐶) ⊆ 𝐴(𝐶), then 𝐴(𝐺) = 𝐴(ri 𝐶) = 𝐴(𝐶).
Commented [TC16]: 𝐺 and 𝐶 share a same maximum affine independent set, then 𝐴(𝐺) = 𝐴(𝐶) are both span of that maximum affine independent set.
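• Theorem 17-4 Let 𝐶 be a non-empty convex set. A point 𝑥 ∈ 𝐶 is a relative interior point of 𝐶 iff ∀𝑦 ∈ 𝐶, there exists 𝜃 > 0 s.t. $x - \theta\,\overrightarrow{xy} \in C$. In plain words, for any 𝑦 ∈ 𝐶, there exists a sufficiently small movement of 𝑥 in the direction of $\overrightarrow{yx}$ (the opposite direction of $\overrightarrow{xy}$) that keeps 𝑥 in 𝐶. Note it is not $x + \theta\,\overrightarrow{xy}$, e.g. on interval [0,1], 𝑥 = 0 satisfies $\forall y \in [0,1], \exists\theta > 0, \ x + \theta\,\overrightarrow{xy} \in C$, but 𝑥 = 0 is a boundary point.

Necessity. If 𝑥 ∈ ri 𝐶, then $B_\epsilon(x) \cap A(C) \subseteq C$ for some 𝜖 > 0, and for any 𝑦 ∈ 𝐶 with 𝑦 ≠ 𝑥, any sufficiently small $\theta < \frac{\epsilon}{\|\overrightarrow{xy}\|}$ puts $x - \theta\,\overrightarrow{xy}$ in $B_\epsilon(x) \cap A(C) \subseteq C$.
Sufficiency. Let 𝑥 be a point in 𝐶 satisfying the condition, and let 𝑦 ∈ ri 𝐶, which exists by Theorem 17-3. If 𝑥 = 𝑦 then we are done. If 𝑦 ≠ 𝑥, then ∃𝜃 > 0 s.t. $z = x - \theta\,\overrightarrow{xy} \in C$, i.e. $z = x + \theta(x - y) \Rightarrow x = \frac{1}{1+\theta} z + \frac{\theta}{1+\theta} y$, where again we note 𝑧 ∈ 𝐶 and 𝑦 ∈ ri 𝐶, then by the line segment principle Theorem 17-2, we have 𝑥 ∈ ri 𝐶.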
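The criterion of Theorem 17-4, and the warning about the sign, can be tried on the interval example from the text. This is a minimal sketch, assuming 𝐶 = [0,1] ⊆ ℝ, a finite grid of test points 𝑦 and a finite list of candidate step sizes 𝜃; the names `in_C`, `passes_criterion` and `passes_wrong_criterion` are illustrative only, and a finite search of this kind is an illustration rather than a proof.

```python
def in_C(t):
    """Membership in the convex set C = [0, 1]."""
    return 0.0 <= t <= 1.0

def passes_criterion(x, ys, thetas):
    """For every y, look for a theta > 0 with x - theta*(y - x) in C."""
    return all(any(in_C(x - th * (y - x)) for th in thetas) for y in ys)

def passes_wrong_criterion(x, ys, thetas):
    """Same search with the wrong sign: x + theta*(y - x) in C."""
    return all(any(in_C(x + th * (y - x)) for th in thetas) for y in ys)

ys = [k / 10 for k in range(11)]          # sample points of C
thetas = [2.0 ** (-j) for j in range(1, 20)]  # candidate step sizes

interior_ok = passes_criterion(0.5, ys, thetas)
boundary_ok = passes_criterion(0.0, ys, thetas)
boundary_wrong_sign = passes_wrong_criterion(0.0, ys, thetas)
print(interior_ok, boundary_ok, boundary_wrong_sign)
```

The point 0.5 passes, the boundary point 0 fails because moving away from any 𝑦 > 0 leaves the interval, and the wrong-sign test is passed even by the boundary point, which is why it cannot characterize the relative interior.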
Property 17-14 As a corollary, an affine subspace is always relative open to itself, i.e. $\mathrm{ri}\, A' = A'$ for any affine subspace $A'$ of 𝐴. Recall an affine subspace is a convex set. For any $x, y \in A'$, $x - \theta\,\overrightarrow{xy} = x - \theta(y - x) = (1 + \theta) x - \theta y \in A'$ since it is an affine combination, then $\mathrm{ri}\, A' = A'$ since 𝑥 is an arbitrary point in $A'$.
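In general, some seemingly intuitive properties like “$\mathrm{int}\,\bar{S} = \mathrm{int}\, S$”, “$\bar{S} = \overline{\mathrm{int}\, S}$”, “$\mathrm{int}\, S_1 = \mathrm{int}\, S_2 \iff \bar{S}_1 = \bar{S}_2$” do not hold. For example, on [0,1] ⊆ ℝ, the set of rational numbers $\mathbb{Q}_{[0,1]}$ has empty interior, but $\overline{\mathbb{Q}_{[0,1]}} = [0,1]$, then
$$\mathrm{int}\,\overline{\mathbb{Q}_{[0,1]}} = (0,1) \ne \emptyset = \mathrm{int}\,\mathbb{Q}_{[0,1]}$$

$$\overline{\mathbb{Q}_{[0,1]}} = [0,1] \ne \emptyset = \overline{\mathrm{int}\,\mathbb{Q}_{[0,1]}}$$

$$\mathrm{int}\left(\mathbb{Q}_{[0,\frac{1}{2}]} \cup [\tfrac{1}{2}, 1]\right) = (\tfrac{1}{2}, 1) = \mathrm{int}\,[\tfrac{1}{2}, 1] \quad \text{but} \quad \overline{\mathbb{Q}_{[0,\frac{1}{2}]} \cup [\tfrac{1}{2}, 1]} = [0,1] \ne [\tfrac{1}{2}, 1]$$

We prove that such properties will hold for convex sets. In the following discussion, letters $C, C_1, C_2$, etc. denote convex sets in 𝔸 = (𝐴, 𝑉), unless otherwise stated.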
• Property 17-15 If $C_1 \subseteq C_2$ and $\dim C_1 = \dim C_2$, then $\mathrm{ri}\, C_1 \subseteq \mathrm{ri}\, C_2$. Note $A(C_1) = A(C_2)$ since they share some maximum affine independent set of $C_1$, and so $\forall x \in \mathrm{ri}\, C_1$, $\exists\epsilon > 0$ s.t. $B_\epsilon(x) \cap A(C_2) = B_\epsilon(x) \cap A(C_1) \subseteq C_1 \subseteq C_2$. However, if $\dim C_1 < \dim C_2$, then the claim is false. Consider the line segment [(0,0), (0,1)] in $\mathbb{R}^2$ whose relative interior is ((0,0), (0,1)), but all points in this open segment are boundary points of $C([(0,0), (0,1)] \cup [(0,1), (1,1)])$.
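• Property 17-16 $\mathrm{ri}(C_1 \times C_2) = \mathrm{ri}\, C_1 \times \mathrm{ri}\, C_2$. Let $x \in \mathrm{ri}\, C_1$, $y \in \mathrm{ri}\, C_2$, then for any $x' \in C_1$, $y' \in C_2$, by Theorem 17-4 ∃𝜆 > 0 s.t. $z_1 = x - \lambda\,\overrightarrow{xx'} \in C_1$ and $z_2 = y - \lambda\,\overrightarrow{yy'} \in C_2$ (take 𝜆 as the smaller of the two step sizes given by the theorem; the smaller step stays in the set by convexity), then

$$(z_1, z_2) = \left(x - \lambda\,\overrightarrow{xx'}, \ y - \lambda\,\overrightarrow{yy'}\right) = (x, y) - \lambda\left(\overrightarrow{xx'}, \overrightarrow{yy'}\right) = (x, y) - \lambda\,\overrightarrow{(x, y)(x', y')}$$

where we note $(z_1, z_2), (x', y') \in C_1 \times C_2$ and so $(x, y) \in \mathrm{ri}(C_1 \times C_2)$, implying $\mathrm{ri}\, C_1 \times \mathrm{ri}\, C_2 \subseteq \mathrm{ri}(C_1 \times C_2)$. The above proof is easily reversible, which completes the proof.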
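• Property 17-17 $\mathrm{ri}\,\bar{C} = \mathrm{ri}\, C$. For any $x \in \mathrm{ri}\,\bar{C}$, let 𝑦 ∈ ri 𝐶 whose existence is guaranteed by Theorem 17-3, then by Theorem 17-4 we have ∃𝜃 > 0 s.t. $z = x - \theta\,\overrightarrow{xy} \in \bar{C}$, then $x = \frac{1}{1+\theta} z + \frac{\theta}{1+\theta} y$ where again $z \in \bar{C}$, 𝑦 ∈ ri 𝐶, then 𝑥 ∈ ri 𝐶 by the line segment principle. Conversely, for any 𝑥 ∈ ri 𝐶, $B_\epsilon(x) \cap A(C) = B_\epsilon(x) \cap A(\bar{C}) \subseteq C \subseteq \bar{C} \Rightarrow x \in \mathrm{ri}\,\bar{C}$ where $A(C) = A(\bar{C})$ is by Property 16-23.
• Property 17-18 $\overline{\mathrm{ri}\, C} = \bar{C}$. First, $\mathrm{ri}\, C \subseteq C \Rightarrow \overline{\mathrm{ri}\, C} \subseteq \bar{C}$. Conversely, for any $x \in \bar{C}$, let 𝑦 ∈ ri 𝐶. If 𝑥 = 𝑦, then $x \in \mathrm{ri}\, C \Rightarrow x \in \overline{\mathrm{ri}\, C}$ and we are done. If 𝑥 ≠ 𝑦, by the line segment principle Theorem 17-2, $z_\theta = (1-\theta) x + \theta y \in \mathrm{ri}\, C$ for any 𝜃 ∈ (0,1]. Taking $\theta_k = \frac{1}{k}$, 𝑥 is the limit point of $z_{\theta_k}$ as 𝑘 → +∞, then $x \in \overline{\mathrm{ri}\, C}$.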

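• Property 17-19 $\mathrm{ri}\, C_1 = \mathrm{ri}\, C_2 \iff \bar{C}_1 = \bar{C}_2 \iff \mathrm{ri}\, C_1 \subseteq C_2 \subseteq \bar{C}_1$. In plain words, this means if a convex set $C_2$ is in-between the relative interior $\mathrm{ri}\, C_1$ of another convex set and its closure $\bar{C}_1$, then $C_1, C_2$ have the same relative interior and closure. First,
$$\mathrm{ri}\, C_1 = \mathrm{ri}\, C_2 \Rightarrow \overline{\mathrm{ri}\, C_1} = \bar{C}_1 = \bar{C}_2 = \overline{\mathrm{ri}\, C_2}$$
Secondly,
$$\bar{C}_1 = \bar{C}_2 \Rightarrow \begin{cases} \mathrm{ri}\, C_1 = \mathrm{ri}\,\bar{C}_1 = \mathrm{ri}\,\bar{C}_2 = \mathrm{ri}\, C_2 \Rightarrow \mathrm{ri}\, C_1 \subseteq C_2 \\ C_2 \subseteq \bar{C}_2 = \bar{C}_1 \end{cases} \Rightarrow \mathrm{ri}\, C_1 \subseteq C_2 \subseteq \bar{C}_1$$
Thirdly, note $\mathrm{ri}\, C_1 = \mathrm{ri}\,\bar{C}_1 \subseteq C_2 \subseteq \bar{C}_1$ means $C_2$ is just $\mathrm{ri}\, C_1$ plus some limit points of the relative interior of $C_1$, then by Theorem 16-12 $\dim C_2 = \dim \bar{C}_1 = \dim \mathrm{ri}\, C_1$, then by Property 17-15
$$\begin{cases} C_2 \subseteq \bar{C}_1 \Rightarrow \mathrm{ri}\, C_2 \subseteq \mathrm{ri}\,\bar{C}_1 = \mathrm{ri}\, C_1 \\ \mathrm{ri}\, C_1 = \mathrm{ri}\,\mathrm{ri}\, C_1 \subseteq \mathrm{ri}\, C_2 \end{cases} \Rightarrow \mathrm{ri}\, C_1 = \mathrm{ri}\, C_2$$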
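• Property 17-20 Let 𝒜 be an affine map, then $\mathcal{A}(\mathrm{ri}\, C) = \mathrm{ri}\,\mathcal{A}(C)$. Note by the previous property and Property 17-9, we have
$$\mathrm{ri}\,\mathcal{A}(\mathrm{ri}\, C) \subseteq \mathcal{A}(\mathrm{ri}\, C) \subseteq \mathcal{A}(C) \subseteq \mathcal{A}(\bar{C}) = \mathcal{A}(\overline{\mathrm{ri}\, C}) \subseteq \overline{\mathcal{A}(\mathrm{ri}\, C)}$$
$$\Rightarrow \overline{\mathcal{A}(\mathrm{ri}\, C)} = \overline{\mathcal{A}(C)} \Rightarrow \mathrm{ri}\,\mathcal{A}(C) = \mathrm{ri}\,\mathcal{A}(\mathrm{ri}\, C) \subseteq \mathcal{A}(\mathrm{ri}\, C)$$
Conversely, for any $\mathcal{A}(x) \in \mathcal{A}(\mathrm{ri}\, C)$ where 𝑥 is a preimage in ri 𝐶, let $\mathcal{A}(y) \in \mathcal{A}(C)$ where 𝑦 is a preimage in 𝐶, then by Theorem 17-4 ∃𝜃 > 0 s.t.
$$z = x - \theta\,\overrightarrow{xy} \in C$$
$$\Rightarrow \mathcal{A}(z) = \mathcal{A}(x - \theta y + \theta x) = \mathcal{A}(x) - \theta\mathcal{A}(y) + \theta\mathcal{A}(x) = \mathcal{A}(x) - \theta\,\overrightarrow{\mathcal{A}(x)\mathcal{A}(y)}$$
Again, note $\mathcal{A}(z) \in \mathcal{A}(C)$, $\mathcal{A}(y) \in \mathcal{A}(C)$, then $\mathcal{A}(x) \in \mathrm{ri}\,\mathcal{A}(C)$ by Theorem 17-4, and thus $\mathcal{A}(\mathrm{ri}\, C) \subseteq \mathrm{ri}\,\mathcal{A}(C)$.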
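• Property 17-21 $\mathrm{ri}(C_1 + C_2) = \mathrm{ri}\, C_1 + \mathrm{ri}\, C_2$ for any $C_1, C_2 \subseteq \mathbb{A}^n$. Define the linear map $\mathcal{L}: F^n \times F^n \to F^n$ (see also Property 17-11) as $\mathcal{L}(x, y) = x + y$ for any $(x, y) \in F^n \times F^n$, then $\mathcal{L}(C_1 \times C_2) = C_1 + C_2$, and so by Property 17-20 and Property 17-16,
$$\mathrm{ri}(C_1 + C_2) = \mathrm{ri}\,\mathcal{L}(C_1 \times C_2) = \mathcal{L}(\mathrm{ri}(C_1 \times C_2)) = \mathcal{L}(\mathrm{ri}\, C_1 \times \mathrm{ri}\, C_2) = \mathrm{ri}\, C_1 + \mathrm{ri}\, C_2$$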
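• Property 17-22 $\mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2 \subseteq \mathrm{ri}(C_1 \cap C_2)$. For any $x \in \mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2$ and any $y \in C_1 \cap C_2$, $\exists\theta_1, \theta_2 > 0$ s.t. $x - \theta_1\,\overrightarrow{xy} \in C_1$ and $x - \theta_2\,\overrightarrow{xy} \in C_2$; this means we can let $\theta = \min\{\theta_1, \theta_2\}$ and, by convexity, $x - \theta\,\overrightarrow{xy} \in C_1 \cap C_2$, then $x \in \mathrm{ri}(C_1 \cap C_2)$ by Theorem 17-4.
$\mathrm{ri}(C_1 \cap C_2) = \mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2$ if $\mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2 \ne \emptyset$. For any $y \in \bar{C}_1 \cap \bar{C}_2$, choose some $x \in \mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2$, then it is easy to see $[x, y) \subseteq \mathrm{ri}\, C_1$, $[x, y) \subseteq \mathrm{ri}\, C_2 \Rightarrow [x, y) \subseteq \mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2$, and 𝑦 is a limit point of other points in [𝑥, 𝑦), so $y \in \overline{\mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2}$. Therefore $\bar{C}_1 \cap \bar{C}_2 \subseteq \overline{\mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2} \subseteq \overline{C_1 \cap C_2}$. Recall that $\overline{C_1 \cap C_2} \subseteq \bar{C}_1 \cap \bar{C}_2$ by Property 17-12, then
$$\bar{C}_1 \cap \bar{C}_2 = \overline{\mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2} = \overline{C_1 \cap C_2}$$
$$\Rightarrow \mathrm{ri}(C_1 \cap C_2) = \mathrm{ri}\,\overline{C_1 \cap C_2} = \mathrm{ri}\,\overline{\mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2} = \mathrm{ri}(\mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2) \subseteq \mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2$$
If $\mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2 = \emptyset$, then the equality might not hold. Consider two closed squares $C_1, C_2$ in $\mathbb{R}^2$ adjacent on one edge, say $C_1 = \{0 \le x \le 1, 0 \le y \le 1\}$, $C_2 = \{-1 \le x \le 0, 0 \le y \le 1\}$, then $\mathrm{ri}\, C_1 \cap \mathrm{ri}\, C_2 = \emptyset$, but $C_1 \cap C_2 = \{x = 0, 0 \le y \le 1\}$ and $\mathrm{ri}(C_1 \cap C_2) = \{x = 0, 0 < y < 1\}$.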
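• Property 17-23 Let 𝒜: 𝔸 → 𝔹 be an affine map between two affine spaces, and let $\mathcal{A}^{-1}(S)$ denote the preimage of 𝑆 ⊆ 𝔹 under the affine map 𝒜 (some elements in 𝑆 might not have a preimage). Let 𝐶 be a convex set in 𝔹, if $\mathcal{A}^{-1}(\mathrm{ri}\, C) \ne \emptyset$, then $\mathcal{A}^{-1}(\mathrm{ri}\, C) = \mathrm{ri}\,\mathcal{A}^{-1}(C)$.
Commented [TC17]: Here 𝐶 is a convex set in 𝔹.

Define 𝐷 = 𝐴 × 𝐶 and $S = \{(x, \mathcal{A}(x)) : x \in A\}$, and let 𝒯: 𝐴 × 𝐵 → 𝐴 be an affine map (see EX 32) defined by 𝒯(𝑥, 𝑦) = 𝑥. Verify that $\{(x, \mathcal{A}(x)) : \mathcal{A}(x) \in C\} = D \cap S$, then
$$\mathcal{A}^{-1}(C) = \{x : \mathcal{A}(x) \in C\} = \mathcal{T}\left(\{(x, \mathcal{A}(x)) : \mathcal{A}(x) \in C\}\right) = \mathcal{T}(D \cap S)$$
$$\Rightarrow \mathrm{ri}\,\mathcal{A}^{-1}(C) = \mathrm{ri}\,\mathcal{T}(D \cap S)$$
Similarly, $\{(x, \mathcal{A}(x)) : \mathcal{A}(x) \in \mathrm{ri}\, C\} = (\mathrm{ri}\, D) \cap S$ because $\mathrm{ri}\, D = \mathrm{ri}(A \times C) = \mathrm{ri}\, A \times \mathrm{ri}\, C = A \times \mathrm{ri}\, C$ by Property 17-16 and Property 17-14, then $\mathcal{A}^{-1}(\mathrm{ri}\, C) = \mathcal{T}(\mathrm{ri}\, D \cap S)$.
Commented [TC18]: The relative interior of an affine space 𝐴 is itself, i.e. ri 𝐴 = 𝐴.
Note 𝑆 is an affine subspace of 𝐴 × 𝐵, so ri 𝑆 = 𝑆 by Property 17-14, and by Property 17-20 and Property 17-22 we further have
$$\mathrm{ri}\,\mathcal{T}(D \cap S) = \mathcal{T}(\mathrm{ri}(D \cap S)) = \mathcal{T}(\mathrm{ri}\, D \cap \mathrm{ri}\, S) = \mathcal{T}(\mathrm{ri}\, D \cap S) \Rightarrow \mathrm{ri}\,\mathcal{A}^{-1}(C) = \mathcal{A}^{-1}(\mathrm{ri}\, C)$$
Commented [TC19]: $\mathcal{A}^{-1}(\mathrm{ri}\, C) \ne \emptyset$ ensures $\mathrm{ri}\, D \cap \mathrm{ri}\, S \ne \emptyset$, so then Property 79 holds.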
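• Property 17-24 Given two affine spaces 𝔸 = (𝐴, 𝑉), 𝔹 = (𝐵, 𝑊), let 𝐶 be a convex set in 𝔸 × 𝔹, let $C_B(x) = \{y : (x, y) \in C\}$, and $C_A = \{x : C_B(x) \ne \emptyset\}$, then $C = \{(x, y) : x \in C_A, y \in C_B(x)\}$ and we claim $\mathrm{ri}(C) = \{(x, y) : x \in \mathrm{ri}\, C_A, \ y \in \mathrm{ri}\, C_B(x)\}$. This is an extension of Property 17-16 $\mathrm{ri}(C_1 \times C_2) = \mathrm{ri}\, C_1 \times \mathrm{ri}\, C_2$ to “non-cube-like” convex sets, since if $C = C_1 \times C_2$, we have $\mathrm{ri}\, C_A = \mathrm{ri}\, C_1$ and $\mathrm{ri}\, C_B(x) = \mathrm{ri}\, C_2$, $\forall x \in C_1$.

[Figure: a convex set 𝐶 in 𝔸 × 𝔹, its projection $C_A$ on 𝔸, and the slice $C_B(x)$ above a point 𝑥]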

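Note $C_A = \{x : \exists y, (x, y) \in C\}$. Let 𝒜 be the projection of points in 𝔸 × 𝔹 onto 𝔸, then it is easy to verify $\mathcal{A}(C) = C_A$. By Property 17-20 we can exchange the projection and the relative interior operator, and find
$$\mathrm{ri}\, C_A = \mathrm{ri}\,\mathcal{A}(C) = \mathcal{A}(\mathrm{ri}\, C) = \{x : \exists y, (x, y) \in \mathrm{ri}\, C\}$$
Let $M_x = \{x\} \times B = \{(x, y) : y \in B\}$ for any $x \in \mathrm{ri}\, C_A$, then the projection of $M_x \cap C$ on 𝔹 is clearly $C_B(x)$ since
$$M_x \cap C = (\{x\} \times B) \cap C = \{x\} \times C_B(x) = \{(x, y) : y \in C_B(x)\}$$
Note $M_x$ is an affine subspace so $\mathrm{ri}\, M_x = M_x$ by Property 17-14, and $M_x \cap \mathrm{ri}\, C \ne \emptyset$ because $x \in \mathrm{ri}\, C_A = \mathcal{A}(\mathrm{ri}\, C)$, then by Property 17-22 and Property 17-16
$$\mathrm{ri}(M_x \cap C) = \mathrm{ri}\, M_x \cap \mathrm{ri}\, C = M_x \cap \mathrm{ri}\, C = \{(x, y) : y \in \mathrm{ri}\, C_B(x)\}$$
$$\Rightarrow y \in \mathrm{ri}\, C_B(x) \iff (x, y) \in \mathrm{ri}(M_x \cap C), \quad \forall x \in \mathrm{ri}\, C_A$$
Further verify that $\mathrm{ri}\, C = \bigcup_{x \in \mathrm{ri}\, C_A} (M_x \cap \mathrm{ri}\, C)$. $(x, y) \in \mathrm{ri}\, C$ means $x \in \mathrm{ri}\, C_A$ and $(x, y) \in M_x$, thus $(x, y) \in M_x \cap \mathrm{ri}\, C$; conversely $(x, y) \in \bigcup_{x \in \mathrm{ri}\, C_A} (M_x \cap \mathrm{ri}\, C)$ directly implies $(x, y) \in \mathrm{ri}\, C$. Finally,

$$\mathrm{ri}\, C = \bigcup_{x \in \mathrm{ri}\, C_A} (M_x \cap \mathrm{ri}\, C) = \bigcup_{x \in \mathrm{ri}\, C_A} \{(x, y) : y \in \mathrm{ri}\, C_B(x)\} = \{(x, y) : x \in \mathrm{ri}\, C_A, \ y \in \mathrm{ri}\, C_B(x)\}$$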

Cone
A point in the conical hull of a convex set can be written in a much simpler form.
Any direction in the hyperplane is in the linearity space of the two halfspaces.
Vector sum of two polyhedrons is a polyhedron.
Sum of two cones is a cone.
Intersection of two cones is a cone.
If a hyperplane contains the entire relative boundary of a convex set 𝐶, then either it contains the entire 𝐶, or 𝐶 is contained in one of its closed halfspaces. Further, if one of its closed halfspaces contains some point in ri 𝐶, then 𝐶 is entirely contained in that halfspace.
