Introduction
The discrete case
Measures
The Euclidean case
Gradient flows, optimal transport,
and evolution PDE’s
2 - A quick introduction to Optimal Transport
Giuseppe Savaré
http://www.imati.cnr.it/~savare
Dipartimento di Matematica, Università di Pavia
GNFM Summer School
Ravello, September 13–18, 2010
Outline
1 A short historical tour
2 The “discrete” case, duality and linear programming
3 The measure-theoretic setting
4 Euclidean spaces: geometry and transport maps
Gaspard Monge (1746-1818)
1781: “La théorie des déblais et des remblais”
Problem: how to transport soil from the ground to a given configuration in the “most efficient” way.
(Fig. 3.1. Monge’s problem of déblais and remblais: a map T carries a point x of the déblais to a point y of the remblais.)
The transport cost is proportional to the distance |T(x) − x|.
Excerpt reproduced on the slide (from a chapter titled “The founding fathers of optimal transport”): “[...] minimize the total cost. Monge assumed that the transport cost of one unit of mass along a certain distance was given by the product of the mass by the distance. [...] Nowadays there is a Monge street in Paris, and therein one can find [...]”
Leonid Kantorovich (1912-1986)
1939: Mathematical Methods of Organizing and Planning of Production (unpublished until 1960).
1942: On the translocation of masses
1948: On a problem of Monge
1975: Nobel prize, jointly with Tjalling Koopmans, “for their contributions to the theory of optimum allocation of resources”
Autobiography: http://nobelprize.org/nobel_prizes/economics/laureates/1975/kantorovich-autobio.html
Parallel contributions:
1941: Frank Hitchcock, The distribution of a product from several sources to numerous localities (Jour. Math. Phys.)
1947: Tjalling Koopmans, Optimum utilization of the transportation system.
1947: George Dantzig, simplex method.
Towards the recent theory...
• Statistical and probabilistic aspects (early 1900s: Gini, Dall’Aglio, Hoeffding, Fréchet, ...). Rachev-Rüschendorf, Mass Transportation Problems (1998).
• Particle systems, Boltzmann equation: Dobrushin, Tanaka (∼’70).
• Yann Brenier (’89): fluid mechanics, transport map, polar decomposition. Dynamical interpretation of optimal transport.
• John Mather: Lagrangian dynamical systems.
• Mike Cullen: meteorological models, semigeostrophic equations.
• Regularity, geometric and functional inequalities, Riemannian geometry, urban planning, evolution equations, etc.: L. Caffarelli, C. Evans, W. Gangbo, R. McCann, F. Otto, L. Ambrosio, G. Buttazzo, C. Villani, J. Lott, N. Trudinger, G. Loeper, T. Sturm, J. Carrillo, G. Toscani, A. Pratelli, ...
• C. Villani: Optimal transport: Old and New, Springer (2009), 978 p.
Discrete formulation
• Initial configuration of resources in $X = \{x_1, \dots, x_h\}$; at every point $x_i \in X$ the quantity $m_i = m(x_i)$ is available.
• Final configuration $Y = \{y_1, \dots, y_n\}$: at every point $y_j$ the quantity $n_j = n(y_j)$ is expected.
• The unit cost $c_{ij} = c(x_i, y_j)$ for transporting a single unit from the position $x_i$ to the destination $y_j$.
(Figure: sources x1, x2, x3, x4 on the left and destinations y1, y2, y3 on the right. Successive overlays label the edge costs c11, c12, c13 and c21, c22, c23, and then an example plan with transport rays T11, T21, T33, T42, T43.)
Admissible transference plan: choose the quantities $T_{i,j} = T(x_i, y_j)$ moved from $x_i$ to $y_j$, so that
$$T(x_i, y_j) \ge 0, \qquad \sum_{y \in Y} T(x_i, y) = m(x_i), \qquad \sum_{x \in X} T(x, y_j) = n(y_j).$$
The cost of the transference plan T is
$$C(T) := \sum_{x \in X,\, y \in Y} c(x, y)\, T(x, y).$$
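To make the admissibility constraints and the cost C(T) concrete, here is a minimal Python sketch. The masses `m`, `n`, the cost matrix `c` and the plan `T` are made-up illustrative data (they are not taken from the slides), and the function names are hypothetical.

```python
# Illustrative data (an assumption): 4 sources, 3 destinations, integer masses.
m = [2, 1, 3, 4]            # m[i] = quantity available at x_i
n = [3, 4, 3]               # n[j] = quantity expected at y_j
c = [[1, 4, 6],             # c[i][j] = unit cost from x_i to y_j
     [2, 3, 5],
     [5, 2, 1],
     [6, 3, 2]]
T = [[2, 0, 0],             # T[i][j] = quantity moved from x_i to y_j
     [1, 0, 0],
     [0, 0, 3],
     [0, 4, 0]]

def is_admissible(T, m, n):
    """T >= 0, every row of T sums to m[i], every column sums to n[j]."""
    nonneg = all(t >= 0 for row in T for t in row)
    rows_ok = all(sum(T[i]) == m[i] for i in range(len(m)))
    cols_ok = all(sum(T[i][j] for i in range(len(m))) == n[j]
                  for j in range(len(n)))
    return nonneg and rows_ok and cols_ok

def cost(T, c):
    """C(T) = sum over all pairs of c(x, y) * T(x, y)."""
    return sum(c[i][j] * T[i][j]
               for i in range(len(T)) for j in range(len(T[0])))

print(is_admissible(T, m, n), cost(T, c))  # True 19
```

Nothing here says that this T is optimal; the sketch only checks the constraints and evaluates the cost.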
Optimal transport
Problem
Find the best transference plan T, i.e. one which minimizes the cost C(T) among all the admissible plans.
The linear programming structure: given positive coefficients $m_i$, $n_j$ and $c_{i,j}$, find the quantities $T_{i,j}$ minimizing the linear functional
$$C(T) = \sum_{i,j} c_{i,j}\, T_{i,j}$$
under the linear/convex constraints
$$T_{i,j} \ge 0, \qquad \sum_j T_{i,j} = m_i, \qquad \sum_i T_{i,j} = n_j.$$
In vector notation:
$$\min\ \vec C \cdot \vec T: \qquad A_0 \vec T \ge 0, \qquad A_1 \vec T = \vec b.$$
In the discrete case existence of the optimal plan is easy; more important are 3 fundamental properties:
• Cyclical monotonicity of the optimal transference plan.
• Dual characterization, Kantorovich potentials (prices in economic terms), linear programming.
• Integrality of the transference plan, transport maps.
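The vector notation can be unpacked explicitly. The sketch below flattens a plan row by row and builds the cost vector, the equality matrix and the right-hand side; it assumes that A0 is simply the identity (so that A0 T ≥ 0 is the sign constraint), and all names and data are hypothetical illustrations rather than anything fixed by the slides.

```python
def lp_data(m, n, c):
    """Return (C, A1, b) for the flattened unknown T[i*k + j] = T_ij."""
    h, k = len(m), len(n)
    C = [c[i][j] for i in range(h) for j in range(k)]
    A1 = []
    for i in range(h):   # one row per source: sum_j T_ij = m_i
        A1.append([1 if p // k == i else 0 for p in range(h * k)])
    for j in range(k):   # one row per destination: sum_i T_ij = n_j
        A1.append([1 if p % k == j else 0 for p in range(h * k)])
    b = list(m) + list(n)
    return C, A1, b

def flatten(T):
    return [t for row in T for t in row]

def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))

# Illustrative data (an assumption, not from the slides).
m = [2, 1, 3, 4]
n = [3, 4, 3]
c = [[1, 4, 6], [2, 3, 5], [5, 2, 1], [6, 3, 2]]
T = [[2, 0, 0], [1, 0, 0], [0, 0, 3], [0, 4, 0]]

C, A1, b = lp_data(m, n, c)
x = flatten(T)
print([dot(row, x) for row in A1] == b)  # True: T satisfies A1 T = b
print(dot(C, x))                         # 19, the cost C(T)
```

With h sources and n destinations the unknown has h·n components and A1 has h + n rows, which is the shape any standard linear programming solver would expect.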
Cyclical monotonicity
Consider an arbitrary collection of couples (x, y) joined by a transport ray, i.e. T(x, y) > 0: in the picture $(x_2, y_1)$, $(x_3, y_2)$, $(x_4, y_3)$.
(Figure: the transport rays $x_2 \to y_1$, $x_3 \to y_2$, $x_4 \to y_3$ and the permutation σ acting on the targets.)
The associated (unit) cost is compared with the cost obtained by applying a (cyclical) permutation σ of the targets, $y_1 \to y_2 \to y_3 \to y_1$:
$$c(x_2, y_1) + c(x_3, y_2) + c(x_4, y_3) \le c(x_2, y_2) + c(x_3, y_3) + c(x_4, y_1).$$
Theorem (Rachev-Rüschendorf)
If T is optimal, the cost of any configuration rearranged by a cyclical permutation cannot decrease.
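For very small problems the condition can be tested by brute force. The following sketch enumerates every ordered selection of distinct transport rays and compares its cost with the cost after cyclically shifting the targets. The function names, the quadratic cost and the two test plans are assumptions made for illustration, and the enumeration grows factorially, so this is only meant for toy examples.

```python
from itertools import permutations

def support(T):
    """Pairs (i, j) joined by a transport ray, i.e. T[i][j] > 0."""
    return [(i, j) for i, row in enumerate(T)
            for j, t in enumerate(row) if t > 0]

def is_cyclically_monotone(T, c):
    """Check that no cyclic shift of the targets lowers the cost of any
    chain of distinct support pairs (x_1, y_1), ..., (x_k, y_k)."""
    pairs = support(T)
    for k in range(2, len(pairs) + 1):
        for chain in permutations(pairs, k):
            current = sum(c[i][j] for i, j in chain)
            # x_p is now sent to the target of the next pair in the chain
            rearranged = sum(c[chain[p][0]][chain[(p + 1) % k][1]]
                             for p in range(k))
            if current > rearranged:
                return False
    return True

# Illustrative data (an assumption): three points on a line, quadratic cost.
x = [0, 1, 2]
y = [0, 1, 2]
c = [[(a - b) ** 2 for b in y] for a in x]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # x_i -> y_i
rotated = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]    # x_0 -> y_1, x_1 -> y_2, x_2 -> y_0

print(is_cyclically_monotone(identity, c))  # True
print(is_cyclically_monotone(rotated, c))   # False
```

Restricting to chains of distinct pairs loses nothing here, because a cycle that visits the same pair twice splits into two shorter cycles.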
Cyclical monotonicity is also sufficient
Theorem
If T is a cyclically monotone admissible plan then it is optimal.
The dual problem: optimal prices
Linear programming: the dual problem gives crucial insight into the structure of the optimal transference plan.
Economic interpretation: a transport company offers to take care of the transportation job: they will pay the price u(x) to buy a unit placed at the point x and they will sell it at y for the price v(y).
To be competitive, the prices should be more convenient than the transportation cost c(x, y):
$$v(y) - u(x) \le c(x, y) \qquad \text{for every } x \in X,\ y \in Y. \tag{*}$$
The total profit for the company is
$$P(u, v) := \sum_{y \in Y} n(y)\, v(y) - \sum_{x \in X} m(x)\, u(x)$$
and their problem is to find the prices which maximize the profit,
$$\max P(u, v)$$
among all the competitive prices (u, v) satisfying (*).
Clearly C(T) ≥ P(u, v) for every admissible transference plan T and every couple of competitive prices u, v.
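The inequality C(T) ≥ P(u, v) can be spelled out in one chain. It uses only T ≥ 0 together with the competitiveness condition (*) for the inequality, and the two marginal constraints of an admissible plan for the last equality:

```latex
\begin{aligned}
C(T) = \sum_{x \in X,\, y \in Y} c(x,y)\, T(x,y)
  &\ge \sum_{x \in X,\, y \in Y} \bigl( v(y) - u(x) \bigr)\, T(x,y) \\
  &= \sum_{y \in Y} v(y) \sum_{x \in X} T(x,y)
     - \sum_{x \in X} u(x) \sum_{y \in Y} T(x,y) \\
  &= \sum_{y \in Y} n(y)\, v(y) - \sum_{x \in X} m(x)\, u(x) = P(u,v).
\end{aligned}
```

Equality holds exactly when v(y) − u(x) = c(x, y) on every pair with T(x, y) > 0.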
Duality theorem
Theorem (Min-max and “complementary slackness”)
An admissible transference plan T is optimal if and only if there exist competitive prices (u, v) such that
$$C(T) = P(u, v).$$
In particular
$$\min_{T} C(T) = \max_{(u,v)} P(u, v).$$
Moreover, the “slackness”
$$S(x, y) := c(x, y) + u(x) - v(y) \ge 0$$
satisfies the “complementary slackness principle”
$$T(x, y)\, S(x, y) = 0, \qquad \text{i.e.} \qquad T(x, y) > 0 \ \Rightarrow\ S(x, y) = 0.$$
“If x and y are connected through an optimal transport ray then their respective prices u(x) and v(y) are maximal: v(y) − u(x) = c(x, y).”
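A small numerical check of the theorem, as a sketch. The slackness is taken as c(x, y) + u(x) − v(y), the sign convention that matches the competitiveness condition v(y) − u(x) ≤ c(x, y). The data (three sources and two destinations on a line, cost |x − y|) and the prices were worked out by hand for this illustration; they are assumptions, not part of the slides.

```python
# Illustrative data (an assumption): sources at 0, 1, 3; destinations at 0, 2.
m = [1, 2, 1]
n = [2, 2]
c = [[0, 2],
     [1, 1],
     [3, 1]]            # c[i][j] = |x_i - y_j|
T = [[1, 0],
     [1, 1],
     [0, 1]]            # candidate plan
u = [1, 0, 0]           # buying prices u(x_i)
v = [1, 1]              # selling prices v(y_j)

def cost(T, c):
    return sum(c[i][j] * T[i][j]
               for i in range(len(T)) for j in range(len(T[0])))

def profit(u, v, m, n):
    """P(u, v) = sum_j n_j v_j - sum_i m_i u_i."""
    return (sum(nj * vj for nj, vj in zip(n, v))
            - sum(mi * ui for mi, ui in zip(m, u)))

def slackness(u, v, c):
    """S[i][j] = c_ij + u_i - v_j; competitive prices give S >= 0."""
    return [[c[i][j] + u[i] - v[j] for j in range(len(v))]
            for i in range(len(u))]

S = slackness(u, v, c)
competitive = all(s >= 0 for row in S for s in row)
complementary = all(T[i][j] * S[i][j] == 0
                    for i in range(len(u)) for j in range(len(v)))
print(competitive, complementary)        # True True
print(cost(T, c), profit(u, v, m, n))    # 3 3
```

Since the cost of T equals the profit of a competitive pair (u, v), the theorem certifies that this T is optimal without comparing it against any other plan.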
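Duality via Von Neumann min-max
$$\min_{T} \Bigl\{ \sum_{i,j} c_{i,j} T_{i,j} \ :\ T_{i,j} \ge 0,\quad \sum_j T_{i,j} = m_i,\quad \sum_i T_{i,j} = n_j \Bigr\}.$$
Introduce Lagrange multipliers $S_{i,j} \ge 0$, $u_i$, $v_j$ for the constraints. On the right-hand sides below, T ranges over all real arrays, since the constraints are now enforced by the multipliers:
$$\begin{aligned}
\min_{T} \sum_{i,j} c_{i,j} T_{i,j}
&= \min_{T} \max_{S,u,v} \Bigl[ \sum_{i,j} c_{i,j} T_{i,j} - \sum_{i,j} S_{i,j} T_{i,j}
   + \sum_i u_i \Bigl( \sum_j T_{i,j} - m_i \Bigr) - \sum_j v_j \Bigl( \sum_i T_{i,j} - n_j \Bigr) \Bigr] \\
&= \min_{T} \max_{S,u,v} \Bigl[ \sum_{i,j} T_{i,j} \bigl( c_{i,j} - S_{i,j} + u_i - v_j \bigr) + \sum_j v_j n_j - \sum_i u_i m_i \Bigr] \\
&= \max_{S,u,v} \min_{T} \Bigl[ \sum_{i,j} T_{i,j} \bigl( c_{i,j} - S_{i,j} + u_i - v_j \bigr) + \sum_j v_j n_j - \sum_i u_i m_i \Bigr] \\
&= \max_{S,u,v} \Bigl\{ \sum_j v_j n_j - \sum_i u_i m_i \ :\ c_{i,j} - S_{i,j} + u_i - v_j = 0 \Bigr\} \\
&= \max_{u,v} \Bigl\{ \sum_j v_j n_j - \sum_i u_i m_i \ :\ c_{i,j} + u_i - v_j \ge 0 \Bigr\}.
\end{aligned}$$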
Integrality
Theorem
If the initial and final configurations m(x), n(y) ∈ N are integers, then there exists an integer optimal transference plan T, i.e. T(x, y) ∈ N.
In other words, there is no need to split unit quantities in order to realize the optimal transport.
Corollary
If m(x) ≡ 1 and the n(y) are integers, then an integer optimal transference plan T is associated to a transport map t : X → Y so that
$$T(x, y) > 0 \iff y = t(x).$$
If moreover n(y) ≡ 1 then the map t is one-to-one.
Roughly speaking: from every point x ∈ X a unique transport ray starts, and mass is not split in various directions.
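When m ≡ 1 and n ≡ 1, the corollary says that it is enough to search among one-to-one maps. A brute-force sketch for tiny instances, assuming a made-up 3 × 3 cost matrix and hypothetical function names, enumerates all permutations and rebuilds the 0/1 plan from the best one:

```python
from itertools import permutations

def optimal_map(c):
    """Return (cost, t) where t[i] is the index of the target y = t(x_i).
    Relies on the integrality theorem: with unit masses some optimal
    plan is a permutation, so enumerating permutations is enough."""
    size = len(c)

    def total(t):
        return sum(c[i][t[i]] for i in range(size))

    best = min(permutations(range(size)), key=total)
    return total(best), best

def plan_from_map(t):
    """The integer plan T with T[i][j] = 1 exactly when j = t[i]."""
    size = len(t)
    return [[1 if j == t[i] else 0 for j in range(size)]
            for i in range(size)]

# Illustrative cost matrix (an assumption).
c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]

best_cost, t = optimal_map(c)
print(best_cost, t)        # 5 (1, 0, 2)
print(plan_from_map(t))    # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
```

The enumeration has factorial cost and is only meant to illustrate the statement; practical solvers exploit the linear programming structure instead.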
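Measure data
Each object of the discrete setting has a measure-theoretic counterpart:
- $X, Y$ discrete spaces → $X, Y$ topological spaces ($\mathbb{R}$, $\mathbb{R}^N$, locally compact spaces, Polish (i.e. complete and separable) spaces, Radon spaces, ...): here $\mathbb{R}^N$.
- The cost → a (lower semi-)continuous function $c : X \times Y \to \mathbb{R} \cup \{+\infty\}$.
- The initial and final configurations $m(x)$, $n(y)$ → a couple of Borel measures $\mu, \nu$ on $X$ and $Y$. The mass is normalized to 1. Given $A \subset X$, $B \subset Y$, $\mu(A)$ denotes the quantity of resources available in $A$, $\nu(B)$ denotes the resources expected in $B$.
Transport plan $T$ → a measure $\gamma$ on $X \times Y$: $\gamma(A \times B)$ is the mass coming from $A$ and transported in $B$.
Admissibility: the marginals of $\gamma$ are thus fixed ($\gamma$ is a coupling between $\mu$ and $\nu$):
\[
\gamma(A \times Y) = \mu(A), \qquad \gamma(X \times B) = \nu(B).
\]
$\Gamma(\mu, \nu)$: collection of all the admissible transference plans/couplings.
[Figure: a coupling $\gamma$ on $\mathbb{R}^m \times \mathbb{R}^m$ with marginals $\mu$ and $\nu$; the diagonal $|x - y| = 0$ is marked.]
The cost of a transference plan $\gamma$ is
\[
\sum_{x,y} c(x, y)\, T(x, y) \quad\to\quad C(\gamma) := \int_{X \times Y} c(x, y)\, d\gamma(x, y).
\]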
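For finitely supported measures, the admissibility condition and the cost $C(\gamma)$ reduce to finite sums. The following sketch represents measures as dictionaries mapping points to masses; the point labels, masses, and cost table are hypothetical. It also checks that the product measure $\mu \otimes \nu$ is always an admissible coupling.

```python
from fractions import Fraction as F

# Hypothetical discrete data: two source points, two target points.
mu = {"a": F(1, 2), "b": F(1, 2)}
nu = {"p": F(1, 4), "q": F(3, 4)}
c = {("a", "p"): 1, ("a", "q"): 2, ("b", "p"): 3, ("b", "q"): 1}


def marginals(gamma):
    """Return the two marginals of a plan gamma given as {(x, y): mass}."""
    first, second = {}, {}
    for (x, y), mass in gamma.items():
        first[x] = first.get(x, 0) + mass
        second[y] = second.get(y, 0) + mass
    return first, second


def is_coupling(gamma, mu, nu):
    """gamma is admissible iff its marginals are exactly mu and nu."""
    return marginals(gamma) == (mu, nu)


def total_cost(gamma, c):
    """Discrete version of C(gamma) = integral of c with respect to gamma."""
    return sum(c[x, y] * mass for (x, y), mass in gamma.items())


product_plan = {(x, y): mu[x] * nu[y] for x in mu for y in nu}
other_plan = {("a", "p"): F(1, 4), ("a", "q"): F(1, 4), ("b", "q"): F(1, 2)}

for name, plan in [("product", product_plan), ("other", other_plan)]:
    print(name, is_coupling(plan, mu, nu), total_cost(plan, c))
```

Exact fractions are used so that the marginal comparison is not affected by floating-point rounding.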
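Transport and probability
Discrete setting: points $\{x_1, \dots, x_N\}$ with masses $\{m_1, \dots, m_N\}$, $\mu = \sum_i m_i \delta_{x_i}$, and a transport map $t$ with $y_i = t(x_i)$:
\[
t_\# \mu = \nu = \sum_i m_i \delta_{y_i}.
\]
In terms of measures:
\[
\nu(B) = \sum_{i :\, y_i \in B} m_i = \sum_{i :\, t(x_i) \in B} m_i = \sum_{i :\, x_i \in t^{-1}(B)} m_i = \mu(t^{-1}(B)).
\]
In general, for every Borel map $t : X \to Y$ and every Borel measure $\mu \in \mathcal{P}(X)$ we define
\[
\nu = t_\# \mu \quad\Leftrightarrow\quad \nu(B) = \mu(t^{-1}(B)) \ \text{ for every Borel set } B \subset Y.
\]
Change of variable formula:
\[
\int_X \varphi(t(x))\, d\mu(x) = \int_Y \varphi(y)\, d\nu(y).
\]
In probability: $\mathbb{P}$ is a probability measure on the probability space $\Omega$, $X : \Omega \to \mathcal{X}$ is a random variable, and $X_\# \mathbb{P} \in \mathcal{P}(\mathcal{X})$ is the law of $X$:
\[
X_\# \mathbb{P}(A) = \mathbb{P}[X \in A].
\]
Expectation:
\[
\mathbb{E}[\varphi(X)] = \int_\Omega \varphi(X(\omega))\, d\mathbb{P}(\omega) = \int_{\mathcal{X}} \varphi(x)\, d(X_\# \mathbb{P})(x).
\]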
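The push-forward and the change of variable formula can be verified directly on a finitely supported measure. In this sketch the measure `mu`, the map `t`, and the test function `phi` are hypothetical choices; note that `t` is not injective, so two atoms of `mu` are merged into one atom of the image measure.

```python
from fractions import Fraction as F


def push_forward(mu, t):
    """Return t#mu for a discrete measure mu given as {point: mass}."""
    nu = {}
    for x, mass in mu.items():
        y = t(x)
        nu[y] = nu.get(y, 0) + mass   # masses landing on the same y add up
    return nu


def integrate(phi, measure):
    """Integral of phi against a discrete measure."""
    return sum(phi(z) * mass for z, mass in measure.items())


def t(x):
    return x * x          # hypothetical transport map, not injective


def phi(y):
    return 3 * y + 1      # hypothetical test function


mu = {-1: F(1, 4), 0: F(1, 4), 1: F(1, 2)}
nu = push_forward(mu, t)

lhs = integrate(lambda x: phi(t(x)), mu)   # integral of phi(t(x)) d mu(x)
rhs = integrate(phi, nu)                   # integral of phi(y) d nu(y)
print(nu, lhs, rhs)
```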
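The general problem
Problem
Given two Borel probability measures $\mu \in \mathcal{P}(X)$ and $\nu \in \mathcal{P}(Y)$, find an admissible transference plan $\gamma \in \Gamma(\mu, \nu)$ minimizing the total cost
\[
\min_{\gamma \in \Gamma(\mu,\nu)} C(\gamma).
\]
Kantorovich potentials: functions $u : X \to \mathbb{R}$, $v : Y \to \mathbb{R}$ such that
\[
v(y) - u(x) \le c(x, y) \quad \text{for every } x \in X,\ y \in Y. \tag{$\Pi(c)$}
\]
The discrete dual functional becomes an integral:
\[
\sum_y v(y)\, n(y) - \sum_x u(x)\, m(x) \quad\to\quad P(u, v) := \int_Y v(y)\, d\nu(y) - \int_X u(x)\, d\mu(x).
\]
Problem (Dual formulation)
Find a couple of Kantorovich potentials $(u, v) \in \Pi(c)$ maximizing
\[
\max_{\Pi(c)} P(u, v).
\]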
A fundamental theorem
Assume that the cost is continuous and feasible, e.g.
\[
C(\mu \otimes \nu) = \iint_{X \times Y} c(x, y)\, d(\mu \otimes \nu)(x, y) < +\infty \qquad \text{(sufficient feasibility condition)}.
\]
Theorem
Existence: There exist an optimal transference plan $\gamma_{\rm opt} \in \Gamma(\mu, \nu)$ and a couple of optimal Kantorovich potentials $(u_{\rm opt}, v_{\rm opt}) \in \Pi(c)$.
Duality:
\[
C(\gamma_{\rm opt}) = \min_{\Gamma(\mu,\nu)} C(\gamma) = \max_{\Pi(c)} P(u, v) = P(u_{\rm opt}, v_{\rm opt}).
\]
Slackness: For every $(x, y) \in \operatorname{supp}(\gamma_{\rm opt})$ (connection by a transport ray)
\[
c(x, y) = v_{\rm opt}(y) - u_{\rm opt}(x).
\]
Cyclical monotonicity: For every $(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)$ in the support of $\gamma_{\rm opt}$ and every permutation $\sigma : \{1, 2, \dots, N\} \to \{1, 2, \dots, N\}$
\[
c(x_1, y_1) + \dots + c(x_N, y_N) \le c(x_1, y_{\sigma(1)}) + \dots + c(x_N, y_{\sigma(N)}).
\]
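For a finite family of pairs, the cyclical monotonicity condition can be tested by brute force over all permutations. This is a small sketch with hypothetical sample data and the quadratic cost on the real line; it is exponential in the number of pairs and only meant to make the inequality concrete. Permutations that fix some indices cover the sub-families, so one pass over the full family is enough.

```python
from itertools import permutations


def is_cyclically_monotone(pairs, cost):
    """Check sum c(x_k, y_k) <= sum c(x_k, y_sigma(k)) for every permutation sigma."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    base = sum(cost(x, y) for x, y in pairs)
    return all(
        base <= sum(cost(x, ys[sigma[k]]) for k, x in enumerate(xs))
        for sigma in permutations(range(len(pairs)))
    )


def quadratic(x, y):
    return (x - y) ** 2


# Hypothetical supports on the real line.
monotone_pairs = [(0, 0), (1, 2), (3, 5)]   # order-preserving pairing
crossing_pairs = [(0, 5), (3, 0)]           # the two rays cross

print(is_cyclically_monotone(monotone_pairs, quadratic))
print(is_cyclically_monotone(crossing_pairs, quadratic))
```

In the second family, swapping the two targets lowers the total cost from 34 to 4, so the pairing cannot be the support of an optimal plan.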
Some important questions
- Uniqueness of the optimal transference plan
- Integrality
- Links with the geometry: the cost function $c(x, y)$ depends on the distance between $x$ and $y$ ($|x - y|$ when $X = Y = \mathbb{R}^d$)
- Regularity of Kantorovich potentials
- Existence of a transport map
Further information is available when the measures $\mu = f \mathcal{L}^d$ and $\nu = g \mathcal{L}^d$ are absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^d$:
\[
\mu(A) = \int_A f(x)\, dx, \qquad \nu(B) = \int_B g(y)\, dy.
\]
All these questions are closely linked!
From now on we will consider the Euclidean case $X = Y = \mathbb{R}^d$.
Integrality and transport maps
At the continuous level the integrality condition could be informally stated by asking that (almost) every point $x$ is the starting point of at most one transport ray.
We say that $y$ is connected to $x$ by a transport ray if $(x, y) \in \operatorname{supp} \gamma$; thus we have
\[
(x, y_1),\ (x, y_2) \in \operatorname{supp} \gamma \quad\Rightarrow\quad y_1 = y_2 =: t(x),
\]
a property which should hold $\mu$-almost everywhere.
$t : X \to Y$ is called the transport map induced by the plan $\gamma$. It satisfies
\[
\text{if } A = t^{-1}(B) \text{ then } \mu(A) = \nu(B) = \gamma(A \times B).
\]
Recalling the change-of-variable formula, if $\mu = f\, dx$, $\nu = g\, dy$, and $t$ is differentiable and injective,
\[
\mu(A) = \int_A f(x)\, dx = \nu(B) = \int_B g(y)\, dy = \int_A g(t(x))\, |\det Dt(x)|\, dx
\]
so that
\[
f(x) = g(t(x))\, |\det Dt(x)|.
\]
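The Jacobian identity can be checked numerically in one dimension. The example below is a hypothetical one, not from the slides: $\mu$ is uniform on $(0,1)$ and $t(x) = x^2$, so that $\nu = t_\# \mu$ has density $g(y) = 1/(2\sqrt{y})$ on $(0,1)$. The integrals are approximated with a plain midpoint rule.

```python
import math


def f(x):
    """Density of mu: uniform on (0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0


def t(x):
    """Hypothetical transport map, increasing on (0, 1)."""
    return x * x


def dt(x):
    """Derivative of t."""
    return 2.0 * x


def g(y):
    """Density of nu = t#mu on (0, 1)."""
    return 1.0 / (2.0 * math.sqrt(y)) if 0.0 < y < 1.0 else 0.0


def jacobian_residual(x):
    """f(x) - g(t(x)) |t'(x)|, which should vanish on (0, 1)."""
    return f(x) - g(t(x)) * abs(dt(x))


def midpoint_integral(h, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / steps
    return width * sum(h(a + (k + 0.5) * width) for k in range(steps))


# B = (0.25, 0.81) and A = t^{-1}(B) = (0.5, 0.9) must carry the same mass.
mass_mu = midpoint_integral(f, 0.5, 0.9)
mass_nu = midpoint_integral(g, 0.25, 0.81)
print(mass_mu, mass_nu, [jacobian_residual(x) for x in (0.1, 0.5, 0.9)])
```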
Existence and uniqueness of the optimal transport map: $c(x, y) = \frac12 |x - y|^2$
Theorem (Brenier (1989))
Let $\mu = f\, dx$, $\nu = g\, dy$, $c(x, y) := \frac12 |x - y|^2$.
- There exists a unique optimal transference plan $\gamma$ and it is associated to a transport map $t$.
- The Kantorovich potentials are perturbations of convex functions; more precisely
\[
\tfrac12 |x|^2 + u(x) = \varphi(x) \quad\text{and}\quad \tfrac12 |y|^2 - v(y) = \psi(y) \quad \text{are convex,}
\]
and $\psi$ is the Legendre transform of $\varphi$:
\[
\psi(y) = \varphi^*(y) = \sup_x \big( \langle y, x \rangle - \varphi(x) \big).
\]
- $t(x) = \nabla\varphi(x) = x + \nabla u(x)$ is the gradient of a convex function; it is essentially injective and a.e. differentiable, and $Dt = D^2 \varphi$ is positive definite.
- $\varphi$ solves the Monge-Ampère equation
\[
\det D^2 \varphi(x) = \frac{f(x)}{g(\nabla\varphi(x))}.
\]
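As a worked illustration (a standard one-dimensional example, not taken from the slides), take $\mu = N(0,1)$ and $\nu = N(a, \sigma^2)$ with $\sigma > 0$. The symbols $a$ and $\sigma$ are assumptions of this example. The increasing affine map is the derivative of a convex function and satisfies the Monge-Ampère equation, so it is the optimal map given by the theorem:

```latex
\[
f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad
g(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(y-a)^2/(2\sigma^2)}, \qquad
\varphi(x) = a x + \frac{\sigma}{2} x^2 \quad (\text{convex since } \sigma > 0).
\]
\[
t(x) = \varphi'(x) = a + \sigma x, \qquad
g(t(x)) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/2}
\quad\Longrightarrow\quad
\frac{f(x)}{g(\varphi'(x))} = \sigma = \varphi''(x).
\]
```

In dimension one $\det D^2\varphi$ is just $\varphi''$, so the last identity is the Monge-Ampère equation for this pair of densities.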
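Brenier theorem
$\mu = f\, dx$, $\nu = g\, dx$ are absolutely continuous in $\mathbb{R}^d$.
The optimal coupling $\gamma \in \Gamma_o(\mu, \nu)$ is concentrated on the graph of a cyclically monotone map $t$:
\[
\gamma = (i \times t)_\# \mu, \qquad W_2^2(\mu, \nu) = \int_{\mathbb{R}^d} |x - t(x)|^2\, d\mu(x).
\]
[Figure: the coupling $\gamma$ on $\mathbb{R}^d \times \mathbb{R}^d$, concentrated on the graph of $t$, with marginals $\mu$ and $\nu$.]
$t$ can be recovered from the optimal Kantorovich potentials $(u, v)$ satisfying
\[
v(y) - u(x) \le \tfrac12 |x - y|^2, \qquad \tfrac12 W_2^2(\mu, \nu) = \int v(y)\, d\nu(y) - \int u(x)\, d\mu(x)
\]
by
\[
t(x) = x + \nabla u(x) = \nabla \Big( \tfrac12 |x|^2 + u(x) \Big), \qquad \tfrac12 |x|^2 + u(x) \ \text{is convex.}
\]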
Extensions and applications
- Strictly convex costs $c(x, y) = h(|x - y|)$: Gangbo-McCann, ... ('96-)
- Monge problem $c(x, y) = |x - y|$: Sudakov ('79), Ambrosio (2000), ..., Bianchini, Champion-De Pascale, ...
- Regularity: Caffarelli, ... ('92-), Wang, Trudinger, Loeper, Villani, McCann
- Isoperimetric and functional inequalities: Gromov, Villani, Otto, McCann, Maggi, Figalli, Pratelli, ...
- Hilbert and Wiener spaces: Feyel-Üstünel, Ambrosio-Gigli-S. ('04-), ...
- Riemannian manifolds, Ricci flow: McCann, Sturm, Villani, Lott, Topping, Carfora, ... ('98-)
- ...
A distance between probability measures
The quadratic cost $c(x, y) = |x - y|^2$ induces a distance between probability measures with finite quadratic moment ($\mathcal{P}_2(\mathbb{R}^d)$): the so-called Kantorovich-Rubinstein-Wasserstein distance
\[
W_2(\mu, \nu) := \big( C(\mu, \nu) \big)^{1/2} = \Big( \min_{\gamma \in \Gamma(\mu,\nu)} \iint |x - y|^2\, d\gamma(x, y) \Big)^{1/2}.
\]
This distance has a simple interpretation in the case of discrete measures: if
\[
\mu = \frac1N \sum_{k=1}^N \delta_{x_k} \quad\text{and}\quad \nu = \frac1N \sum_{k=1}^N \delta_{y_k}
\]
then
\[
W_2^2(\mu, \nu) = \min_\sigma \frac1N \sum_{k=1}^N |x_k - y_{\sigma(k)}|^2, \qquad \sigma \text{ permutation of } \{1, 2, \dots, N\}.
\]
$(\mathcal{P}_2(\mathbb{R}^d), W_2)$ is a complete and separable metric space; the distance $W_2$ is associated to the weak convergence of measures:
\[
W_2(\mu_n, \mu) \to 0 \quad\Leftrightarrow\quad
\int \zeta(x)\, d\mu_n(x) \to \int \zeta(x)\, d\mu(x)
\quad \text{for every } \zeta \in C^0(\mathbb{R}^d) \text{ with } |\zeta(x)| \le A |x|^2 + B.
\]
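The permutation formula for uniform discrete measures translates directly into code. The sketch below is a brute-force illustration with hypothetical sample points; it enumerates all $N!$ permutations, so it is only usable for very small $N$.

```python
from itertools import permutations
import math


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def w2_squared_uniform(xs, ys):
    """W_2^2 between (1/N) sum delta_{x_k} and (1/N) sum delta_{y_k}, by brute force."""
    if len(xs) != len(ys):
        raise ValueError("both measures must have the same number of atoms")
    count = len(xs)
    best = min(
        sum(squared_distance(xs[k], ys[sigma[k]]) for k in range(count))
        for sigma in permutations(range(count))
    )
    return best / count


# Hypothetical point clouds.
line_x = [(0,), (1,), (3,)]
line_y = [(5,), (2,), (0,)]
plane_x = [(0, 0), (2, 0)]
plane_y = [(0, 1), (2, 1)]

print(w2_squared_uniform(line_x, line_y))
print(math.sqrt(w2_squared_uniform(plane_x, plane_y)))
```

In the planar example the optimal permutation moves each point vertically by one unit, so the distance equals the length of that common displacement.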
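Weak convergence, lower semicontinuity, and compactness
Definition (Weak convergence)
A sequence $\mu_n \in \mathcal{P}(\mathbb{R}^d)$ converges weakly to $\mu \in \mathcal{P}(\mathbb{R}^d)$ if
\[
\lim_{n \to +\infty} \int_{\mathbb{R}^d} \varphi(x)\, d\mu_n(x) = \int_{\mathbb{R}^d} \varphi(x)\, d\mu(x) \qquad \forall\, \varphi \in C^0_b(\mathbb{R}^d).
\]
- Test functions $\varphi$ can be equivalently chosen in $C^0_c(\mathbb{R}^d)$ or in $C^\infty_c(\mathbb{R}^d)$, as for distributional convergence.
- If $X_n \to X$ pointwise, then $(X_n)_\# \mathbb{P} \rightharpoonup X_\# \mathbb{P}$.
- If $\zeta : \mathbb{R}^d \to [0, +\infty]$ is just lower semicontinuous (no boundedness is required) and $\mu_n \rightharpoonup \mu$ then
\[
\liminf_{n \to +\infty} \int_{\mathbb{R}^d} \zeta(x)\, d\mu_n(x) \ge \int_{\mathbb{R}^d} \zeta(x)\, d\mu(x).
\]
- Prokhorov Theorem: A set $\Gamma \subset \mathcal{P}(\mathbb{R}^d)$ is weakly relatively compact iff it is tight, i.e. for every $\varepsilon > 0$ there exists a compact set $K \Subset \mathbb{R}^d$ such that
\[
\mu(\mathbb{R}^d \setminus K) \le \varepsilon \qquad \forall\, \mu \in \Gamma.
\]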
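A one-line example shows why the lower semicontinuity statement is an inequality and not an equality. The sketch below uses the hypothetical sequence $\mu_n = \delta_{1/n}$, which converges weakly to $\delta_0$, together with one bounded continuous test function and the indicator of the open half-line $(0, +\infty)$, which is lower semicontinuous but not continuous. Testing a single function does not prove weak convergence; it only illustrates the definitions.

```python
import math


def integrate(phi, measure):
    """Integral of phi against a discrete measure given as {point: mass}."""
    return sum(phi(x) * mass for x, mass in measure.items())


def delta(x):
    """Dirac mass at x."""
    return {x: 1.0}


def zeta(x):
    """Indicator of the open set (0, +infinity): lower semicontinuous."""
    return 1.0 if x > 0 else 0.0


indices = (1, 10, 100, 1000)
limit = delta(0.0)

# Bounded continuous test function: the integrals converge to the limit value.
gaps = [abs(integrate(math.cos, delta(1 / n)) - integrate(math.cos, limit)) for n in indices]

# Lower semicontinuous function: the liminf can be strictly larger than the limit value.
zeta_along_sequence = [integrate(zeta, delta(1 / n)) for n in indices]
zeta_at_limit = integrate(zeta, limit)

print(gaps)
print(zeta_along_sequence, zeta_at_limit)
```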
Optimal couplings and triangular inequality
Lower semicontinuity and tightness: the minimum problem
\[
W_2^2(\mu_1,\mu_2) := \min\Big\{ \int_{\mathbb{R}^m\times\mathbb{R}^m} |x_1-x_2|^2\,d\boldsymbol{\mu}(x_1,x_2) \;:\; \boldsymbol{\mu}\in\Gamma(\mu_1,\mu_2) \Big\}
\]
is attained: $\Gamma_o(\mu_1,\mu_2)$ denotes the collection (a closed, convex set) of all the optimal couplings in $\mathcal{P}_2(\mathbb{R}^m\times\mathbb{R}^m)$. In general, more than one optimal coupling may exist.
Connecting a sequence of measures, disintegration and Kolmogorov theorem:
if $\boldsymbol{\mu}_{1,2}\in\Gamma_o(\mu_1,\mu_2)$, $\boldsymbol{\mu}_{2,3}\in\Gamma_o(\mu_2,\mu_3)$, $\dots$, $\boldsymbol{\mu}_{j,j+1}\in\Gamma_o(\mu_j,\mu_{j+1})$, then there exist a probability measure $\mathbb{P}$ and random variables $X_1, X_2, X_3, \dots, X_j, X_{j+1}, \dots$ such that $\boldsymbol{\mu}_{1,2}=(X_1,X_2)_\#\mathbb{P}$, $\dots$, $\boldsymbol{\mu}_{j,j+1}=(X_j,X_{j+1})_\#\mathbb{P}$.
In particular
\[
W_2^2(\mu_j,\mu_{j+1}) = \mathbb{E}\big[|X_j-X_{j+1}|^2\big],
\]
and $(X_h,X_k)_\#\mathbb{P}\in\Gamma(\mu_h,\mu_k)$, but it is not optimal in general if $h,k$ are not consecutive.
Application: $W_2$ is a distance, triangular inequality.
\[
W_2(\mu_1,\mu_3) \le W_2(\mu_1,\mu_2) + W_2(\mu_2,\mu_3)
\]
\begin{align*}
W_2(\mu_1,\mu_3) &\le \Big(\mathbb{E}\big[|X_1-X_3|^2\big]\Big)^{1/2} = \Big(\mathbb{E}\big[|(X_1-X_2)+(X_2-X_3)|^2\big]\Big)^{1/2} \\
&\le \Big(\mathbb{E}\big[|X_1-X_2|^2\big]\Big)^{1/2} + \Big(\mathbb{E}\big[|X_2-X_3|^2\big]\Big)^{1/2} = W_2(\mu_1,\mu_2) + W_2(\mu_2,\mu_3).
\end{align*}
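The triangular inequality can be checked numerically in a simple setting. The sketch below assumes one-dimensional empirical measures with the same number of equally weighted atoms, for which the monotone (sorted) matching is an optimal coupling; the function name `w2_empirical_1d` and the three sample distributions are illustrative choices, not part of the lecture.

```python
import math
import random

def w2_empirical_1d(xs, ys):
    """W2 between two empirical measures on the line with the same number of
    equally weighted atoms: match the sorted samples and average the squares."""
    assert len(xs) == len(ys)
    total = sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys)))
    return math.sqrt(total / len(xs))

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 500
mu1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
mu2 = [rng.uniform(-2.0, 3.0) for _ in range(n)]
mu3 = [rng.expovariate(1.0) for _ in range(n)]

d12 = w2_empirical_1d(mu1, mu2)
d23 = w2_empirical_1d(mu2, mu3)
d13 = w2_empirical_1d(mu1, mu3)

print(d13, "<=", d12 + d23)
```

In this one-dimensional setting the sorted atoms play the role of the random variables $X_1, X_2, X_3$: the $i$-th sorted atoms of the three samples are glued together, and the Minkowski inequality for the resulting vectors gives the bound.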
“Soft” properties
• Convergence with respect to $W_2$ ⇔ weak convergence + convergence of the quadratic moments.
• Completeness (if one considers all the probability measures in $\mathcal{P}_2(\mathbb{R}^m)$).
• Lower semicontinuity with respect to weak/distributional convergence.
• Convexity (but linear segments are not geodesics!).
• Existence of (constant speed, minimizing) geodesics connecting arbitrary measures $\mu_0, \mu_1$: they are curves $\mu : t\in[0,1]\mapsto\mu_t$ such that
\[
W_2(\mu_0,\mu_1) = L_0^1[\mu], \qquad W_2(\mu_s,\mu_t) = |t-s|\,W_2(\mu_0,\mu_1).
\]
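The last two points can be illustrated on the real line. The sketch below assumes empirical measures with equally weighted atoms and builds the curve obtained by moving each sorted atom of $\mu_0$ along a straight line to the matching sorted atom of $\mu_1$; it then checks the constant-speed identity. It also computes $W_2$ along the linear segment $(1-t)\,\delta_0 + t\,\delta_1$, where the distance from $\delta_0$ is $\sqrt{t}$ rather than $t$. All function names and sample points are hypothetical choices made for the example.

```python
import math

def w2_empirical_1d(xs, ys):
    """W2 between 1D empirical measures with equally weighted atoms (sorted matching)."""
    assert len(xs) == len(ys)
    total = sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys)))
    return math.sqrt(total / len(xs))

def displacement_interpolation(xs, ys, t):
    """Atoms of mu_t: each sorted atom of mu_0 travels linearly to its match in mu_1."""
    return [(1 - t) * x + t * y for x, y in zip(sorted(xs), sorted(ys))]

def w2_linear_segment_to_delta0(t):
    """W2 between delta_0 and the mixture (1 - t) * delta_0 + t * delta_1.
    Couplings with a Dirac mass are unique, so W2^2 is the second moment."""
    second_moment = (1 - t) * 0.0 ** 2 + t * 1.0 ** 2
    return math.sqrt(second_moment)

x0 = [-1.0, 0.0, 0.5, 2.0]
x1 = [3.0, 1.0, 4.0, 1.5]
d01 = w2_empirical_1d(x0, x1)

s, t = 0.2, 0.7
d_st = w2_empirical_1d(displacement_interpolation(x0, x1, s),
                       displacement_interpolation(x0, x1, t))

print(d_st, abs(t - s) * d01)                  # equal up to rounding: constant speed
print(w2_linear_segment_to_delta0(0.25), 0.25) # sqrt(t) versus t: not a geodesic
```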