The structured condition number of a differentiable function
between matrix manifolds
Vanni Noferini (University of Essex)
Joint work with
B. Arslan (TU Bursa)
F. Tisseur (U Manchester)
16/2/2017, 2gALN, Como (mufloun land)
Vanni Noferini
Structured conditioning of matrix maps
1 / 22
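Motivation
Let $a > 0$. Let us build a symplectic matrix as follows:
$$X = \begin{bmatrix} e^{a} & 0 \\ 0 & e^{-a} \end{bmatrix} \;\Rightarrow\; \ln X = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix}$$
(Its matrix logarithm must be Hamiltonian.)
A tiny (norm $= \epsilon \|X\|$) perturbation in $X$ ...
$$\tilde X = \begin{bmatrix} e^{a} & 0 \\ 0 & \epsilon e^{a} + e^{-a} \end{bmatrix} \;\Rightarrow\; \ln \tilde X = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix} + \epsilon \begin{bmatrix} 0 & 0 \\ 0 & e^{2a} \end{bmatrix} + o(\epsilon)$$
It cannot be worse than this, so the relative conditioning is $e^{2a}/a$.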
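The first-order term in the perturbed logarithm comes from a one-line expansion of the scalar logarithm in the (2,2) entry. The step is spelled out here only as a check; it is not part of the original slides.

```latex
\[
\ln\bigl(e^{-a} + \epsilon e^{a}\bigr)
  = \ln\bigl(e^{-a}(1 + \epsilon e^{2a})\bigr)
  = -a + \ln\bigl(1 + \epsilon e^{2a}\bigr)
  = -a + \epsilon e^{2a} + o(\epsilon).
\]
```

Since $\|\ln X\|_2 = a$ and the relative size of the perturbation in $X$ is $\epsilon$, the ratio of relative changes is $(\epsilon e^{2a}/a)/\epsilon = e^{2a}/a$.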
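However...
$$X = \begin{bmatrix} e^{a} & 0 \\ 0 & e^{-a} \end{bmatrix} \;\Rightarrow\; \ln X = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix}$$
If we enforce that the symplectic structure is preserved, the worst that we can do is much milder!
$$\tilde X = \begin{bmatrix} e^{a} & \epsilon e^{a} \\ 0 & e^{-a} \end{bmatrix} \;\Rightarrow\; \ln \tilde X = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix} + \epsilon \begin{bmatrix} 0 & \dfrac{a e^{a}}{\sinh(a)} \\ 0 & 0 \end{bmatrix} + o(\epsilon)$$
The relative conditioning is
$$\frac{e^{a}}{\sinh(a)} < \frac{e^{2a}}{a}$$
(for moderate or large values of $a$!)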
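The two growth factors can be compared numerically. The following is a minimal sketch, not taken from the slides: the function name `relative_growth` and the values of `a` and `eps` are illustrative assumptions, and the closed form for the logarithm of an upper triangular 2×2 matrix is the standard divided-difference formula.

```python
import math

def relative_growth(a, eps, structured):
    """Ratio (relative change in ln X) / (relative change in X), 2-norm.

    X = diag(e^a, e^-a). Both perturbations have a single nonzero entry of
    size eps*e^a, and so does the resulting change in ln X, so each 2-norm
    is just the absolute value of that entry.
    """
    norm_X, norm_logX = math.exp(a), a
    delta = eps * math.exp(a)  # size of the perturbation in X
    if structured:
        # X~ = [[e^a, delta], [0, e^-a]] keeps det = 1 (symplectic for n = 2).
        # Off-diagonal entry of the log of an upper triangular 2x2 matrix:
        # delta * (ln l1 - ln l2) / (l1 - l2), with l1 = e^a, l2 = e^-a.
        change = delta * (2 * a) / (math.exp(a) - math.exp(-a))
    else:
        # X~ = diag(e^a, e^-a + delta): only the (2,2) entry of the log moves.
        change = abs(math.log(math.exp(-a) + delta) - (-a))
    return (change / norm_logX) / (delta / norm_X)

a, eps = 3.0, 1e-10
unstructured = relative_growth(a, eps, structured=False)
structured = relative_growth(a, eps, structured=True)
print("unstructured:", unstructured, "vs e^{2a}/a =", math.exp(2 * a) / a)
print("structured:  ", structured, "vs e^a/sinh(a) =", math.exp(a) / math.sinh(a))
```

The gap widens quickly with `a`, because $e^{a}/\sinh(a)$ tends to 2 while $e^{2a}/a$ grows without bound.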
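Condition number and derivatives
Let $\mathbb{K} \in \{\mathbb{R}, \mathbb{C}\}$ and
$$f : \mathbb{K}^{n\times n} \to \mathbb{K}^{n\times n}$$
be $\mathbb{K}$-Fréchet differentiable.
Define
$$\mathrm{cond}(f, X) = \lim_{\epsilon \to 0} \; \sup_{\|Y - X\| \le \epsilon} \frac{\|f(Y) - f(X)\|}{\epsilon}.$$
Let $K_f(X)$ be the Fréchet derivative of $f$ at $X$. Then,
$$\mathrm{cond}(f, X) = \|K_f(X)\|.$$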
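As a small illustration that is not on the slides, take $f(X) = X^2$. Writing $L_f(X, E)$ for the Fréchet derivative applied to a direction $E$ (the notation used in the power-method slide) and using the standard identity $\mathrm{vec}(AEB) = (B^T \otimes A)\,\mathrm{vec}(E)$, the derivative and its matrix representation are:

```latex
\[
f(X+E) - f(X) = XE + EX + E^2
\quad\Longrightarrow\quad
L_f(X, E) = XE + EX,
\]
\[
\mathrm{vec}\bigl(L_f(X, E)\bigr)
  = \bigl(I_n \otimes X + X^T \otimes I_n\bigr)\,\mathrm{vec}(E)
  =: K_f(X)\,\mathrm{vec}(E).
\]
```

In the Frobenius norm, $\|L_f(X,E)\|_F = \|K_f(X)\,\mathrm{vec}(E)\|_2$, so for this $f$ the condition number is the 2-norm of the $n^2 \times n^2$ matrix $K_f(X)$.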
Restricting the perturbation
Suppose now $X \in \mathcal{M}$, a submanifold of $\mathbb{K}^{n\times n}$. Backward error analysis of a structure-preserving algorithm enforces a perturbation $E := Y - X$ such that $Y = X + E \in \mathcal{M}$.
Structured condition number:
$$\mathrm{cond}_s(f, X) = \lim_{\epsilon \to 0} \; \sup_{\|Y - X\| \le \epsilon,\; Y \in \mathcal{M}} \frac{\|f(Y) - f(X)\|}{\epsilon}.$$
[Figure: the set $E$ of all admissible perturbations is restricted to the subset $S_E$ of structured perturbations.]
Hence, by definition of supremum,
$$\mathrm{cond}_s(f, X) \le \mathrm{cond}(f, X).$$
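Analyzing the condition number
Two cases:
1. If $\mathcal{M}$ is a vector subspace (a flat manifold), one just needs to project the Fréchet derivative appropriately [Davies ’04].
2. If not, to do the same we need a few tools from differential geometry.
Notation:
$\mathbb{F}$ (base field), $\mathbb{K}$ (ambient field) $\in \{\mathbb{R}, \mathbb{C}\}$ ($\mathbb{K} = \mathbb{R} \Rightarrow \mathbb{F} = \mathbb{R}$);
$\mathcal{M}$, an $\mathbb{F}$-differentiable submanifold of $\mathbb{K}^{n\times n}$;
$f : \mathcal{M} \to \mathcal{N} \subseteq \mathbb{K}^{n\times n}$, an $\mathbb{F}$-differentiable function.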
The differential
It plays the role of the derivative for functions between manifolds: it is the best (local) $\mathbb{F}$-linear approximation to $f$, and it maps the tangent space of the domain at a point to the tangent space of the image at the corresponding point.
[Figure: two surfaces, $\mathcal{M}$ and $\mathcal{N}$. The differential $df(A)$ maps the tangent space $T_A(\mathcal{M})$ at $A$ to the tangent space $T_{f(A)}(\mathcal{N})$ at $f(A)$.]
Theoretical tool: differential geometry
Theorem
Let $\mathcal{M} \subseteq \mathbb{K}^{n\times n}$ be an $\mathbb{F}$-differentiable manifold and let $f : \mathcal{M} \to \mathcal{N}$ be $\mathbb{F}$-differentiable. Then
$$\mathrm{cond}_s(f, X) = \|df(X)\|.$$
How to choose F?
In this theory K is the ambient field (the field in which the elements of
your matrices lie).
How does one pick the base field F?
If K = C, the manifold M is complex differentiable, and the function f is
complex differentiable, then F = C. Otherwise, if at least one object is only
real differentiable, F = R.
From theory to computation
The theory is simple and elegant, but computing the (un)structured condition number is an issue, since the cost of a naive algorithm is $O(n^5)$ (u.c.n.) or $O(n^6)$ (s.c.n.). Therefore, we now need to get our hands a bit dirty...
$T_X\mathcal{M}$ is an $\mathbb{F}$-linear subspace of $\mathbb{K}^{n\times n}$ of dimension, say, $p$.
Therefore it admits a basis that we represent as a matrix $B \in \mathbb{K}^{n^2 \times p}$.
$B$ depends on $X$ unless the manifold is flat!
Hence one can show that an arbitrary (local) projection does not suffice... but it almost does. For the F-norm condition number:
$$\|K_f(X)B\|_2 \, \|B\|_2^{-1} \le \mathrm{cond}_s(f, X) \le \|K_f(X)B\|_2 \, \|B^{+}\|_2$$
Note that if $B^*B = I_p$ then $\|B\|_2 = \|B^{+}\|_2 = 1$: an orthogonal projection does suffice!
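The two-sided bound can be checked on a toy example. The sketch below is an illustration under stated assumptions, not the authors' code: `K` is an arbitrary 4×4 real matrix standing in for $K_f(X)$, `B` is an arbitrary non-orthonormal basis with $p = 2$, and all norms are approximated by sampling unit vectors $z$ rather than by an SVD. It uses the fact that the structured condition number is the supremum of $\|K_f(X)Bz\|_2 / \|Bz\|_2$ over $z \ne 0$.

```python
import math

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

# Hypothetical data: K stands in for K_f(X) (n^2 = 4), B is a non-orthonormal
# basis of a 2-dimensional "tangent space" (p = 2).
K = [[2.0, 0.5, 0.0, 1.0],
     [0.0, 1.0, 3.0, 0.0],
     [1.0, 0.0, 1.0, 0.5],
     [0.0, 2.0, 0.0, 1.0]]
B = [[1.0, 0.9],
     [0.0, 0.1],
     [1.0, 1.0],
     [0.0, 0.2]]

# Sample unit vectors z = (cos t, sin t); half a turn covers all directions
# because the norms below are unchanged under z -> -z.
N = 2000
zs = [(math.cos(math.pi * i / N), math.sin(math.pi * i / N)) for i in range(N)]
Bz = [norm(matvec(B, z)) for z in zs]
KBz = [norm(matvec(K, matvec(B, z))) for z in zs]

norm_KB = max(KBz)               # approximates ||K B||_2
norm_B = max(Bz)                 # approximates ||B||_2
norm_B_pinv = 1.0 / min(Bz)      # approximates ||B^+||_2 = 1 / sigma_min(B)
cond_s = max(k / b for k, b in zip(KBz, Bz))  # sup over the tangent space

lower, upper = norm_KB / norm_B, norm_KB * norm_B_pinv
print(lower, "<=", cond_s, "<=", upper)
```

Replacing `B` by a matrix with orthonormal columns makes `norm_B` and `norm_B_pinv` both equal to 1 (up to sampling error), so the two bounds collapse onto `cond_s`.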
Computational tool: the power method
This algorithm applies the power method to estimate a lower bound $\gamma \le \|K_f(A)B\|_2$.
  Start with $z_0 \in \mathbb{F}^p$.
  for $k = 0 : \infty$
    $\mathrm{vec}(E_k) = B z_k$
    $W_{k+1} = L_f(A, E_k)$
    $Y_{k+1} = L_f^\star(A, W_{k+1})$
    $z_{k+1} = B^* \mathrm{vec}(Y_{k+1})$
    $\gamma_{k+1} = \|z_{k+1}\|_2 / \|W_{k+1}\|_F$
    if converged, $\gamma = \gamma_{k+1}$, quit, end
  end
where
$$L_f^\star(A, E) = \begin{cases} L_f(A^T, E), & \mathbb{K} = \mathbb{R}, \\ L_{\bar f}(A^*, E), & \mathbb{K} = \mathbb{C}. \end{cases}$$
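A minimal sketch of the iteration above in plain Python, under several assumptions that are not from the slides: the map is $f(X) = X^2$ (so $L_f(X,E) = XE + EX$), everything is real, the iterate is rescaled at each step to avoid overflow, a fixed iteration count replaces the convergence test, and the tangent basis is built from an ad hoc basis of the 2×2 trace-zero (Hamiltonian) matrices. The names `structured_power_method`, `L`, and `L_adj` are illustrative.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def fro(A):
    return math.sqrt(sum(x * x for row in A for x in row))

def L(X, E):
    """Frechet derivative of f(X) = X^2 in the direction E."""
    return add(matmul(X, E), matmul(E, X))

def L_adj(X, W):
    """Adjoint of L for real X: L_f(X^T, W)."""
    return L(transpose(X), W)

def structured_power_method(X, basis, iters=100):
    """Estimate gamma <= ||K_f(X) B||_2, where column i of B is vec(basis[i])."""
    n, p = len(X), len(basis)
    z = [1.0] * p
    gamma = 0.0
    for _ in range(iters):
        scale = math.sqrt(sum(c * c for c in z))
        z = [c / scale for c in z]  # rescaling only; gamma is scale-invariant
        # vec(E) = B z, assembled directly as a matrix
        E = [[sum(z[i] * basis[i][r][c] for i in range(p)) for c in range(n)]
             for r in range(n)]
        W = L(X, E)
        Y = L_adj(X, W)
        # z = B^T vec(Y): Frobenius inner products with the basis matrices
        z = [sum(basis[i][r][c] * Y[r][c] for r in range(n) for c in range(n))
             for i in range(p)]
        gamma = math.sqrt(sum(c * c for c in z)) / fro(W)
    return gamma

a = 1.0
X = [[math.exp(a), 0.0], [0.0, math.exp(-a)]]   # symplectic for n = 2
lie_algebra = [[[1.0, 0.0], [0.0, -1.0]],       # trace-zero 2x2 matrices
               [[0.0, 1.0], [0.0, 0.0]],
               [[0.0, 0.0], [1.0, 0.0]]]
basis = [matmul(X, F) for F in lie_algebra]     # tangent directions E = X F
gamma = structured_power_method(X, basis)
print(gamma)
```

For this diagonal `X` the three basis directions are mapped by `L` to mutually orthogonal matrices, so the exact value $\|K_f(X)B\|_2 = 2\sqrt{e^{4a} + e^{-4a}}$ is available for comparison with `gamma`.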
So what can one do?
1. Typically, $p = O(n^2)$.
2. Estimating $\|K_f(X)B\|$ from below via a power method costs $O(n^2 p)$ in general, but in many special cases the structure of $B$ can be exploited to reduce this cost to $O(np)$.
3. In most applications, a basis $B$ either costs $O(n^3)$ to compute or comes for free, with no computation needed. Similarly, $\|B\|$ and $\|B^{+}\|$ are typically either known from theoretical observations or computable in $O(n^3)$.
4. Orthonormalizing $B$ costs $O(n^2 p^2)$, except for very exceptional manifolds (for which $B$ is orthonormal from the beginning).
⇒ Typically, computing a rigorous lower bound via the power method costs $O(n^3)$, while computing the exact structured condition number costs $O(n^6)$.
Structured Matrices
A scalar product $\langle \cdot, \cdot \rangle_M$ is a nondegenerate ($M$ nonsingular) bilinear or sesquilinear form on $\mathbb{K}^n$:
$$\langle x, y \rangle_M = \begin{cases} x^T M y, & \text{real or complex bilinear forms}, \\ x^* M y, & \text{sesquilinear forms}. \end{cases}$$
Any matrix $A$ has a unique adjoint $A^\star$ defined by
$$\langle Ax, y \rangle_M = \langle x, A^\star y \rangle_M \quad \forall x, y \in \mathbb{K}^n.$$
The formula for $A^\star$ is given by
$$A^\star = \begin{cases} M^{-1} A^T M, & \text{real or complex bilinear forms}, \\ M^{-1} A^* M, & \text{sesquilinear forms}. \end{cases}$$
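The adjoint formula can be verified numerically for the real bilinear form defined by the 2×2 matrix $J$. This is a minimal sketch with made-up test data (`A`, `x`, `y`); the helper names `form` and `adjoint` are illustrative, not from the slides.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def form(x, y, M):
    """Real bilinear scalar product <x, y>_M = x^T M y."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

J = [[0.0, 1.0], [-1.0, 0.0]]
J_INV = [[0.0, -1.0], [1.0, 0.0]]   # J^{-1} = -J = J^T

def adjoint(A):
    """A^star = M^{-1} A^T M for the bilinear form with M = J."""
    return matmul(matmul(J_INV, transpose(A)), J)

# Arbitrary test data (assumed for illustration only).
A = [[1.0, 2.0], [3.0, 4.0]]
x, y = [1.0, -2.0], [0.5, 3.0]

lhs = form(matvec(A, x), y, J)            # <A x, y>_J
rhs = form(x, matvec(adjoint(A), y), J)   # <x, A^star y>_J
print(lhs, rhs)
```

For $n = 2$ and $M = J$ the adjoint coincides with the classical adjugate, which is why 2×2 matrices with determinant 1 satisfy $A^\star = A^{-1}$.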
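Automorphism Groups
Having fixed the scalar product $M$, the associated automorphism group is
$$G_M := \{G \in \mathbb{K}^{n\times n} : \langle Gx, Gy \rangle_M = \langle x, y \rangle_M \ \ \forall x, y \in \mathbb{K}^n\} = \{G \in \mathbb{K}^{n\times n} : G^\star = G^{-1}\}.$$
Theorem
$G_M$ is an $\mathbb{R}$-differentiable submanifold of $\mathbb{K}^{n\times n}$, and also a $\mathbb{C}$-differentiable manifold for $\mathbb{K} = \mathbb{C}$ and complex bilinear forms $M$.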
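Membership in $G_M$ can be tested directly from the characterization $G^\star G = I$. The sketch below, which assumes $M = J$ with $n = 2$ and uses illustrative names (`star`, `in_group`) and illustrative values of `a` and `eps`, applies the test to the matrix from the motivation slides and to its structured and unstructured perturbations.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

J = [[0.0, 1.0], [-1.0, 0.0]]
J_INV = [[0.0, -1.0], [1.0, 0.0]]   # J^{-1} = -J

def star(G):
    """Adjoint with respect to the bilinear form defined by J."""
    return matmul(matmul(J_INV, transpose(G)), J)

def in_group(G, tol=1e-12):
    """True if G^star G equals the identity up to tol (G is symplectic)."""
    P = matmul(star(G), G)
    n = len(G)
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

a, eps = 1.0, 1e-3
X = [[math.exp(a), 0.0], [0.0, math.exp(-a)]]
X_structured = [[math.exp(a), eps * math.exp(a)], [0.0, math.exp(-a)]]
X_unstructured = [[math.exp(a), 0.0], [0.0, math.exp(-a) + eps * math.exp(a)]]

print(in_group(X), in_group(X_structured), in_group(X_unstructured))
```

Only the first two matrices pass: the diagonal perturbation changes the determinant and therefore leaves the group.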
Tangent spaces and Lie algebras
Having fixed the scalar product $M$, the associated Lie algebra is
$$L_M := \{F \in \mathbb{K}^{n\times n} : \langle Fx, y \rangle_M = -\langle x, Fy \rangle_M\} = \{F \in \mathbb{K}^{n\times n} : F^\star = -F\}.$$
Theorem
Let $X \in G_M$. Then
$$T_X G_M = \{E \in \mathbb{K}^{n\times n} \mid E = XF,\ F \in L_M\}.$$
Hence a basis for $T_X G_M$ is $B = (I_n \otimes X M^{-1}) D$, where $D$ is a certain matrix with orthonormal columns that does not depend on $X$.
Theorem
For the F-norm structured condition number and $X \in \mathcal{M} = G_M$:
$$\frac{\|K_f(X)B\|_2}{\|M^{-1}\|_2 \, \|X\|_2} \le \mathrm{cond}_s(f, X) \le \|K_f(X)B\|_2 \, \|X\|_2 \, \|M\|_2.$$
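A short sketch of why the tangent space has the form $E = XF$ with $F \in L_M$. This is the standard argument, added for orientation rather than taken from the slides. Take a smooth curve $X(t) \in G_M$ with $X(0) = X$ and $X'(0) = E$, and use that the map $A \mapsto A^\star$ is real-linear and satisfies $(AB)^\star = B^\star A^\star$:

```latex
\[
X(t)^\star X(t) = I
\quad\Longrightarrow\quad
0 = \frac{d}{dt}\Big|_{t=0} X(t)^\star X(t) = E^\star X + X^\star E .
\]
Writing $E = XF$ with $F := X^{-1}E$ and using $X^\star X = I$,
\[
E^\star X + X^\star E
  = F^\star X^\star X + X^\star X F
  = F^\star + F = 0
\quad\Longrightarrow\quad
F \in L_M .
\]
```

Conversely, for $F \in L_M$ the curve $t \mapsto X e^{tF}$ stays in $G_M$, because $(e^{tF})^\star = e^{tF^\star} = e^{-tF}$ for real $t$, and its derivative at $t = 0$ is $XF$.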
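Let’s be practical
Here are a few commonly used practical choices of $M$:
| $M$ | Structure of $M$ | Name of $G_M$ | Where they appear |
|---|---|---|---|
| $I$ | $I_n$ | Orthogonal, unitary | Everywhere |
| $J$ | $\begin{bmatrix} 0 & I_m \\ -I_m & 0 \end{bmatrix}$ | Symplectic | Physics, Engineering |
| $R$ | $\begin{bmatrix} & & 1 \\ & \iddots & \\ 1 & & \end{bmatrix}$ | Perplectic | Maths, Chemistry |
| $I_{p,q}$ | $I_p \oplus (-I_q)$ | Pseudo-orthogonal | Physics |
And now for a few pictures...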
$\log(A) : G_M \to L_M$
[Figure: two panels with logarithmic axes, each plotting the structured condition number ("cond struc"), the unstructured condition number ("cond"), and the "upper" and "lower" bounds against $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$. Panels: (i) $A^{-1} = -J A^T J$ with $J = \begin{bmatrix} 0 & I_n \\ -I_n & 0 \end{bmatrix}$; (ii) $A^{-1} = R A^T R$ with $R$ the matrix with ones on the antidiagonal.]
$\log(A) : G_M \to L_M$
[Figure: two panels with logarithmic axes, each plotting "cond struc", "cond", "upper" and "lower" against $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$. Panels: (i) $A^{-1} = I_{5,5} A^T I_{5,5}$; (ii) $A^{-1} = I_{9,1} A^T I_{9,1}$, where $I_{p,q} = \begin{bmatrix} I_p & 0 \\ 0 & -I_q \end{bmatrix}$.]
$\sqrt{A} : G_M \to G_M$
[Figure: two panels with logarithmic axes, each plotting "cond struc", "cond", "upper" and "lower" against $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$. Panels: (i) $A^{-1} = -J A^T J$ with $J = \begin{bmatrix} 0 & I_m \\ -I_m & 0 \end{bmatrix}$; (ii) $A^{-1} = R A^T R$ with $R$ the matrix with ones on the antidiagonal.]
$\sqrt{A} : G_M \to G_M$
[Figure: two panels with logarithmic axes, each plotting "cond struc", "cond", "upper" and "lower" against $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$. Panels: (i) $A^{-1} = I_{5,5} A^T I_{5,5}$; (ii) $A^{-1} = I_{9,1} A^T I_{9,1}$, where $I_{p,q} = \begin{bmatrix} I_p & 0 \\ 0 & -I_q \end{bmatrix}$.]
Overview
Framework to define and compute structured condition numbers of maps between matrix manifolds.
The condition number is expensive to compute, but the lower bound is cheap. The lower bound is often a very accurate approximation! We can also cheaply compute “rough” upper bounds (that usually work).
Applying our tools to practical choices of $f$ and $\mathcal{M}$: evidence that often s.c.n. $\ll$ u.c.n.
Quoth the mufloun: I want more structure-preserving algorithms!