The structured condition number of a differentiable function between matrix manifolds

Vanni Noferini (University of Essex)
Joint work with B. Arslan (TU Bursa) and F. Tisseur (U Manchester)
16/2/2017, 2gALN, Como (mufloun land)

Vanni Noferini    Structured conditioning of matrix maps    1 / 22

Motivation

Let $a > 0$. Let us build a symplectic matrix as follows:

\[ X = \begin{bmatrix} e^{a} & 0 \\ 0 & e^{-a} \end{bmatrix} \;\Rightarrow\; \ln X = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix}. \]

(Its matrix logarithm must be Hamiltonian.)

A tiny (norm $= \epsilon \|X\|$) perturbation in $X$...

\[ X = \begin{bmatrix} e^{a} & 0 \\ 0 & \epsilon e^{a} + e^{-a} \end{bmatrix} \;\Rightarrow\; \ln X = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix} + \epsilon \begin{bmatrix} 0 & 0 \\ 0 & e^{2a} \end{bmatrix} + o(\epsilon). \]

It cannot be worse than this, so the relative conditioning is $e^{2a}/a$.

However...

If we enforce that the symplectic structure is preserved, the worst that we can do is much milder!

\[ X = \begin{bmatrix} e^{a} & \epsilon e^{a} \\ 0 & e^{-a} \end{bmatrix} \;\Rightarrow\; \ln X = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix} + \epsilon \begin{bmatrix} 0 & \dfrac{a e^{a}}{\sinh(a)} \\ 0 & 0 \end{bmatrix} + o(\epsilon). \]

The relative conditioning is

\[ \frac{e^{a}}{\sinh(a)} < \frac{e^{2a}}{a} \quad \text{(for moderate or large values of } a\text{!)} \]
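The two amplification factors of the motivating example can be checked numerically. The sketch below is only an illustration: the function names, the value $a = 3$ and the step $\epsilon = 10^{-9}$ are assumptions made here, not part of the talk. It relies on the closed-form logarithm of a $2 \times 2$ upper triangular matrix with distinct positive diagonal entries.

```python
import math


def log_upper_triangular_2x2(t11, t12, t22):
    """Principal logarithm of [[t11, t12], [0, t22]] for t11, t22 > 0, t11 != t22.

    Returns the entries (l11, l12, l22) of the upper triangular result.
    """
    l11, l22 = math.log(t11), math.log(t22)
    # Off-diagonal entry: t12 times the divided difference of log at (t11, t22).
    l12 = t12 * (l11 - l22) / (t11 - t22)
    return l11, l12, l22


def amplification(a, eps):
    """Relative change of ln X divided by relative change of X (2-norms).

    Returns (unstructured, structured) for X = diag(e^a, e^-a), a > 0.
    """
    norm_x = math.exp(a)      # ||X||_2
    norm_log = a              # ||ln X||_2
    rel_in = eps * math.exp(a) / norm_x   # both perturbations have norm eps * ||X||

    # Unstructured: add eps * e^a to the (2,2) entry (the determinant changes).
    _, _, l22 = log_upper_triangular_2x2(
        math.exp(a), 0.0, math.exp(-a) + eps * math.exp(a))
    unstructured = (abs(l22 + a) / norm_log) / rel_in

    # Structured: add eps * e^a to the (1,2) entry (the determinant stays 1).
    _, l12, _ = log_upper_triangular_2x2(
        math.exp(a), eps * math.exp(a), math.exp(-a))
    structured = (abs(l12) / norm_log) / rel_in
    return unstructured, structured
```

Calling `amplification(3.0, 1e-9)` and comparing the two returned values with `math.exp(6.0) / 3.0` and `math.exp(3.0) / math.sinh(3.0)` reproduces, up to first order in $\epsilon$, the factors $e^{2a}/a$ and $e^{a}/\sinh(a)$ quoted above.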
Condition number and derivatives

Let $\mathbb{K} \in \{\mathbb{R}, \mathbb{C}\}$ and let $f : \mathbb{K}^{n \times n} \to \mathbb{K}^{n \times n}$ be $\mathbb{K}$-Fréchet differentiable. Define

\[ \operatorname{cond}(f, X) = \lim_{\epsilon \to 0} \; \sup_{\|Y - X\| \le \epsilon} \frac{\|f(Y) - f(X)\|}{\epsilon}. \]

Let $K_f(X)$ be the Fréchet derivative of $f$ at $X$. Then $\operatorname{cond}(f, X) = \|K_f(X)\|$.

Restricting the perturbation

Suppose now $X \in \mathcal{M}$, a submanifold of $\mathbb{K}^{n \times n}$. Backward error analysis of a structure-preserving algorithm enforces a perturbation $E := Y - X$ such that the perturbed matrix stays on the manifold: $Y \in \mathcal{M}$.

Structured condition number:

\[ \operatorname{cond}_s(f, X) = \lim_{\epsilon \to 0} \; \sup_{\|Y - X\| \le \epsilon,\; Y \in \mathcal{M}} \frac{\|f(Y) - f(X)\|}{\epsilon}. \]

[Figure: $E \Longrightarrow S_E$, unstructured versus structured perturbations.]

Structured perturbations form a subset of all perturbations. Hence, by definition of supremum, $\operatorname{cond}_s(f, X) \le \operatorname{cond}(f, X)$.
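The identity $\operatorname{cond}(f, X) = \|K_f(X)\|$ can be checked on a simple map. The sketch below assumes $f(X) = X^2$ (chosen here only because its Fréchet derivative $E \mapsto XE + EX$ is explicit), the Frobenius norm, and a column-major $\operatorname{vec}$; all function names are hypothetical. It builds the Kronecker form of the derivative and compares its largest singular value with a difference quotient along the worst direction.

```python
import numpy as np


def f(X):
    """Illustrative map: the matrix square."""
    return X @ X


def kron_frechet_square(X):
    """Kronecker form of E -> XE + EX, acting on column-major vec(E)."""
    n = X.shape[0]
    I = np.eye(n)
    return np.kron(I, X) + np.kron(X.T, I)


def worst_direction(X):
    """Return (cond, E) with cond = ||K_f(X)||_2 and E a unit-norm maximizer."""
    n = X.shape[0]
    _, s, Vt = np.linalg.svd(kron_frechet_square(X))
    # The top right singular vector, reshaped column-major, is the worst E.
    return s[0], Vt[0].reshape((n, n), order="F")


def difference_quotient(X, E):
    """||f(X + E) - f(X)||_F / ||E||_F for a small perturbation E."""
    return np.linalg.norm(f(X + E) - f(X), "fro") / np.linalg.norm(E, "fro")
```

For a small multiple of the direction returned by `worst_direction`, the difference quotient agrees with the largest singular value up to terms of order $\|E\|$, while no other small perturbation produces a noticeably larger quotient.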
Analyzing the condition number

Two cases:

1. If $\mathcal{M}$ is a vector subspace (a flat manifold), one just needs to project the Fréchet derivative appropriately [Davies '04].
2. If not, to do the same we need a few tools from differential geometry.

Notation:

* $\mathbb{F}$ (base field), $\mathbb{K}$ (ambient field) $\in \{\mathbb{R}, \mathbb{C}\}$ ($\mathbb{K} = \mathbb{R} \Rightarrow \mathbb{F} = \mathbb{R}$);
* $\mathcal{M}$, an $\mathbb{F}$-differentiable submanifold of $\mathbb{K}^{n \times n}$;
* $f : \mathcal{M} \to \mathcal{N} \subseteq \mathbb{K}^{n \times n}$, an $\mathbb{F}$-differentiable function.

The differential

It plays the role of the derivative for functions between manifolds: it is the best (local) $\mathbb{F}$-linear approximation to $f$, and it maps the tangent space of the domain to the tangent space of the image.

[Figure: the differential $df(A)$ maps the tangent space $T_A(\mathcal{M})$ of $\mathcal{M}$ at $A$ to the tangent space $T_{f(A)}(\mathcal{N})$ of $\mathcal{N}$ at $f(A)$.]

Theoretical tool: differential geometry

Theorem. Let $\mathcal{M} \subseteq \mathbb{K}^{n \times n}$ be an $\mathbb{F}$-differentiable manifold and let $f : \mathcal{M} \to \mathcal{N}$ be $\mathbb{F}$-differentiable. Then $\operatorname{cond}_s(f, X) = \|df(X)\|$.
How to choose $\mathbb{F}$?

In this theory $\mathbb{K}$ is the ambient field (the field in which the elements of your matrices lie). How does one pick the base field $\mathbb{F}$?

If $\mathbb{K} = \mathbb{C}$, the manifold $\mathcal{M}$ is complex differentiable, and the function $f$ is complex differentiable, then $\mathbb{F} = \mathbb{C}$. Otherwise, if at least one object is only real differentiable, $\mathbb{F} = \mathbb{R}$.
From theory to computation

The theory is simple and elegant, but computing the (un)structured condition number is an issue, since the cost of a naive algorithm is $O(n^5)$ for the unstructured condition number (u.c.n.) or $O(n^6)$ for the structured condition number (s.c.n.). Therefore, we now need to get our hands a bit dirty...

$T_X\mathcal{M}$ is an $\mathbb{F}$-linear subspace of $\mathbb{K}^{n \times n}$ of dimension, say, $p$. Therefore it admits a basis that we represent as a matrix $B \in \mathbb{K}^{n^2 \times p}$. $B$ depends on $X$ unless the manifold is flat!

Hence one can show that any (local) projection does not suffice... but it almost does. For the Frobenius-norm condition number:

\[ \|K_f(X)B\|_2 \, \|B\|_2^{-1} \le \operatorname{cond}_s(f, X) \le \|K_f(X)B\|_2 \, \|B^{+}\|_2 . \]

Note that if $B^*B = I_p$ then $\|B\|_2 = \|B^{+}\|_2 = 1$: an orthogonal projection does suffice!

Computational tool: the power method

This algorithm applies the power method to estimate a lower bound $\gamma \le \|K_f(A)B\|_2$. Here $L_f(A, E)$ denotes the Fréchet derivative of $f$ at $A$ applied to the direction $E$, so that $\operatorname{vec}(L_f(A, E)) = K_f(A)\operatorname{vec}(E)$.

    Start with $z_0 \in \mathbb{F}^p$.
    for $k = 0 : \infty$
        $\operatorname{vec}(E_k) = B z_k$
        $W_{k+1} = L_f(A, E_k)$
        $Y_{k+1} = L_f^{\star}(A, W_{k+1})$
        $z_{k+1} = B^{*} \operatorname{vec}(Y_{k+1})$
        $\gamma_{k+1} = \|z_{k+1}\|_2 / \|W_{k+1}\|_F$
        if converged, $\gamma = \gamma_{k+1}$, quit, end
    end

The adjoint of the Fréchet derivative is

\[ L_f^{\star}(A, E) = \begin{cases} L_f(A^T, E), & \mathbb{K} = \mathbb{R}, \\ L_{\bar{f}}(A^{*}, E), & \mathbb{K} = \mathbb{C}. \end{cases} \]
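The power iteration for the lower bound $\gamma \le \|K_f(A)B\|_2$ can be sketched in a few lines of NumPy. This is a minimal illustration under several assumptions that are not in the talk: a fixed number of iterations replaces the convergence test, the map is $f(X) = X^2$ (so that $L_f(A, E) = AE + EA$ is explicit), the manifold is the real orthogonal group with tangent space $\{AS : S \text{ skew-symmetric}\}$, and all function names are hypothetical. The exact norm is formed only to check the estimate; in practice one would avoid building $K_f(A)$.

```python
import numpy as np


def skew_basis(n):
    """Columns form an orthonormal basis of {vec(S) : S real skew-symmetric}."""
    cols = []
    for i in range(n):
        for j in range(i + 1, n):
            S = np.zeros((n, n))
            S[i, j], S[j, i] = 1.0, -1.0
            cols.append(S.flatten(order="F") / np.sqrt(2.0))
    return np.column_stack(cols)


def power_method_lower_bound(L, L_adj, A, B, iters=50, seed=0):
    """Power iteration on (K_f(A) B)^* (K_f(A) B) using only products with
    the Frechet derivative L and its adjoint L_adj.

    Returns gamma, a lower bound on ||K_f(A) B||_2.
    """
    n = A.shape[0]
    z = np.random.default_rng(seed).standard_normal(B.shape[1])
    gamma = 0.0
    for _ in range(iters):
        z = z / np.linalg.norm(z)                 # rescaling leaves gamma unchanged
        E = (B @ z).reshape((n, n), order="F")    # vec(E_k) = B z_k
        W = L(A, E)                               # W_{k+1} = L_f(A, E_k)
        w_norm = np.linalg.norm(W, "fro")
        if w_norm == 0.0:
            return gamma
        Y = L_adj(A, W)                           # Y_{k+1} = L_f^*(A, W_{k+1})
        z = B.conj().T @ Y.flatten(order="F")     # z_{k+1} = B^* vec(Y_{k+1})
        gamma = np.linalg.norm(z) / w_norm
    return gamma


def demo(n=4, seed=1):
    """Compare the estimate with the exact norm for f(X) = X @ X."""
    rng = np.random.default_rng(seed)
    A, _ = np.linalg.qr(rng.standard_normal((n, n)))     # random orthogonal matrix
    L = lambda X, E: X @ E + E @ X                       # derivative of X -> X^2
    L_adj = lambda X, W: L(X.T, W)                       # real case: L_f(X^T, W)
    B = np.kron(np.eye(n), A) @ skew_basis(n)            # vec(A S) = (I kron A) vec(S)
    K = np.kron(np.eye(n), A) + np.kron(A.T, np.eye(n))  # vec(AE + EA) = K vec(E)
    exact = np.linalg.norm(K @ B, 2)
    return power_method_lower_bound(L, L_adj, A, B), exact
```

`demo()` returns the pair (estimate, exact value). The estimate never exceeds the exact norm, because each $\gamma_{k+1}$ is a quotient $\|(K_f(A)B)^{*} w\|_2 / \|w\|_2$ for some vector $w$.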
So what can one do?

1. Typically, $p = O(n^2)$.
2. Estimating $\|K_f(X)B\|$ from below via a power method costs $O(n^2 p)$ in general, but for many special cases the structure in $B$ can be exploited to reduce this cost to $O(np)$.
3. In most applications, a basis $B$ either costs $O(n^3)$ or comes for free (no computations needed). Similarly, $\|B\|$ and $\|B^{+}\|$ are typically either known from theoretical observations or computable in $O(n^3)$.
4. Orthonormalizing $B$ costs $O(n^2 p^2)$, except for very exceptional manifolds (for which $B$ is orthonormal from the beginning).
⇒ Typically, computing a rigorous lower bound via the power method costs $O(n^3)$, while computing the exact structured condition number costs $O(n^6)$.

Structured Matrices

A scalar product $\langle \cdot, \cdot \rangle_M$ is a nondegenerate ($M$ nonsingular) bilinear or sesquilinear form on $\mathbb{K}^n$:

\[ \langle x, y \rangle_M = \begin{cases} x^T M y, & \text{real or complex bilinear forms}, \\ x^{*} M y, & \text{sesquilinear forms}. \end{cases} \]

Any matrix $A$ has a unique adjoint $A^{\star}$ defined by

\[ \langle Ax, y \rangle_M = \langle x, A^{\star} y \rangle_M \quad \forall x, y \in \mathbb{K}^n. \]

The formula for $A^{\star}$ is given by

\[ A^{\star} = \begin{cases} M^{-1} A^T M, & \text{real or complex bilinear forms}, \\ M^{-1} A^{*} M, & \text{sesquilinear forms}. \end{cases} \]

Automorphism Groups

Having fixed the scalar product $M$, the associated automorphism group is

\[ \mathbb{G}_M := \{ G \in \mathbb{K}^{n \times n} : \langle Gx, Gy \rangle_M = \langle x, y \rangle_M \ \ \forall x, y \} = \{ G \in \mathbb{K}^{n \times n} : G^{\star} = G^{-1} \}. \]

Theorem. $\mathbb{G}_M$ is an $\mathbb{R}$-differentiable submanifold of $\mathbb{K}^{n \times n}$, and also a $\mathbb{C}$-differentiable manifold for $\mathbb{K} = \mathbb{C}$ and complex bilinear forms $M$.

Tangent spaces and Lie algebras

Having fixed the scalar product $M$, the associated Lie algebra is

\[ \mathbb{L}_M := \{ F \in \mathbb{K}^{n \times n} : \langle Fx, y \rangle_M = -\langle x, Fy \rangle_M \ \ \forall x, y \} = \{ F \in \mathbb{K}^{n \times n} : F^{\star} = -F \}. \]

Theorem. Let $X \in \mathbb{G}_M$. Then

\[ T_X \mathbb{G}_M = \{ E \in \mathbb{K}^{n \times n} : E = XF, \ F \in \mathbb{L}_M \}. \]

Hence a basis for $T_X \mathbb{G}_M$ is $B = (I_n \otimes X M^{-1}) D$, where $D$ is a certain matrix with orthonormal columns that does not depend on $X$.
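The basis formula $B = (I_n \otimes X M^{-1}) D$ can be tried on the $2 \times 2$ symplectic example $X = \operatorname{diag}(e^{a}, e^{-a})$ with $f = \log$. The sketch below rests on assumptions made here for illustration: $M = J$ is real and skew-symmetric, so $F \in \mathbb{L}_J$ exactly when $JF$ is symmetric and $D$ can be taken as an orthonormal basis of vectorized symmetric matrices; the Fréchet derivative of the logarithm at a diagonal matrix is written entrywise through divided differences; the Frobenius norm is used for the condition numbers. All function names are hypothetical.

```python
import numpy as np


def adjoint(A, M):
    """Adjoint for the real bilinear form <x, y>_M = x^T M y: M^{-1} A^T M."""
    return np.linalg.solve(M, A.T @ M)


def symmetric_basis(n):
    """Columns form an orthonormal basis of {vec(S) : S real symmetric}."""
    cols = []
    for i in range(n):
        for j in range(i, n):
            S = np.zeros((n, n))
            if i == j:
                S[i, i] = 1.0
            else:
                S[i, j] = S[j, i] = 1.0 / np.sqrt(2.0)
            cols.append(S.flatten(order="F"))
    return np.column_stack(cols)


def log_frechet_kron_diag(d):
    """Kronecker form of the Frechet derivative of log at X = diag(d), d > 0.

    Entrywise, L(X, E)_ij = E_ij * (log d_i - log d_j) / (d_i - d_j),
    with the limit value 1 / d_i when d_i == d_j.
    """
    d = np.asarray(d, dtype=float)
    n = d.size
    G = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            if d[i] == d[j]:
                G[i, j] = 1.0 / d[i]
            else:
                G[i, j] = (np.log(d[i]) - np.log(d[j])) / (d[i] - d[j])
    return np.diag(G.flatten(order="F"))


def symplectic_example(a=3.0):
    """Absolute condition numbers of log at X = diag(e^a, e^-a), M = J."""
    J = np.array([[0.0, 1.0], [-1.0, 0.0]])
    X = np.diag([np.exp(a), np.exp(-a)])
    K = log_frechet_kron_diag(np.diag(X))
    # Tangent vectors are E = X J^{-1} S with S symmetric: B = (I kron X J^{-1}) D.
    B = np.kron(np.eye(2), X @ np.linalg.inv(J)) @ symmetric_basis(2)
    Q, _ = np.linalg.qr(B)                  # orthonormal basis of the same space
    cond = np.linalg.norm(K, 2)             # unstructured
    cond_s = np.linalg.norm(K @ Q, 2)       # structured
    lower = np.linalg.norm(K @ B, 2) / np.linalg.norm(B, 2)
    upper = np.linalg.norm(K @ B, 2) * np.linalg.norm(np.linalg.pinv(B), 2)
    return X, J, cond, cond_s, lower, upper
```

For this example the unstructured value is $e^{a}$ and the structured one is $a/\sinh(a)$. Multiplying both by $\|X\|_2 / \|\ln X\|_2 = e^{a}/a$ gives the relative factors $e^{2a}/a$ and $e^{a}/\sinh(a)$. The pair `lower`, `upper` brackets the structured value using only $\|B\|_2$ and $\|B^{+}\|_2$, without orthonormalizing $B$.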
Theorem. For the Frobenius-norm structured condition number and $X \in \mathcal{M} = \mathbb{G}_M$:

\[ \frac{\|K_f(X)B\|_2}{\|M^{-1}\|_2 \, \|X\|_2} \le \operatorname{cond}_s(f, X) \le \|K_f(X)B\|_2 \, \|X\|_2 \, \|M\|_2 . \]

Let's be practical

Here are a few commonly used practical choices of $M$:

M         | Structure of M                                      | Name of G_M         | Where they appear
----------|-----------------------------------------------------|---------------------|---------------------
$I$       | $I_n$                                               | Orthogonal, unitary | Everywhere
$J$       | $\begin{bmatrix} 0 & I_m \\ -I_m & 0 \end{bmatrix}$ | Symplectic          | Physics, Engineering
$R$       | antidiagonal matrix of ones                         | Perplectic          | Maths, Chemistry
$I_{p,q}$ | $I_p \oplus (-I_q)$                                 | Pseudo-orthogonal   | Physics

And now for a few pictures...

[Figure: $\log(A) : \mathbb{G}_M \to \mathbb{L}_M$. Two panels with logarithmic axes, each plotting the curves "cond struc", "cond", "upper" and "lower" against $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$. One panel is the symplectic case, $A^{-1} = -J A^T J$; the other is the perplectic case, $A^{-1} = R A^T R$.]
[Figure: $\log(A) : \mathbb{G}_M \to \mathbb{L}_M$. Two panels with logarithmic axes, each plotting the curves "cond struc", "cond", "upper" and "lower" against $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$. The panels are the pseudo-orthogonal cases $A^{-1} = I_{5,5} A^T I_{5,5}$ and $A^{-1} = I_{9,1} A^T I_{9,1}$, where $I_{p,q} = \begin{bmatrix} I_p & 0 \\ 0 & -I_q \end{bmatrix}$.]

[Figure: $\sqrt{A} : \mathbb{G}_M \to \mathbb{G}_M$. Two panels with logarithmic axes, each plotting the curves "cond struc", "cond", "upper" and "lower" against $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$. One panel is the symplectic case, $A^{-1} = -J A^T J$ with $J = \begin{bmatrix} 0 & I_m \\ -I_m & 0 \end{bmatrix}$; the other is the perplectic case, $A^{-1} = R A^T R$ with $R$ the antidiagonal matrix of ones.]

[Figure: $\sqrt{A} : \mathbb{G}_M \to \mathbb{G}_M$. Two panels with logarithmic axes, each plotting the curves "cond struc", "cond", "upper" and "lower" against $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$. The panels are the pseudo-orthogonal cases $A^{-1} = I_{5,5} A^T I_{5,5}$ and $A^{-1} = I_{9,1} A^T I_{9,1}$, where $I_{p,q} = \begin{bmatrix} I_p & 0 \\ 0 & -I_q \end{bmatrix}$.]
Overview

* Framework to define and compute structured condition numbers of maps between matrix manifolds.
* The condition number is expensive, but the lower bound is cheap. The lower bound is often a very accurate approximation! We can also cheaply compute "rough" upper bounds (that usually work).
* Applying our tools to practical choices of $f$ and $\mathcal{M}$: evidence that often s.c.n. $\ll$ u.c.n.
* Quoth the mufloun: I want more structure-preserving algorithms!