SCHAUM'S OUTLINE OF THEORY AND PROBLEMS OF

STATE SPACE and LINEAR SYSTEMS

BY DONALD M. WIBERG, Ph.D.
Associate Professor of Engineering
University of California, Los Angeles

SCHAUM'S OUTLINE SERIES
McGRAW-HILL BOOK COMPANY
New York, St. Louis, San Francisco, Düsseldorf, Johannesburg, Kuala Lumpur, London, Mexico, Montreal, New Delhi, Panama, Rio de Janeiro, Singapore, Sydney, and Toronto

Copyright © 1971 by McGraw-Hill, Inc. All Rights Reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

07-070096-6

Preface

The importance of state space analysis is recognized in fields where the time behavior of any physical process is of interest. The concept of state is comparatively recent, but the methods used have been known to mathematicians for many years. As engineering, physics, medicine, economics, and business become more cognizant of the insight that the state space approach offers, its popularity increases.

This book was written not only for upper division and graduate students, but for practicing professionals as well. It is an attempt to bridge the gap between theory and practical use of the state space approach to the analysis and design of dynamical systems. The book is meant to encourage the use of state space as a tool for analysis and design, in proper relation with other such tools. The state space approach is more general than the "classical" Laplace and Fourier transform theory. Consequently, state space theory is applicable to all systems that can be analyzed by integral transforms in time, and is applicable to many systems for which transform theory breaks down. Furthermore, state space theory gives a somewhat different insight into the time behavior of linear systems, and is worth studying for this aspect alone.

In particular, the state space approach is useful because: (1) linear systems with time-varying parameters can be analyzed in essentially the same manner as time-invariant linear systems, (2) problems formulated by state space methods can easily be programmed on a computer, (3) high-order linear systems can be analyzed, (4) multiple input-multiple output systems can be treated almost as easily as single input-single output linear systems, and (5) state space theory is the foundation for further studies in such areas as nonlinear systems, stochastic systems, and optimal control. These are five of the most important advantages obtained from the generalization and rigorousness that state space brings to the classical transform theory.

Because state space theory describes the time behavior of physical systems in a mathematical manner, the reader is assumed to have some knowledge of differential equations and of Laplace transform theory. Some classical control theory is needed for Chapter 8 only. No knowledge of matrices or complex variables is prerequisite.

The book may appear to contain too many theorems to be comprehensible and/or useful to the nonmathematician. But the theorems have been stated and proven in a manner suited to show the range of application of the ideas and their logical interdependence. Space that might otherwise have been devoted to solved problems has been used instead to present the physical motivation of the proofs.
Consequently I give my strongest recommendation that the reader seek to understand the physical ideas underlying the proofs rather than merely memorize the theorems. Since the emphasis is on applications, the book might not be rigorous enough for the pure mathematician, but I feel that enough information has been provided so that he can tidy up the statements and proofs himself.

The book has a number of novel features. Chapter 1 gives the fundamental ideas of state from an informal, physical viewpoint, and also gives a correct statement of linearity. Chapter 2 shows how to write transfer functions and ordinary differential equations in matrix notation, thus motivating the material on matrices to follow. Chapter 3 develops the important concepts of range space and null space in detail, for later application. Also exterior products (Grassmann algebra) are developed, which give insight into determinants, and which considerably shorten a number of later proofs. Chapter 4 shows how to actually solve for the Jordan form, rather than just proving its existence. Also a detailed treatment of pseudoinverses is given. Chapter 5 gives techniques for computation of transition matrices for high-order time-invariant systems, and contrasts this with a detailed development of transition matrices for time-varying systems. Chapter 6 starts with giving physical insight into controllability and observability of simple systems, and progresses to the point of giving algebraic criteria for time-varying systems. Chapter 7 shows how to reduce a system to its essential parameters. Chapter 8 is perhaps the most novel. Techniques from classical control theory are extended to time-varying, multiple input-multiple output linear systems using the state space formulation. This gives practical methods for control system design, as well as analysis. Furthermore, the pole placement and observer theory developed can serve as an introduction to linear optimal control and to Kalman filtering. Chapter 9 considers asymptotic stability of linear systems, and the usual restriction of uniformity is dispensed with. Chapter 10 gives motivation for the quadratic optimal control problem, with special emphasis on the practical time-invariant problem and its associated computational techniques. Since Chapters 6, 8, and 9 precede, relations with controllability, pole placement, and stability properties can be explored.

The book has come from a set of notes developed for engineering course 122B at UCLA, originally dating from 1966. It was given to the publisher in June 1969. Unfortunately, the publication delay has dated some of the material. Fortunately, it also enabled a number of errors to be weeded out.

Now I would like to apologize because I have not included references, historical development, and credit to the originators of each idea. This was simply impossible to do because of the outline nature of the book.

I would like to express my appreciation to those who helped me write this book. Chapter 1 was written with a great deal of help from A. V. Balakrishnan. L. M. Silverman helped with Chapter 7 and P. K. C. Wang with Chapter 9. Interspersed throughout the book is material from a course given by R. E. Kalman during the spring of 1961 at Caltech. J. J. DiStefano, R. C. Erdmann, N. Levan, and K. Yao have used the notes as a text in UCLA course 122B and have given me suggestions. I have had discussions with R. E. Mortensen, M. M. Sholar, A. R. Stubberud, D. R. Vaughan, and many other colleagues.
Improvements in the final draft were made through the help of the control group under the direction of J. Ackermann at the DFVLR in Oberpfaffenhofen, West Germany, especially by G. Grübel and R. Sharma. Also, I want to thank those UCLA students, too numerous to mention, that have served as guinea pigs and have caught many errors of mine. Ruthie Alperin was very efficient as usual while typing the text. David Beckwith, Henry Hayden, and Daniel Schaum helped publish the book in its present form. Finally, I want to express my appreciation of my wife Merideth and my children Erik and Kristin for their understanding during the long hours of involvement with the book.

DONALD M. WIBERG
University of California, Los Angeles
June 1971

CONTENTS

Chapter 1  MEANING OF STATE ... 1
Introduction to State. State of an Abstract Object. Trajectories in State Space. Dynamical Systems. Linearity and Time Invariance. Systems Considered. Linearization of Nonlinear Systems.

Chapter 2  METHODS FOR OBTAINING THE STATE EQUATIONS ... 16
Flow Diagrams. Properties of Flow Diagrams. Canonical Flow Diagrams for Time-Invariant Systems. Jordan Flow Diagram. Time-Varying Systems. General State Equations.

Chapter 3  ELEMENTARY MATRIX THEORY ... 38
Introduction. Basic Definitions. Basic Operations. Special Matrices. Determinants and Inverse Matrices. Vector Spaces. Bases. Solution of Sets of Linear Algebraic Equations. Generalization of a Vector. Distance in a Vector Space. Reciprocal Basis. Matrix Representation of a Linear Operator. Exterior Products.

Chapter 4  MATRIX ANALYSIS
Eigenvalues and Eigenvectors. Introduction to the Similarity Transformation. Properties of Similarity Transformations. Jordan Form. Quadratic Forms. Matrix Norms. Functions of a Matrix. Pseudoinverse.

Chapter 5  SOLUTIONS TO THE LINEAR STATE EQUATION ... 99
Transition Matrix. Calculation of the Transition Matrix for Time-Invariant Systems. Transition Matrix for Time-Varying Differential Systems. Closed Forms for Special Cases of Time-Varying Linear Differential Systems. Periodically-Varying Linear Differential Systems. Solution of the Linear State Equations with Input. Transition Matrix for Time-Varying Difference Equations. Impulse Response Matrices. The Adjoint System.

Chapter 6  CONTROLLABILITY AND OBSERVABILITY
Introduction to Controllability and Observability. Controllability in Time-Invariant Linear Systems. Observability in Time-Invariant Linear Systems. Direct Criteria from A, B, and C. Controllability and Observability of Time-Varying Systems. Duality.

Chapter 7  CANONICAL FORMS OF THE STATE EQUATION ... 147
Introduction to Canonical Forms. Jordan Form for Time-Invariant Systems. Real Jordan Form. Controllable and Observable Forms for Time-Varying Systems. Canonical Forms for Time-Varying Systems.

Chapter 8  RELATIONS WITH CLASSICAL TECHNIQUES ... 164
Introduction. Matrix Flow Diagrams. Steady State Errors. Root Locus. Nyquist Diagrams. State Feedback Pole Placement. Observer Systems. Algebraic Separation. Sensitivity, Noise Rejection, and Nonlinear Effects.

Chapter 9  STABILITY OF LINEAR SYSTEMS ... 191
Introduction. Definitions of Stability for Zero-Input Linear Systems. Definitions of Stability for Nonzero Inputs. Liapunov Techniques. Liapunov Functions for Linear Systems. Equations for the Construction of Liapunov Functions.

Chapter 10  INTRODUCTION TO OPTIMAL CONTROL ... 208
Introduction. The Criterion Functional. Derivation of the Optimal Control Law.
The Matrix Riccati Equation. Time-Invariant Optimal Systems. Output Feedback. The Servomechanism Problem. Conclusion.

INDEX

Chapter 1

Meaning of State

1.1 INTRODUCTION TO STATE

To introduce the subject, let's take an informal, physical approach to the idea of state. (An exact mathematical approach is taken in more advanced texts.) First, we make a distinction between physical and abstract objects. A physical object is an object perceived by our senses whose time behavior we wish to describe, and its abstraction is the mathematical relationships that give some expression for its behavior. This distinction is made because, in making an abstraction, it is possible to lose some of the relationships that make the abstraction behave similarly to the physical object. Also, not all mathematical relationships can be realized by a physical object.

The concept of state relates to those physical objects whose behavior can change with time, and to which a stimulus can be applied and the response observed. To predict the future behavior of the physical object under any input, a series of experiments could be performed by applying stimuli, or inputs, and observing the responses, or outputs. From these experiments we could obtain a listing of these inputs and their corresponding observed outputs, i.e. a list of input-output pairs. An input-output pair is an ordered pair of real time functions defined for all t ≥ t₀, where t₀ is the time the input is first applied. Of course segments of these input time functions must be consistent and we must agree upon what kind of functions to consider, but in this introduction we shall not go into these mathematical details.

Definition 1.1: The state of a physical object is any property of the object which relates input to output such that knowledge of the input time function for t ≥ t₀ and the state at time t = t₀ completely determines a unique output for t ≥ t₀.

Example 1.1.
Consider a black box, Fig. 1-1, containing a switch to one of two voltage dividers. Intuitively, the state of the box is the position of the switch, which agrees with Definition 1.1. This can be ascertained by the experiment of applying a voltage V to the input terminal. Natural laws (Ohm's law) dictate that if the switch is in the lower position A, the output voltage is V/2, and if the switch is in the upper position B, the output voltage is V/4. Then the state A determines the input-output pair to be (V, V/2), and the state B corresponds to (V, V/4).

Fig. 1-1
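The switch-state idea of Example 1.1 can be mimicked numerically. The following sketch (hypothetical code, not from the text; the function and argument names are illustrative assumptions) shows how the state, together with the input, determines a unique output:

```python
def black_box(state, V):
    """Output voltage of the Fig. 1-1 black box.

    state -- 'A' (lower switch position) or 'B' (upper switch position)
    V     -- the input voltage applied at the terminal
    """
    if state == 'A':
        return V / 2.0   # lower divider: input-output pair (V, V/2)
    if state == 'B':
        return V / 4.0   # upper divider: input-output pair (V, V/4)
    raise ValueError("unknown state")

print(black_box('A', 12.0))   # 6.0
print(black_box('B', 12.0))   # 3.0
```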
1.2 STATE OF AN ABSTRACT OBJECT

The basic ideas contained in the above example can be extended to many physical objects and to the abstract relationships describing their time behavior. This will be done after abstracting the properties of physical objects such as the black box. For example, the color of the box has no effect on the experiment of applying a voltage. More subtly, the value of the resistance R is immaterial if it is greater than zero. All that is needed is a listing of every input-output pair over all segments of time t ≥ t₀, and the corresponding states at time t₀.

Definition 1.2: An abstract object is the totality of input-output pairs that describe the behavior of a physical object.

Instead of a specific list of input time functions and their corresponding output time functions, the abstract object is usually characterized as a class of all time functions that obey a set of mathematical equations. This is in accord with the scientific method of hypothesizing an equation and then checking to see that the physical object behaves in a manner similar to that predicted by the equation. Hence we can often summarize the abstract object by using the mathematical equations representing physical laws.

The mathematical relations which summarize an abstract object must be oriented, in that m of the time functions that obey the relations must be designated inputs (denoted by the vector u, having m elements uᵢ) and k of the time functions must be designated outputs (denoted by the vector y, having k elements yᵢ). This need has nothing to do with causality, in that the outputs are not "caused" by the inputs.

Definition 1.3: The state of an abstract object is a collection of numbers which together with the input u(t) for all t ≥ t₀ uniquely determines the output y(t) for all t ≥ t₀.

In essence the state parametrizes the listing of input-output pairs. The state is the answer to the question "Given u(t) for t ≥ t₀ and the mathematical relationships of the abstract object, what additional information is needed to completely specify y(t) for t ≥ t₀?"

Example 1.2.
A physical object is the resistor-capacitor network shown in Fig. 1-2. An experiment is performed by applying a voltage u(t), the input, and measuring a voltage y(t), the output. Note that another experiment could be to apply y(t) and measure u(t), so that these choices are determined by the experiment.

Fig. 1-2

The list of all input-output pairs for this example is the class of all functions u(t), y(t) which satisfy the mathematical relationship

    RC dy/dt + y = u        (1.1)

This summarizes the abstract object. The solution of (1.1) is

    y(t) = y(t₀) e^{−(t−t₀)/RC} + (1/RC) ∫_{t₀}^{t} e^{−(t−τ)/RC} u(τ) dτ        (1.2)

This relationship explicitly exhibits the list of input-output pairs. For any input time function u(τ) for τ ≥ t₀, the output time function y(t) is uniquely determined by y(t₀), a number at time t₀. Note the distinction between time functions and numbers. Thus the set of numbers y(t₀) parametrizes all input-output pairs, and therefore is the state of the abstract object described by (1.1). Correspondingly, a choice of state of the RC network is the output voltage at time t₀.

Example 1.3.
The physical object shown in Fig. 1-3 is two RC networks in series. The pertinent equation is

    R²C² d²y/dt² + 2.5RC dy/dt + y = u        (1.3)

Fig. 1-3

with a solution

    y(t) = [(4/3) e^{−(t−t₀)/2RC} − (1/3) e^{−2(t−t₀)/RC}] y(t₀)
         + (2RC/3) [e^{−(t−t₀)/2RC} − e^{−2(t−t₀)/RC}] (dy/dt)(t₀)
         + (2/3RC) ∫_{t₀}^{t} [e^{−(t−τ)/2RC} − e^{−2(t−τ)/RC}] u(τ) dτ        (1.4)

Here the set of numbers y(t₀) and (dy/dt)(t₀) parametrizes the input-output pairs, and may be chosen as the state. Physically, the voltage and its derivative across the smaller capacitor at time t₀ correspond to the state.

Definition 1.4: A state variable, denoted by the vector x(t), is the time function whose value at any specified time is the state of the abstract object at that time.

Note this difference in going from a set of numbers to a time function. The state can be a set consisting of an infinity of numbers (e.g. Problems 1.1 and 1.2), in which case the state variable is an infinite collection of time functions. However, in most cases considered in this book, the state is a set of n numbers and correspondingly x(t) is an n-vector function of time.

Definition 1.5: The state space, denoted by Σ, is the set of all x(t).

Example 1.4.
The state variable in Example 1.2 is x(t) = y(t), whereas in Example 1.1 the state variable remains either A or B for all time.
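As a numerical illustration (not from the text) of how the single number y(t₀) serves as the state in Example 1.2, the sketch below integrates (1.1) step by step and compares the result with the closed form (1.2); the values of R, C, y(t₀) and the input are arbitrary choices:

```python
import numpy as np

R, C = 2.0, 0.5
t0, y0 = 0.0, 1.0                      # initial time and state y(t0)
dt = 1e-4
t = np.arange(t0, 3.0 + dt, dt)
u = np.sin(3.0 * t)                    # an arbitrary input time function

# Euler integration of RC dy/dt + y = u from the state y(t0)
y = y0
for k in range(len(t) - 1):
    y += dt * (u[k] - y) / (R * C)

# Closed form (1.2) evaluated at the final time
tf = t[-1]
y_closed = (y0 * np.exp(-(tf - t0) / (R * C))
            + np.trapz(np.exp(-(tf - t) / (R * C)) * u, t) / (R * C))

print(y, y_closed)                     # the two values agree closely
```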
Example 1.5.
The state variable in Example 1.3 is x(t) = (y(t); dy/dt(t)).

The state representation is not unique. There can be many different ways of expressing the relationship of input to output.

Example 1.6.
In Example 1.3, instead of the voltage and its derivative across the smaller capacitor, the state could be the voltage and its derivative across the larger capacitor, or the state could be the voltages across both capacitors.

There can exist inputs that do not influence the state, and, conversely, there can exist outputs that are not influenced by the state. These cases are called uncontrollable and unobservable, respectively, about which much more will be said in Chapter 6.

Example 1.7.
In Example 1.1, the physical object is state uncontrollable. No input can make the switch change positions. However, the switch position is observable. If the wire to the output were broken, it would be unobservable. A state that is both unobservable and uncontrollable makes no physical sense, since it can not be detected by experiment. Examples 1.2 and 1.3 are both controllable and observable.

One more point to note is that we consider here only deterministic abstract objects. The problem of obtaining the state of an abstract object in which random processes are inputs, etc., is beyond the scope of this book. Consequently, all statements in the whole book are intended only for deterministic processes.

1.3 TRAJECTORIES IN STATE SPACE

The state variable x(t) is an explicit function of time, but also depends implicitly on the starting time t₀, the initial state x(t₀) = x₀, and the input u(τ). This functional dependency can be written as x(t) = φ(t; t₀, x₀, u(τ)), called a trajectory. The trajectory can be plotted in n-dimensional state space as t increases from t₀, with t an implicit parameter. Often this plot can be made by eliminating t from the solutions to the state equation.

Example 1.8.
Given x₁(t) = sin t and x₂(t) = cos t, squaring each equation and adding gives x₁² + x₂² = 1. This is a circle in the x₁x₂ plane with t an implicit parameter.

Example 1.9.
In Example 1.3, note that equation (1.4) depends on t, u(τ), x(t₀) and t₀, where x(t₀) is the vector with components y(t₀) and (dy/dt)(t₀). Therefore the trajectories φ depend on these quantities. Suppose now u(t) = 0 and RC = 1. Let x₁ = y and x₂ = dy/dt. Then dx₁/dt = x₂, so that dt = dx₁/x₂ and

    d²y/dt² = dx₂/dt = x₂ dx₂/dx₁

Substituting these relationships into (1.3) gives

    x₂ dx₂/dx₁ + 2.5x₂ + x₁ = 0

which is independent of t. This has a solution

    x₁ + 2x₂ = C(2x₁ + x₂)⁴

where the constant C = [x₁(t₀) + 2x₂(t₀)]/[2x₁(t₀) + x₂(t₀)]⁴. Typical trajectories in state space are shown in Fig. 1-4. The one passing through the points x₁(t₀) = 0 and x₂(t₀) = 1 is drawn in bold. The arrows point in the direction of increasing time, and all trajectories eventually reach the origin for this particular stable system.

Fig. 1-4
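The trajectory equation of Example 1.9 can be checked numerically. In the sketch below (illustrative code, not from the text), the quantity C = (x₁ + 2x₂)/(2x₁ + x₂)⁴ is evaluated along an integrated solution and stays constant, confirming that it labels the trajectory:

```python
dt = 1e-5
x1, x2 = 0.0, 1.0                     # the bold trajectory of Fig. 1-4
c0 = (x1 + 2*x2) / (2*x1 + x2)**4
for _ in range(100_000):              # Euler steps over one time unit
    x1, x2 = x1 + dt*x2, x2 + dt*(-x1 - 2.5*x2)
print(c0, (x1 + 2*x2) / (2*x1 + x2)**4)   # nearly equal
```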
1.4 DYNAMICAL SYSTEMS

In the foregoing we have assumed that an abstract object exists, and that sometimes we can find a set of oriented mathematical relationships that summarizes this listing of input and output pairs. Now suppose we are given a set of oriented mathematical relationships: do we have an abstract object? The answer to this question is not always affirmative, because there exist mathematical equations whose solutions do not result in abstract objects.

Example 1.10.
The oriented mathematical equation y(t) = ju(t) cannot give an abstract object, because either the input or the output must be imaginary.

If a mathematical relationship always determines a real output y(t) existing for all t ≥ t₀ given any real input u(t) for all time t, then we can form an abstract object. Note that by supposing an input u(t) for all past times as well as future times, we can form an abstract object from the equation for a delayor, y(t) = u(t−T). [See Problem 1.1.] However, we can also form an abstract object from the equation for a predictor, y(t) = u(t+T). If we are to restrict ourselves to mathematical relations that can be mechanized, we must specifically rule out such relations whose present outputs depend on future values of the input.

Definition 1.6: A dynamical system is an oriented mathematical relationship in which:
(1) A real output y(t) exists for all t ≥ t₀ given a real input u(t) for all t.
(2) Outputs y(t) do not depend on inputs u(τ) for τ > t.

Given that we have a dynamical system relating y(t) to u(t), we would like to construct a set of mathematical relations defining a state x(t). We shall assume that a state space description can be found for the dynamical system of interest satisfying the following conditions (although such a construction may take considerable thought):

Condition 1: A real, unique output y(t) = η(t, φ(t; t₀, x₀, u(τ)), u(t)) exists for all t ≥ t₀ given the state x₀ at time t₀ and a real input u(τ) for τ ≥ t₀.

Condition 2: A unique trajectory φ(t; t₀, x₀, u(τ)) exists for all t ≥ t₀ given the state at time t₀ and a real input for all t ≥ t₀.

Condition 3: A unique trajectory starts from each state, i.e.

    lim_{t→t₀⁺} φ(t; t₀, x(t₀), u(τ)) = x(t₀)  for all t₀        (1.5)

Condition 4: Trajectories satisfy the transition property

    φ(t₂; t₀, x(t₀), u(τ)) = φ(t₂; t₁, x(t₁), u(τ))  for t₀ < t₁ < t₂

where

    x(t₁) = φ(t₁; t₀, x(t₀), u(τ))

Condition 5: Outputs y(t) do not depend on inputs u(τ) for τ > t.

Condition 1 gives the functional relationship y(t) = η(t, x(t), u(t)) between initial state and future input such that a unique output is determined. Therefore, with a proper state space description, it is not necessary to know inputs prior to t₀, but only the state at time t₀. The state at the initial time completely summarizes all the past history of the input.

Example 1.11.
In Example 1.2, it does not matter how the voltage across the capacitor was obtained in the past. All that is needed to determine the unique future output is the state and the future input.

Condition 2 insures that the state at a future time is uniquely determined. Therefore knowledge of the state at any time, not necessarily t₀, uniquely determines the output. For a given u(t), one and only one trajectory passes through each point in state space and exists for all finite t ≥ t₀. As can be verified in Fig. 1-4, one consequence of this is that the state trajectories do not cross one another. Also, notice that condition 2 does not require the state to be real, even though the input and output must be real.

Example 1.12.
The relation dy/dt = u(t) is obviously a dynamical system. A state space description dx/dt = ju(t), y(t) = −jx(t) can be constructed satisfying conditions 1-5, yet the state is imaginary.

Condition 3 merely requires the state space description to be consistent, in that the starting point of the trajectory should correspond to the initial state. Condition 4 says that the input u(τ) takes the system from a state x(t₀) to a state x(t₂), and if x(t₁) is on that trajectory, then the corresponding segment of the input will take the system from x(t₁) to x(t₂).
Finally, condition 5 has been added to assure causality of the input-output relationship resulting from the state space description, to correspond with the causality of the original dynamical system.

Example 1.13.
We can construct a state space description of equation (1.1) of Example 1.2 by defining a state x(t) = y(t). Then condition 1 is satisfied as seen by examination of the solution, equation (1.2). Clearly the trajectory φ(t; t₀, x₀, u(τ)) exists and is unique given a specified t₀, x₀ and u(τ), so condition 2 is satisfied. Also, conditions 3 and 5 are satisfied. To check condition 4, given x(t₁) = φ(t₁; t₀, x₀, u(τ)) and u(τ) over t₁ ≤ τ ≤ t, then

    x(t) = x(t₁) e^{−(t−t₁)/RC} + (1/RC) ∫_{t₁}^{t} e^{−(t−τ)/RC} u(τ) dτ        (1.8)

where

    x(t₁) = x₀ e^{−(t₁−t₀)/RC} + (1/RC) ∫_{t₀}^{t₁} e^{−(t₁−τ)/RC} u(τ) dτ        (1.9)

Substitution of (1.9) into (1.8) gives the previously obtained (1.2). Therefore the dynamical system (1.1) has a state space description satisfying conditions 1-5.

Henceforth, instead of "dynamical system with a state space description" we will simply say "system" and the rest will be understood.

1.5 LINEARITY AND TIME INVARIANCE

Definition 1.7: Given any two numbers α, β; two states x₁(t₀), x₂(t₀); two inputs u₁(τ), u₂(τ); and two corresponding outputs y₁(τ), y₂(τ) for τ ≥ t₀. Then a system is linear if (1) the state x₃(t₀) = αx₁(t₀) + βx₂(t₀), the output y₃(τ) = αy₁(τ) + βy₂(τ), and the input u₃(τ) = αu₁(τ) + βu₂(τ) can appear in the oriented abstract object, and (2) both y₃(τ) and x₃(τ) correspond to the state x₃(t₀) and input u₃(τ).

An equivalent statement is that the operators φ(t; t₀, x₀, u(τ)) = x(t) and η(t, φ(t; t₀, x₀, u(τ))) = y(t) are linear on {u(τ)} ⊕ {x(t₀)}.

Example 1.14.
In Example 1.2,

    y₁(t) = x₁(t₀) e^{−(t−t₀)/RC} + (1/RC) ∫_{t₀}^{t} e^{−(t−τ)/RC} u₁(τ) dτ
    y₂(t) = x₂(t₀) e^{−(t−t₀)/RC} + (1/RC) ∫_{t₀}^{t} e^{−(t−τ)/RC} u₂(τ) dτ

are the corresponding outputs y₁(t) and y₂(t) to the states x₁(t₀) and x₂(t₀) with inputs u₁(τ) and u₂(τ). Since any magnitude of voltage is permitted in this idealized system, any state x₃(t₀) = αx₁(t₀) + βx₂(t₀), any input u₃(τ) = αu₁(τ) + βu₂(τ), and any output y₃(τ) = αy₁(τ) + βy₂(τ) will appear in the list of input-output pairs that form the abstract object. Therefore part (1) of Definition 1.7 is satisfied. Furthermore, let's look at the response generated by x₃(t₀) and u₃(τ):

    y₃(t) = x₃(t₀) e^{−(t−t₀)/RC} + (1/RC) ∫_{t₀}^{t} e^{−(t−τ)/RC} u₃(τ) dτ
          = [αx₁(t₀) + βx₂(t₀)] e^{−(t−t₀)/RC} + (1/RC) ∫_{t₀}^{t} e^{−(t−τ)/RC} [αu₁(τ) + βu₂(τ)] dτ
          = αy₁(t) + βy₂(t)

Since y₃(t) = αy₁(t) + βy₂(t), both the future output and state correspond to x₃(t₀) and u₃(τ), and the system is linear.

Example 1.15.
Consider the system of Example 1.1. For some α and β there is no state equal to αA + βB, where A and B are the switch positions. Consequently the system violates condition (1) of Definition 1.7 and is not linear.

Example 1.16.
Given the system dx/dt = 0, y = u cos x. Then y₁(t) = u₁(t) cos x₁(t₀) and y₂(t) = u₂(t) cos x₂(t₀). The state x₃(t) = x₃(t₀) = αx₁(t₀) + βx₂(t₀) and is linear, but the output

    y₃(t) = [αu₁(t) + βu₂(t)] cos [αx₁(t₀) + βx₂(t₀)] ≠ αy₁(t) + βy₂(t)

except in special cases, so the system is not linear.

If a system is linear, then superposition holds for nonzero u(t) with x(t₀) = 0 and also for nonzero x(t₀) with u(t) = 0, but not both together. In Example 1.14, with zero initial voltage on the capacitor, the response to a biased a-c voltage input (constant + sin ωt) could be calculated as the response to a constant voltage input plus the response to an unbiased a-c voltage input. Also, note from Example 1.16 that even if superposition does hold for nonzero u(t) with x(t₀) = 0 and for nonzero x(t₀) with u(t) = 0, the system may still not be linear.
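Definition 1.7 can also be tested numerically. A small sketch (hypothetical code, not from the text) applied to the system of Example 1.16 shows superposition failing for a combined state and input, so the system is not linear:

```python
import numpy as np

def output(x0, u):
    # dx/dt = 0, y = u cos x: the state never moves, so x(t) = x(t0)
    return u * np.cos(x0)

a, b = 2.0, -1.0                       # arbitrary scalars
x1, x2 = 0.3, 1.1                      # arbitrary initial states
t = np.linspace(0.0, 1.0, 5)
u1, u2 = np.sin(t), np.cos(t)          # arbitrary inputs

y3 = output(a*x1 + b*x2, a*u1 + b*u2)  # response to combined state and input
print(y3 - (a*output(x1, u1) + b*output(x2, u2)))   # nonzero: not linear
```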
Definition 1.8: A system is time-invariant if the time axis can be translated and an equivalent system results.

One test for time-invariance is to compare the original output with the shifted output. First, shift the input time function by T seconds. Starting from the same initial state x₀ at time t₀ + T, does y(t + T) of the shifted system equal y(t) of the original system?

Example 1.17.
Given the nonlinear differential equation dx/dt = x² + u with x(t = 0) = x₀. With the substitution τ = t − T this becomes dx/dτ = x² + u, where x(τ = 0) = x₀, resulting in the same system. If the nonlinear equation for the state x were changed to dx/dt = tx² + u, the substitution τ = t − T would give dx/dτ = τx² + Tx² + u, and the appearance of the last term on the right gives a different system. Therefore this is a time-varying nonlinear system. Equations with explicit functions of t as coefficients multiplying the state will usually be time-varying.

1.6 SYSTEMS CONSIDERED

This book will consider only time-invariant and time-varying linear dynamical systems described by sets of differential or difference equations of finite order. We shall see in the next chapter that in this case the state variable x(t) is an n-vector and the system is linear.

Example 1.18.
A time-varying linear differential system of order n with one input and one output is described by the equation

    dⁿy/dtⁿ + α₁(t) dⁿ⁻¹y/dtⁿ⁻¹ + ⋯ + αₙ(t)y = β₀(t) dⁿu/dtⁿ + ⋯ + βₙ(t)u        (1.10)

Example 1.19.
A time-varying linear difference system of order n with one input and one output is described by the equation

    y(k+n) + α₁(k) y(k+n−1) + ⋯ + αₙ(k) y(k) = β₀(k) u(k+n) + ⋯ + βₙ(k) u(k)        (1.11)

The values of the αᵢ(k) depend on the step k of the process, in a way analogous to that in which the αᵢ(t) depend on t in the previous example.

1.7 LINEARIZATION OF NONLINEAR SYSTEMS

State space techniques are especially applicable to time-varying linear systems. In this section we shall find out why time-varying linear systems are of such practical importance. Comparatively little design of systems is performed from the time-varying point of view at present, but state space methods offer great promise for the future.

Consider a set of n nonlinear differential equations of first order:

    dy₁/dt = f₁(y₁, y₂, ..., yₙ, u, t)
    dy₂/dt = f₂(y₁, y₂, ..., yₙ, u, t)
    ......................
    dyₙ/dt = fₙ(y₁, y₂, ..., yₙ, u, t)        (1.12)

A nonlinear equation of nth order, dⁿy/dtⁿ = g(y, dy/dt, ..., dⁿ⁻¹y/dtⁿ⁻¹, u, t), can always be written in this form by defining y₁ = y, y₂ = dy/dt, ..., yₙ = dⁿ⁻¹y/dtⁿ⁻¹. Then a set of n first order nonlinear differential equations can be obtained as

    dy₁/dt = y₂
    dy₂/dt = y₃
    ......................
    dyₙ₋₁/dt = yₙ
    dyₙ/dt = g(y₁, y₂, ..., yₙ, u, t)        (1.13)

Example 1.20.
To reduce the second order nonlinear differential equation d²y/dt² − 2y³ + u dy/dt = 0 to two first order nonlinear differential equations, define y₁ = y and dy/dt = y₂. Then

    dy₁/dt = y₂
    dy₂/dt = 2y₁³ − uy₂

Suppose a solution can be found (perhaps by computer) to equations (1.12) for some initial conditions y₁(t₀), y₂(t₀), ..., yₙ(t₀) and some input w(t). Denote this solution as the trajectory φ(t; w(t), y₁(t₀), ..., yₙ(t₀), t₀). Suppose now that the initial conditions are changed:

    y₁(t₀) = φ₁(t₀) + x₁(t₀),    y₂(t₀) = φ₂(t₀) + x₂(t₀),    ...,    yₙ(t₀) = φₙ(t₀) + xₙ(t₀)

where x₁(t₀), x₂(t₀), ..., xₙ(t₀) are small. Furthermore, suppose the input is changed slightly to u(t) = w(t) + v(t), where v(t) is small. To satisfy the differential equations,

    d(φ₁ + x₁)/dt = f₁(φ₁ + x₁, φ₂ + x₂, ..., φₙ + xₙ, w + v, t)
    ......................
    d(φₙ + xₙ)/dt = fₙ(φ₁ + x₁, φ₂ + x₂, ..., φₙ + xₙ, w + v, t)
1] MEANING OF STATE 9 If fy fy ---» f, cam be expanded about 4, 4, --., 4, and w using Taylor's theorem for sev- eral variables, then neglecting higher order terms we obtain des of, ah ah : Mh Hibs ty vty) + Hie, + Bia to + Ba + Bo Bs des af oft fs fs Tet Ge = falbn dy ver bye Ot) + dee, + dhe, re thee, 4 ay in, 4, ale atm, Ae Be hb dy erty) + ea + ee + Pore Sem + fay ,u,t) with respect to y,, evalu- where each 4f,/ay, is the partial derivative of f,(¥,, Uy since each ¢, satisfies the original ated at ¥, = by Va=$y +++ Uy =o, and u=w. Now, equation, then dg/dt =f, can'be canceled to leave a afdou aflays ... afildua\ (as afafou d| ms afaldys afaldy. ... afsldvn \! we ofs/ou a@\: pean Tey: fn, afeloys afoloys ... afaldva) \ rn, afalou which is, in general, a time-varying linear differential equation, so that the nonlinear equa- tion has been linearized. Note this procedure is valid only for 21, 22, .... 2 and v small enough so that the higher order terms in the Taylor's series can be neg] lected. The matrix of af /ay, evaluated at y, =, is called the Jacobian matrix of the vector fly, u,t). Consider the system of Example 1.20 with initial conditions y(fq)=1 and (ts) att If the particulas input m() = 0, we obtain the trajectories gu(0) = t—! and ga(t) = —f-%. Since f,= ve ‘then af;/@y; = 0, af;/dy, = 1 and af,/au=0. Since fa = 2v} ~wyn, then dfo/dy; = Gy, dfy/éyy = —u snd afglou = vy Hence for initial conditions y(t) = 1+ 24), Hla) =—1 + Hallo) and inputs w= o, wwe obtain £(2) = (2 DE) +(C)> ‘This linear equation gives the solution y(t) = 2(0)+t~1 and dy/dt = 2,—t~® for the original nonlinear ‘equation, and is valid as long as 2), 2, and v are small. ‘Example 1.22 Given the system dy/dt = ky—y?+u, Taking u(t) =0, we ean find two constant solutions #(t) = 0 and (i) =k. ‘The equation for small motions 2(t) about g(t) = 0 is dz/dt = ke-+w so that y(t) ~ #(0, fand the equation for amall motions a(!) about y(t) =k is dz/dt =—ke-+u so that v(t) ~ k+a(t). 10 12. 13. 4. MEANING OF STATE (CHAP: 1 Solved Problems Given a delay line whose output is a voltage input delayed T seconds. What is the Physical object, the abstract object, the state variable and the state space? Also, is it controllable, observable and a dynamical system with a state space description? ‘The physical object is the delay line for an input u(t) and an output y()=u(t— 7). This equation is the abstract object, Given an input time function w(t) for all t, the output y(t) is defined for t= ty, s0 it is a dynamical system. To completely specify the output given only u(t) for t= to, the voltages already inside the delay line must be known. Therefore, the state at time f 8 H(t) = unr.) where the notation 1, ;,) means the time fonction u(r) for + in the Interval t)=1r< 4. For ¢>0 as small as we please, uj,-7,,) ean be considered as the vin- countably infinite set of numbe {ulty— TM), M-TH, cry Wo d = tyr = ato In this sense we can consider the state as consisting of an infinity of numbers. Then the state variable is the infinite sot of time functions a) = teeny = t=, Wt-TH9, 2. ue—9) ‘The state space is the space of all time functions T seconds long, perhaps limited by the breakdown voltage of the delay line. An input u(t) for t=t0 (positive velocity) and decreases for 2 <0, giving the motion of the aystem in the direction of the arrows as t increases. 
For instance, starting at the initial conditions y(t₀) and ẏ(t₀) corresponding to the point numbered 1, the system moves along the outer trajectory to the point 6. Similarly, point 2 moves to point 6. However, starting at either point 3 or point 4, the system goes to point 7, where the system motion in the next instant is not determined. At point 7 the output y(t) does not exist for future times, so that this is not a dynamical system.

Fig. 1-5

1.5. Given the electronic device diagrammed in Fig. 1-6, with a voltage input u(t) and a voltage output y(t). The resistors R have constant values. For t < t₁ the switch S is open, and for t ≥ t₁ the switch S is closed. Is this system linear?

Fig. 1-6

Since the device contains no energy-storage elements, there is no state, and only superposition for zero state but nonzero input need be investigated. For t < t₁ the open switch gives one voltage-divider ratio between input and output, and for t ≥ t₁ the closed switch gives another. Given two inputs u₁ and u₂ with resultant outputs y₁ and y₂ respectively, the output corresponding to the input αu₁ + βu₂ in each time interval is found by substituting y₁(t) and y₂(t): the output is αy₁(t) + βy₂(t), showing that superposition does hold and that the system is linear. The switch S can be considered a time-varying resistor whose resistance is infinite for t < t₁ and zero for t ≥ t₁. Therefore Fig. 1-6 depicts a time-varying linear device.

1.6. Given the electronic device of Problem 1.5 (Fig. 1-6), with a voltage input u(t) and a voltage output y(t). The resistors R have constant values. However, now the position of the switch S depends on y(t): whenever y(t) is positive, the switch S is open, and whenever y(t) is negative, the switch S is closed. Is this system linear?

Again there is no state, and only superposition for zero state but nonzero input need be investigated. The input-output relationship is now

    y(t) = [5u(t) + u(t) sgn u(t)]/12

where sgn u = +1 if u is positive and −1 if u is negative. Given two inputs u₁ and u₂ with resultant outputs y₁ and y₂ respectively, an output y₃ with an input u₃ = αu₁ + βu₂ is expressible as

    y₃ = [5(αu₁ + βu₂) + (αu₁ + βu₂) sgn (αu₁ + βu₂)]/12

To be linear, αy₁ + βy₂ must equal y₃, which would be true only if

    αu₁ sgn u₁ + βu₂ sgn u₂ = (αu₁ + βu₂) sgn (αu₁ + βu₂)

This equality holds only in special cases, such as sgn u₁ = sgn u₂ = sgn (αu₁ + βu₂), so that the system is not linear.

1.7. Given the abstract object characterized by

    y(t) = x₀ e^{−(t−t₀)} + ∫_{t₀}^{t} e^{−(t−τ)} u(τ) dτ

Is this time-varying?

This abstract object is that of Example 1.2, with RC = 1 in equation (1.1). By the same procedure used in Example 1.17 it can be shown to be time-invariant. However, it can also be shown time-invariant by the test given after Definition 1.8. The input time function u(t) is shifted by T, to become ũ(t). Then, as can be seen in Fig. 1-7, ũ(t) = u(t−T). Starting from the same initial state x₀ at time t₀ + T,

    ỹ(t) = x₀ e^{−(t−t₀−T)} + ∫_{t₀+T}^{t} e^{−(t−τ)} ũ(τ) dτ = x₀ e^{−(t−t₀−T)} + ∫_{t₀+T}^{t} e^{−(t−τ)} u(τ−T) dτ

Changing the variable of integration and evaluating ỹ at t + T gives

    ỹ(t+T) = x₀ e^{−(t−t₀)} + ∫_{t₀}^{t} e^{−(t−τ)} u(τ) dτ

which is identical with the output y(t).

Fig. 1-7

Supplementary Problems

1.8. Given the spring-mass system shown in Fig. 1-8. What is the physical object, the abstract object, and the state variable?

Fig. 1-8

1.9. Given the hereditary system y(t) = ∫_{t₀}^{t} K(t,τ) u(τ) dτ, where K(t,τ) is some single-valued continuously differentiable function of t and τ. What is the state variable? Is the system linear? Is the system time-varying?

1.10. Given the discrete time system x(n+1) = x(n) + u(n), the series of inputs u(0), u(1), ..., u(k), ..., and the state at step 3, x(3). Find the state variable x(m) at any step m ≥ 0.

1.11. An abstract object is characterized by y(t) = u(t) for t₀ ≤ t < t₁, and by dy/dt = du/dt for t ≥ t₁. It is given that this abstract object will permit discontinuities in y(t) at t₁. What is the dimension of the state space for t ≥ t₁?

No real output exists for t ≥ t₁ until the input is known there, so that as stated the equation is not a dynamical system.
However, if the behavior of engineering interest lies between t₀ and t₁, merely append ẏ = 0·u for t > t₁ to the equation, and a dynamical system results.

Chapter 2

Methods for Obtaining the State Equations

2.1 FLOW DIAGRAMS

Flow diagrams are a simple diagrammatical means of obtaining the state equations. Because only linear differential or difference equations are considered here, only four basic objects are needed. The utility of flow diagrams results from the fact that no differentiating devices are permitted.

Definition 2.1: A summer is a diagrammatical abstract object having n inputs u₁(t), u₂(t), ..., uₙ(t) and one output y(t) that obey the relationship

    y(t) = ±u₁(t) ± u₂(t) ± ⋯ ± uₙ(t)

where the sign is positive or negative as indicated in Fig. 2-1, for example.

Fig. 2-1. Summer

Definition 2.2: A scalor is a diagrammatical abstract object having one input u(t) and one output y(t) such that the input is scaled up or down by the time function α(t) as indicated in Fig. 2-2. The output obeys the relationship y(t) = α(t) u(t).

Fig. 2-2. Scalor

Definition 2.3: An integrator is a diagrammatical abstract object having one input u(t), one output y(t), and perhaps an initial condition y(t₀) which may be shown or not, as in Fig. 2-3. The output obeys the relationship

    y(t) = y(t₀) + ∫_{t₀}^{t} u(τ) dτ

Fig. 2-3. Integrator at Time t

Definition 2.4: A delayor is a diagrammatical abstract object having one input u(k), one output y(k), and perhaps an initial condition y(k) which may be shown or not, as in Fig. 2-4. The output obeys the relationship

    y(k+j+1) = u(k+j)  for j = 0, 1, 2, ...

Fig. 2-4. Delayor at Time k

2.2 PROPERTIES OF FLOW DIAGRAMS

Any set of time-varying or time-invariant linear differential or difference equations of the form (1.10) or (1.11) can be represented diagrammatically by an interconnection of the foregoing elements. Also, any given transfer function can be represented merely by rewriting it in terms of (1.10) or (1.11). Furthermore, multiple input and/or multiple output systems can be represented in an analogous manner. Equivalent interconnections can be made to represent the same system.

Example 2.1.
Given the system

    dy/dt = αy + αu        (2.1)

with initial condition y(t₀). An interconnection for this system is shown in Fig. 2-5.

Fig. 2-5

Since α is a constant function of time, the integrator and scalor can be interchanged if the initial condition is adjusted accordingly, as shown in Fig. 2-6.

Fig. 2-6

This interchange could not be done if α were a general function of time. In certain special cases it is possible to use integration by parts to accomplish this interchange. If α(t) = t, then (2.1) can be integrated to

    y(t) = y(t₀) + ∫_{t₀}^{t} τ[y(τ) + u(τ)] dτ

Using integration by parts,

    y(t) = y(t₀) − ∫_{t₀}^{t} [∫_{t₀}^{σ} [y(τ) + u(τ)] dτ] dσ + t ∫_{t₀}^{t} [y(τ) + u(τ)] dτ

which gives the alternate flow diagram shown in Fig. 2-7.

Fig. 2-7

Integrators are used in continuous time systems; delayors, in discrete time (sampled data) systems. Discrete time diagrams can be drawn by considering the analogous continuous time system, and vice versa. For time-invariant systems the diagrams are almost identical, but the situation is not so easy for time-varying systems.
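The integrator/delayor analogy can be made concrete with a short sketch (illustrative code, not from the text; the value of α, the input and the step sizes are arbitrary choices): the same scalar α realizes (2.1) with one integrator and its discrete analog (2.2) below with one delayor:

```python
import numpy as np

alpha = -0.5

# Continuous time, Fig. 2-5: dy/dt = alpha*(y + u), Euler-integrated
dt = 1e-3
t = np.arange(0.0, 5.0, dt)
y = 1.0                                # initial condition y(t0)
for k in range(len(t) - 1):
    y += dt * alpha * (y + 1.0)        # unit step input u = 1

# Discrete time, Fig. 2-8: y(k+1) = alpha*(y(k) + u(k))
yd = 1.0                               # initial condition y(0)
for k in range(9):
    yd = alpha * (yd + 1.0)            # unit step input u = 1

print(y, yd)
```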
Example 2.2.
Given the discrete time system

    y(k+1) = αy(k) + αu(k)        (2.2)

with initial condition y(0). The analogous continuous time system is equation (2.1), where d/dt takes the place of a unit advance in time. This is more evident by taking the Laplace transform of (2.1),

    sY(s) − y(t₀) = αY(s) + αU(s)

and the z-transform of (2.2),

    zY(z) − zy(0) = αY(z) + αU(z)

Hence from Fig. 2-5 the diagram for (2.2) can be drawn immediately as in Fig. 2-8.

Fig. 2-8

If the initial condition of an integrator or delayor is arbitrary, the output of that integrator or delayor can be taken to be a state variable.

Example 2.3.
The state variable for (2.1) is y(t), the output of the integrator. To verify this, the solution to equation (2.1) is

    y(t) = y(t₀) e^{α(t−t₀)} + α ∫_{t₀}^{t} e^{α(t−τ)} u(τ) dτ

Note y(t₀) is the state at t₀, so the state variable is y(t).

Example 2.4.
The state variable for equation (2.2) is y(k), the output of the delayor. This can be verified in a manner similar to the previous example.

Example 2.5.
From Fig. 2-7, the state is the output of the second integrator only, because the initial condition of the first integrator is specified to be zero. This is true because Fig. 2-7 and Fig. 2-5 are equivalent systems.

Example 2.6.
A summer or a scalor has no state associated with it, because the output is completely determined by the input.

2.3 CANONICAL FLOW DIAGRAMS FOR TIME-INVARIANT SYSTEMS

Consider a general time-invariant linear differential equation with one input and one output, with the letter p denoting the time derivative d/dt. Only differential equations need be considered, because by Section 2.2 discrete time systems follow analogously.

    pⁿy + α₁pⁿ⁻¹y + ⋯ + αₙ₋₁py + αₙy = β₀pⁿu + β₁pⁿ⁻¹u + ⋯ + βₙ₋₁pu + βₙu        (2.3)

This can be rewritten as

    pⁿ(y − β₀u) + pⁿ⁻¹(α₁y − β₁u) + ⋯ + p(αₙ₋₁y − βₙ₋₁u) + (αₙy − βₙu) = 0

because αᵢpⁿ⁻ⁱy = pⁿ⁻ⁱαᵢy, which is not true if αᵢ depends on time. Dividing through by pⁿ and rearranging gives

    y = β₀u + (1/p)(β₁u − α₁y) + ⋯ + (1/pⁿ⁻¹)(βₙ₋₁u − αₙ₋₁y) + (1/pⁿ)(βₙu − αₙy)        (2.4)

from which the flow diagram shown in Fig. 2-9 can be drawn, starting with the output y at the right and working to the left.

Fig. 2-9. Flow Diagram of the First Canonical Form

The output of each integrator is labeled as a state variable. The summer equations for the state variables have the form

    y = x₁ + β₀u
    ẋ₁ = −α₁y + x₂ + β₁u
    ẋ₂ = −α₂y + x₃ + β₂u
    ......................
    ẋₙ₋₁ = −αₙ₋₁y + xₙ + βₙ₋₁u
    ẋₙ = −αₙy + βₙu        (2.5)

Using the first equation in (2.5) to eliminate y, the differential equations for the state variables can be written in the canonical matrix form

    d/dt [x₁]   [−α₁    1  0  ⋯  0] [x₁]   [β₁ − α₁β₀]
         [x₂]   [−α₂    0  1  ⋯  0] [x₂]   [β₂ − α₂β₀]
         [⋮ ] = [ ⋮              ⋮] [⋮ ] + [    ⋮    ] u        (2.6)
         [xₙ₋₁] [−αₙ₋₁  0  0  ⋯  1] [xₙ₋₁] [βₙ₋₁ − αₙ₋₁β₀]
         [xₙ]   [−αₙ    0  0  ⋯  0] [xₙ]   [βₙ − αₙβ₀]

We will call this the first canonical form. Note the 1's above the diagonal and the α's down the first column of the n×n matrix. Also, the output can be written in terms of the state vector:

    y = (1 0 ⋯ 0 0) x + β₀u        (2.7)

Note this form can be written down directly from the original equation (2.3).
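The first canonical form (2.6)-(2.7) can be built mechanically from the coefficients of (2.3). The sketch below (illustrative code; the third-order coefficient values at the bottom are arbitrary) constructs A, b, c and the feedthrough term:

```python
import numpy as np

def first_canonical(alpha, beta):
    """alpha = [a1, ..., an]; beta = [b0, b1, ..., bn] from (2.3)."""
    n = len(alpha)
    A = np.zeros((n, n))
    A[:, 0] = -np.asarray(alpha)            # alphas down the first column
    A[:n-1, 1:] = np.eye(n - 1)             # ones above the diagonal
    b = np.array([beta[i+1] - alpha[i]*beta[0] for i in range(n)])
    c = np.eye(n)[0]                        # y = x1 + b0*u
    return A, b, c, beta[0]

A, b, c, d = first_canonical([6.0, 11.0, 6.0], [0.0, 1.0, 6.0, 11.0])
print(A)
print(b)
```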
Another useful form can be obtained by turning the first canonical flow diagram "backwards." This change is accomplished by reversing all arrows and integrators, interchanging summers and connection points, and interchanging input and output. This is a heuristic method of deriving a specific form that will be developed further in Chapter 7.

Fig. 2-10. Flow Diagram of the Second Canonical (Phase-variable) Form

Here the output of each integrator has been relabeled. The equations for the state variables are now

    ẋ₁ = x₂
    ẋ₂ = x₃
    ......................
    ẋₙ₋₁ = xₙ
    ẋₙ = −αₙx₁ − αₙ₋₁x₂ − ⋯ − α₁xₙ + u
    y = (βₙ − αₙβ₀)x₁ + (βₙ₋₁ − αₙ₋₁β₀)x₂ + ⋯ + (β₁ − α₁β₀)xₙ + β₀u        (2.8)

In matrix form, (2.8) may be written as

    d/dt [x₁]   [ 0    1    0   ⋯   0 ] [x₁]   [0]
         [x₂]   [ 0    0    1   ⋯   0 ] [x₂]   [0]
         [⋮ ] = [ ⋮                 ⋮ ] [⋮ ] + [⋮] u        (2.9)
         [xₙ₋₁] [ 0    0    0   ⋯   1 ] [xₙ₋₁] [0]
         [xₙ]   [−αₙ −αₙ₋₁ −αₙ₋₂ ⋯ −α₁] [xₙ]   [1]

and

    y = (βₙ − αₙβ₀   βₙ₋₁ − αₙ₋₁β₀   ⋯   β₁ − α₁β₀) x + β₀u        (2.10)

This will be called the second canonical form, or phase-variable canonical form. Here the 1's are above the diagonal but the α's go across the bottom row of the n×n matrix. By eliminating the state variables xᵢ, the general input-output relation (2.3) can be verified. The phase-variable canonical form can also be written down upon inspection of the original differential equation (2.3).

2.4 JORDAN FLOW DIAGRAM

The general time-invariant linear differential equation (2.3) for one input and one output can be written as

    y = [β₀pⁿ + β₁pⁿ⁻¹ + ⋯ + βₙ] u / [pⁿ + α₁pⁿ⁻¹ + ⋯ + αₙ]        (2.11)

By dividing once by the denominator, this becomes

    y = β₀u + [(β₁ − α₁β₀)pⁿ⁻¹ + (β₂ − α₂β₀)pⁿ⁻² + ⋯ + (βₙ − αₙβ₀)] u / [pⁿ + α₁pⁿ⁻¹ + ⋯ + αₙ]        (2.12)

Consider first the case where the denominator polynomial factors into distinct poles λ₁, λ₂, ..., λₙ. Distinct means λᵢ ≠ λⱼ for i ≠ j, that is, no repeated roots. Because most practical systems are stable, the λᵢ usually have negative real parts.

    pⁿ + α₁pⁿ⁻¹ + ⋯ + αₙ₋₁p + αₙ = (p − λ₁)(p − λ₂)⋯(p − λₙ)        (2.13)

A partial fraction expansion can now be made having the form

    y = β₀u + ρ₁u/(p − λ₁) + ρ₂u/(p − λ₂) + ⋯ + ρₙu/(p − λₙ)        (2.14)

Here the residue ρᵢ can be calculated as

    ρᵢ = [(β₁ − α₁β₀)λᵢⁿ⁻¹ + (β₂ − α₂β₀)λᵢⁿ⁻² + ⋯ + (βₙ − αₙβ₀)] / [(λᵢ − λ₁)⋯(λᵢ − λᵢ₋₁)(λᵢ − λᵢ₊₁)⋯(λᵢ − λₙ)]        (2.15)

The partial fraction expansion (2.14) gives a very simple flow diagram, shown in Fig. 2-11.

Fig. 2-11. Jordan Flow Diagram for Distinct Roots

Note that because ρᵢ and λᵢ can be complex numbers, the states xᵢ are complex-valued functions of time. The state equations assume the simple form

    ẋ₁ = λ₁x₁ + u
    ẋ₂ = λ₂x₂ + u
    ......................
    ẋₙ = λₙxₙ + u
    y = β₀u + ρ₁x₁ + ρ₂x₂ + ⋯ + ρₙxₙ        (2.16)

Consider now the general case. For simplicity, only one multiple root (actually one Jordan block, see Section 4.4, page 73) will be considered, because the results are easily extended to the general case. Then the denominator in (2.12) factors to

    pⁿ + α₁pⁿ⁻¹ + ⋯ + αₙ₋₁p + αₙ = (p − λ₁)^ν (p − λ_{ν+1}) ⋯ (p − λₙ)        (2.17)

instead of (2.13). Here there are ν identical roots. Performing the partial fraction expansion for this case gives

    y = β₀u + ρ₁₁u/(p − λ₁)^ν + ρ₁₂u/(p − λ₁)^{ν−1} + ⋯ + ρ₁ν u/(p − λ₁) + ρ_{ν+1}u/(p − λ_{ν+1}) + ⋯ + ρₙu/(p − λₙ)        (2.18)

The residues at the multiple roots can be evaluated as

    ρ₁ₖ = [1/(k−1)!] dᵏ⁻¹/dpᵏ⁻¹ [(p − λ₁)^ν f(p)] evaluated at p = λ₁        (2.19)

where f(p) is the polynomial fraction in p from (2.12). This gives the flow diagram shown in Fig. 2-12.

Fig. 2-12. Jordan Flow Diagram with One Multiple Root

The state equations are then

    ẋ₁ = λ₁x₁ + x₂
    ẋ₂ = λ₁x₂ + x₃
    ......................
    ẋ_{ν−1} = λ₁x_{ν−1} + x_ν
    ẋ_ν = λ₁x_ν + u
    ẋ_{ν+1} = λ_{ν+1}x_{ν+1} + u
    ......................
    ẋₙ = λₙxₙ + u
    y = β₀u + ρ₁₁x₁ + ρ₁₂x₂ + ⋯ + ρ₁ν x_ν + ρ_{ν+1}x_{ν+1} + ⋯ + ρₙxₙ        (2.20)

The matrix differential equations associated with this Jordan form are

    d/dt x = [λ₁ 1 0 ⋯ 0   0 ⋯ 0;  0 λ₁ 1 ⋯ 0   0 ⋯ 0;  ⋯;  0 0 0 ⋯ λ₁   0 ⋯ 0;
              0 0 0 ⋯ 0   λ_{ν+1} ⋯ 0;  ⋯;  0 0 0 ⋯ 0   0 ⋯ λₙ] x + (0; 0; ⋮; 1; 1; ⋮; 1) u        (2.21)

and

    y = (ρ₁₁  ρ₁₂  ⋯  ρ₁ν  ρ_{ν+1}  ⋯  ρₙ) x + β₀u        (2.22)

In the n×n matrix there is a diagonal row of 1's above each λ₁ on the diagonal, and then the other λ's follow on the diagonal.

Example 2.7.
Derive the Jordan form of the differential system

    ÿ + 2ẏ + y = u̇ + u        (2.23)

Equation (2.23) can be written as

    y = (p + 1)u / (p² + 2p + 1)

whose partial fraction expansion gives

    y = 0·u/(p + 1)² + 1·u/(p + 1)

Figure 2-13 is then the flow diagram.

Fig. 2-13

Because the scalor following x₁ is zero, this state is unobservable. The matrix state equations are

    d/dt (x₁; x₂) = [−1 1; 0 −1] (x₁; x₂) + (0; 1) u,    y = (0 1) x
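Partial-fraction residues such as (2.15) and (2.19) can be computed by machine. Applied to Example 2.7 (an illustrative sketch using scipy, not from the text), the residue of the 1/(p+1)² term comes out zero, which is exactly the unobservable scalor of Fig. 2-13:

```python
from scipy.signal import residue

# y = (p + 1)u / (p^2 + 2p + 1), equation (2.23)
r, poles, _ = residue([1.0, 1.0], [1.0, 2.0, 1.0])
print(r)        # [1. 0.]  -> residue of 1/(p+1) is 1, of 1/(p+1)^2 is 0
print(poles)    # [-1. -1.] -> a double pole at p = -1
```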
2.5 TIME-VARYING SYSTEMS

Finding the state equations of a time-varying system is not as easy as for time-invariant systems. However, the procedure is somewhat analogous, and so only one method will be given here. The general time-varying differential equation of order n with one input and one output is shown again for convenience:

    dⁿy/dtⁿ + α₁(t) dⁿ⁻¹y/dtⁿ⁻¹ + ⋯ + αₙ(t)y = β₀(t) dⁿu/dtⁿ + ⋯ + βₙ(t)u        (1.10)

Differentiability of the coefficients a suitable number of times is assumed. Proceeding in a manner somewhat similar to the second canonical form, we shall try defining states as in (2.8). However, an amount γᵢ(t) [to be determined] of the input u(t) enters into all the states:

    ẋ₁ = x₂ + γ₁(t)u
    ẋ₂ = x₃ + γ₂(t)u
    ......................
    ẋₙ₋₁ = xₙ + γₙ₋₁(t)u
    ẋₙ = −αₙ(t)x₁ − αₙ₋₁(t)x₂ − ⋯ − α₁(t)xₙ + γₙ(t)u
    y = x₁ + γ₀(t)u        (2.24)

By differentiating y n times and using the relations for each state, each of the unknown γᵢ(t) can be found. In the general case,

    γ₀(t) = β₀(t),    γ₁(t) = β₁(t) − α₁(t)γ₀(t) − n γ̇₀(t),    ...        (2.25)

each γₖ(t) being found recursively from β₀(t), ..., βₖ(t), the αᵢ(t), and derivatives of the γⱼ(t) already found.

Example 2.8.
Consider the second order equation

    ÿ + α₁(t)ẏ + α₂(t)y = β₀(t)ü + β₁(t)u̇ + β₂(t)u        (2.26)

Then by (2.24),

    y = x₁ + γ₀(t)u        (2.27)

and differentiating,

    ẏ = ẋ₁ + γ̇₀(t)u + γ₀(t)u̇        (2.28)

Substituting the first relation of (2.24) into (2.28) gives

    ẏ = x₂ + [γ₁(t) + γ̇₀(t)]u + γ₀(t)u̇        (2.29)

Differentiating again,

    ÿ = ẋ₂ + [γ̇₁(t) + γ̈₀(t)]u + [γ₁(t) + 2γ̇₀(t)]u̇ + γ₀(t)ü        (2.30)

From (2.24) we have

    ẋ₂ = −α₂(t)x₁ − α₁(t)x₂ + γ₂(t)u        (2.31)

Now substituting (2.27), (2.29) and (2.31) into (2.30) yields

    ÿ = −α₂(t)[y − γ₀u] − α₁(t)[ẏ − (γ₁ + γ̇₀)u − γ₀u̇] + [γ₂ + γ̇₁ + γ̈₀]u + [γ₁ + 2γ̇₀]u̇ + γ₀ü        (2.32)

Equating coefficients in (2.26) and (2.32),

    γ₂ + γ̇₁ + γ̈₀ + α₁γ₁ + α₁γ̇₀ + α₂γ₀ = β₂        (2.33)
    γ₁ + 2γ̇₀ + α₁γ₀ = β₁        (2.34)
    γ₀ = β₀        (2.35)

Substituting (2.35) into (2.34),

    γ₁ = β₁ − α₁β₀ − 2β̇₀        (2.36)

and putting (2.35) and (2.36) into (2.33),

    γ₂ = β₂ − β̇₁ + β̈₀ + (α₁β₀ + 2β̇₀ − β₁)α₁ + α̇₁β₀ − α₂β₀        (2.37)

Using equation (2.24), the matrix state equations become

    d/dt [x₁]   [  0       1      ⋯    0   ] [x₁]   [γ₁(t)]
         [x₂]   [  0       0      ⋯    0   ] [x₂]   [γ₂(t)]
         [⋮ ] = [  ⋮                   ⋮   ] [⋮ ] + [  ⋮  ] u        (2.38)
         [xₙ]   [−αₙ(t) −αₙ₋₁(t)  ⋯ −α₁(t)] [xₙ]   [γₙ(t)]

    y = (1 0 ⋯ 0) x + γ₀(t)u

2.6 GENERAL STATE EQUATIONS

Multiple input-multiple output systems can be put in the same canonical forms as single input-single output systems. Due to complexity of notation, they will not be considered here. The input becomes a vector u(t) and the output a vector y(t). The components are the inputs and outputs, respectively. Inspection of matrix equations (2.6), (2.9), (2.21) and (2.38) indicates a similarity of form. Accordingly, a general form for the state equations of a linear differential system of order n with m inputs and k outputs is

    dx/dt = A(t)x + B(t)u
    y = C(t)x + D(t)u        (2.39)

where x(t) is an n-vector, u(t) is an m-vector, y(t) is a k-vector, A(t) is an n×n matrix, B(t) is an n×m matrix, C(t) is a k×n matrix, and D(t) is a k×m matrix. In a similar manner a general form for discrete time systems is

    x(n+1) = A(n)x(n) + B(n)u(n)
    y(n) = C(n)x(n) + D(n)u(n)        (2.40)

where the dimensions are also similar to the continuous time case.

Specifically, if the system has only one input u and one output y, the differential equations for the system are

    dx/dt = A(t)x + b(t)u
    y = c†(t)x + d(t)u

and similarly for discrete time systems. Here c(t) is taken to be a column vector, and c†(t) denotes the complex conjugate transpose of the column vector. Hence c†(t) is a row vector, and c†(t)x is a scalar. Also, since u, y and d(t) are not boldface, they are scalars.

Since these state equations are matrix equations, to analyze their properties a knowledge of matrix analysis is needed before progressing further.
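The general form (2.39) is exactly what numerical integrators consume. A minimal simulation sketch (illustrative code; the matrices A(t), B(t), C and the input below are arbitrary choices, not from the text):

```python
import numpy as np
from scipy.integrate import solve_ivp

A = lambda t: np.array([[0.0, 1.0], [-2.0 - np.sin(t), -3.0]])
B = lambda t: np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
u = lambda t: np.array([np.cos(t)])

def f(t, x):                      # dx/dt = A(t)x + B(t)u(t)
    return A(t) @ x + B(t) @ u(t)

sol = solve_ivp(f, (0.0, 10.0), [1.0, 0.0], max_step=0.01)
y = C @ sol.y                     # y(t) = C x(t) at the solver time points
print(y[0, -1])
```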
Solved Problems

2.1. Find the matrix state equations in the first canonical form for the linear time-invariant differential equation

    ÿ + 5ẏ + 6y = u̇ + u        (2.41)

with initial conditions y(0) = y₀, ẏ(0) = ẏ₀. Also find the initial conditions on the state variables.

Using p = d/dt, equation (2.41) can be written as p²y + 5py + 6y = pu + u. Dividing by p² and rearranging,

    y = (1/p)(u − 5y) + (1/p²)(u − 6y)

The flow diagram of Fig. 2-14 can be drawn starting from the output at the right.

Fig. 2-14

Next, the outputs of the integrators are labeled the state variables x₁ and x₂ as shown. Now an equation can be formed using the summer on the left:

    ẋ₂ = −6y + u

Similarly, an equation can be formed using the summer on the right:

    ẋ₁ = −5y + x₂ + u

Also, the output equation is y = x₁. Substitution of this back into the previous equations gives

    ẋ₁ = −5x₁ + x₂ + u
    ẋ₂ = −6x₁ + u        (2.42)

The state equations can then be written in matrix notation as

    d/dt (x₁; x₂) = [−5 1; −6 0] (x₁; x₂) + (1; 1) u

with the output equation

    y = (1 0) (x₁; x₂)

The initial conditions on the state variables must be related to y₀ and ẏ₀, the given output initial conditions. The output equation is x₁(t) = y(t), so that x₁(0) = y(0) = y₀. Also, substituting y(t) = x₁(t) into (2.42) and setting t = 0 gives

    ẏ(0) = −5y(0) + x₂(0) + u(0)

Use of the given initial conditions determines

    x₂(0) = ẏ₀ + 5y₀ − u(0)

These relationships for the initial conditions can also be obtained by referring to the flow diagram at time t = 0.

2.2. Find the matrix state equations in the second canonical form for equation (2.41) of Problem 2.1, and the initial conditions on the state variables.

The flow diagram (Fig. 2-14) of the previous problem is turned "backwards" to get the flow diagram of Fig. 2-15.

Fig. 2-15

The outputs of the integrators are labeled x₁ and x₂ as shown. These state variables are different from those in Problem 2.1, but are also denoted x₁ and x₂ to keep the state vector x(t) notation, as is conventional. Then looking at the summers gives the equations

    y = x₁ + x₂        (2.43)
    ẋ₂ = −6x₁ − 5x₂ + u        (2.44)

Furthermore, the input to the left integrator is

    ẋ₁ = x₂        (2.45)

This gives the state equations

    d/dt (x₁; x₂) = [0 1; −6 −5] (x₁; x₂) + (0; 1) u,    y = (1 1) (x₁; x₂)

The initial conditions are found using (2.43),

    y₀ = x₁(0) + x₂(0)        (2.46)

and its derivative ẏ = ẋ₁ + ẋ₂. Use of (2.44) and (2.45) then gives

    ẏ₀ = −6x₁(0) − 4x₂(0) + u(0)        (2.47)

Equations (2.46) and (2.47) can be solved for the initial conditions

    x₁(0) = −2y₀ − ½ẏ₀ + ½u(0)
    x₂(0) = 3y₀ + ½ẏ₀ − ½u(0)

2.3. Find the matrix state equations in Jordan canonical form for equation (2.41) of Problem 2.1, and the initial conditions on the state variables.

The transfer function is

    y = (p + 1)u / (p² + 5p + 6) = (p + 1)u / [(p + 2)(p + 3)]

A partial fraction expansion gives

    y = −u/(p + 2) + 2u/(p + 3)

From this the flow diagram can be drawn:

Fig. 2-16

The state equations can then be written from the equalities at each summer:

    d/dt (x₁; x₂) = [−2 0; 0 −3] (x₁; x₂) + (1; 1) u,    y = (−1 2) (x₁; x₂)

From the output equation and its derivative at time t = 0,

    y₀ = −x₁(0) + 2x₂(0)
    ẏ₀ = −ẋ₁(0) + 2ẋ₂(0)

The state equation is used to eliminate ẋ₁(0) and ẋ₂(0):

    ẏ₀ = 2x₁(0) − 6x₂(0) + u(0)

Solving these equations for x₁(0) and x₂(0) gives

    x₁(0) = u(0) − 3y₀ − ẏ₀
    x₂(0) = ½u(0) − y₀ − ½ẏ₀

2.4. Given the state equations

    d/dt (x₁; x₂) = [0 1; −6 −5] (x₁; x₂) + (0; 1) u,    y = (1 1) (x₁; x₂)

Find the differential equation relating the input to the output.

In operator notation, the state equations are

    px₁ = x₂
    px₂ = −6x₁ − 5x₂ + u
    y = x₁ + x₂

Eliminating x₁ and x₂ then gives

    p²y + 5py + 6y = pu + u
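Problems 2.1-2.3 produce three different state representations of the same system (2.41). As a check (an illustrative sketch using scipy, not from the text), converting each back to a transfer function gives the same (s + 1)/(s² + 5s + 6):

```python
import numpy as np
from scipy.signal import ss2tf

D = np.zeros((1, 1))
realizations = [
    (np.array([[-5.0, 1.0], [-6.0, 0.0]]), np.array([[1.0], [1.0]]),
     np.array([[1.0, 0.0]])),                 # first canonical form (2.1)
    (np.array([[0.0, 1.0], [-6.0, -5.0]]), np.array([[0.0], [1.0]]),
     np.array([[1.0, 1.0]])),                 # second canonical form (2.2)
    (np.array([[-2.0, 0.0], [0.0, -3.0]]), np.array([[1.0], [1.0]]),
     np.array([[-1.0, 2.0]])),                # Jordan canonical form (2.3)
]
for A, b, c in realizations:
    num, den = ss2tf(A, b, c, D)
    print(np.round(num, 6), np.round(den, 6))  # identical each time
```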
2.5. Given the feedback system of Fig. 2-17, find a state space representation of this closed loop system.

Fig. 2-17

The transfer function diagram is almost in flow diagram form already. Using the Jordan canonical form for the plant G(s) and the feedback H(s) separately gives the flow diagram of Fig. 2-18.

Fig. 2-18

Note G(s) in Jordan form is enclosed by the dashed lines. Similarly the part for H(s) was drawn, and then the transfer function diagram is used to connect the parts. From the flow diagram, the equalities at each summer give the closed loop state equations in the form dx/dt = Ax + bu, y = c†x.

2.6. Given the linear, time-invariant, multiple input-multiple output, discrete-time system

    y₁(n+2) + α₁y₁(n+1) + α₂y₁(n) + γ₁y₂(n+1) + γ₂y₂(n) = β₁u₁(n) + δ₁u₂(n)
    y₂(n+1) + γ₃y₂(n) + α₃y₁(n+1) + α₄y₁(n) = β₂u₁(n) + δ₂u₂(n)

Put this in the form

    x(n+1) = Ax(n) + Bu(n)
    y(n) = Cx(n) + Du(n)

where y(n) = (y₁(n); y₂(n)) and u(n) = (u₁(n); u₂(n)).

The first canonical form will be used. Putting the given input-output equations into z operations (analogous to p operations of continuous time systems),

    z²y₁ + α₁zy₁ + α₂y₁ + γ₁zy₂ + γ₂y₂ = β₁u₁ + δ₁u₂
    zy₂ + γ₃y₂ + α₃zy₁ + α₄y₁ = β₂u₁ + δ₂u₂

Dividing by z² and z respectively and solving for y₁ and y₂,

    y₁ = (1/z)[−α₁y₁ − γ₁y₂] + (1/z²)[β₁u₁ + δ₁u₂ − α₂y₁ − γ₂y₂]
    y₂ = −α₃y₁ + (1/z)[β₂u₁ + δ₂u₂ − γ₃y₂ − α₄y₁]

Starting from the right, the flow diagram can be drawn as shown in Fig. 2-19.

Fig. 2-19

Any more than three delayors with arbitrary initial conditions are not needed, because a fourth such delayor would result in an unobservable or uncontrollable state. From this diagram the state equations are found to be

    x(n+1) = [γ₁α₃ − α₁  1  −γ₁;  γ₂α₃ − α₂  0  −γ₂;  γ₃α₃ − α₄  0  −γ₃] x(n) + [0 0; β₁ δ₁; β₂ δ₂] u(n)

    y(n) = [1 0 0; −α₃ 0 1] x(n)

2.7. Write the matrix state equations for a general time-varying second order discrete time equation, i.e. find matrices A(n), B(n), C(n), D(n) such that

    x(n+1) = A(n)x(n) + B(n)u(n)
    y(n) = C(n)x(n) + D(n)u(n)        (2.48)

given the discrete time equation

    y(n+2) + α₁(n)y(n+1) + α₂(n)y(n) = β₀(n)u(n+2) + β₁(n)u(n+1) + β₂(n)u(n)        (2.49)

Analogously with the continuous time equations (2.24), try

    x₁(n+1) = x₂(n) + γ₁(n)u(n)        (2.50)
    x₂(n+1) = −α₂(n)x₁(n) − α₁(n)x₂(n) + γ₂(n)u(n)        (2.51)
    y(n) = x₁(n) + γ₀(n)u(n)        (2.52)

Stepping (2.52) up one and substituting (2.50) gives

    y(n+1) = x₂(n) + γ₁(n)u(n) + γ₀(n+1)u(n+1)        (2.53)

Stepping (2.53) up one and substituting (2.51) yields

    y(n+2) = −α₂(n)x₁(n) − α₁(n)x₂(n) + γ₂(n)u(n) + γ₁(n+1)u(n+1) + γ₀(n+2)u(n+2)        (2.54)

Substituting (2.52), (2.53), (2.54) into (2.49) and equating coefficients of u(n), u(n+1) and u(n+2):

    γ₀(n) = β₀(n−2)
    γ₁(n) = β₁(n−1) − α₁(n−1)β₀(n−2)
    γ₂(n) = β₂(n) − α₁(n)β₁(n−1) + [α₁(n)α₁(n−1) − α₂(n)]β₀(n−2)

In matrix form this is

    x(n+1) = [0 1; −α₂(n) −α₁(n)] x(n) + (γ₁(n); γ₂(n)) u(n)
    y(n) = (1 0) x(n) + γ₀(n) u(n)
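The construction in Problem 2.6 can be verified by simulating both descriptions side by side (an illustrative sketch; the coefficient values are arbitrary test numbers, not from the text):

```python
import numpy as np

a1, a2, a3, a4 = 0.3, -0.2, 0.5, 0.1      # alpha_1 ... alpha_4
g1, g2, g3 = 0.4, -0.3, 0.2               # gamma_1 ... gamma_3
b1, b2, d1, d2 = 1.0, 0.7, -0.5, 0.6      # beta_1, beta_2, delta_1, delta_2

A = np.array([[g1*a3 - a1, 1.0, -g1],
              [g2*a3 - a2, 0.0, -g2],
              [g3*a3 - a4, 0.0, -g3]])
B = np.array([[0.0, 0.0], [b1, d1], [b2, d2]])
C = np.array([[1.0, 0.0, 0.0], [-a3, 0.0, 1.0]])

N = 20
u = np.random.default_rng(0).standard_normal((N, 2))

x = np.zeros(3)                           # zero initial state
ys = np.zeros((N, 2))
for n in range(N):
    ys[n] = C @ x
    x = A @ x + B @ u[n]

y1 = np.zeros(N + 2); y2 = np.zeros(N + 1)    # zero initial conditions
for n in range(N):
    y2[n+1] = -g3*y2[n] - a3*y1[n+1] - a4*y1[n] + b2*u[n,0] + d2*u[n,1]
    y1[n+2] = (-a1*y1[n+1] - a2*y1[n] - g1*y2[n+1] - g2*y2[n]
               + b1*u[n,0] + d1*u[n,1])

print(np.abs(ys - np.column_stack((y1[:N], y2[:N]))).max())   # ~ 0
```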
2] METHODS FOR OBTAINING THE STATE EQUATIONS 33 ‘To do this, differentiate (2.57) and substitute for £, and #, from (2.56), i Sra yamades + Ort dee (2.58) Differentiating again and substituting for 2, and x, a8 before, Gs Berta — Sas — fava ~ 2ante — ars + avagye + afr + (2h — ary aavat Yoder (2.59) Substituting (2.57), (2.58) and (2.59) into (2.55) and equating coefficients of #, and zz gives the equations Tr- ah + @—an = 0 Fa + (os Zag) + (0 asm + ag— fava + 241 ears = 0 In this ease, 74(0) may be taken to be zero, and any non-trivial 72(0) satisfying Yo + (ay 2as)in + (o§ — axa + ay iar = 0 (2.60) will give the desired canonical form. ‘This problem illustrates the utility of the given time-varying form (§ Tt may always be found by differentiating known functions of time. Other forms usually involve the solution of equations such as (2.60), which may he quite difficult, or require differentiation of the a(?). Addition of an input contributes even more difficulty. However, in a later chapter forms analogous to the first canonical form will be given, Supplementary Problems 29. Given the discrete time equation y(n +2) + Sy(n+1) +2u(n) = u(n+1) + Buln), find the matrix state equations in (j) the first canonical form, (ii) the second eanonical form, (ii) the Jordan canon- eal form, 210, Find a matrix state equation for the multiple input—multiple output system jy + ays + ants + Yale + Yate = Batty + 3y0y ja + yatta + Yale + oaths + ea, = Bam + Bata 211, Write the matrix state equations for the system of Fi - 2-20, using Fig, 2-20 directly. os La o La +, oo L* “? Pe a P ? ¥ Fig. 2-20 212, Consider the plant dynamics and feedback com- aa BOF a BT pensation illustrated in Fig. 2-21. Assuming the initial conditions are specified in the form (0), 40), ¥(0), (0), w(0) and HO}, write a state space equation for the plant plus compen- sation in the form X= Ax-+Bu and show the relation between the specified initial conditions and the components of x(0). 84 213, 214, 215, 216. 2at. 218, METHODS FOR OBTAINING THE STATE EQUATIONS (CHAP. 2 ‘The procedure of turning “backwards” the flow diagram of the first canonical form to obtain the second canonical form was never Justified. Do this by verifying that equations (2.8) satisfy the original input-output relationship (2.3). Why can’t the time-varying flow diagram corresponding to equations (2.24) be turned backwards to get another canonical form? Given the linear Fauld + axlty = Bolt) + Ali + Balthu ions u(0) = vo, H(0) = ie (i) Draw the flow diagram and indicate the state variables, ‘with the initial condi i) Indicate on the flow diagram the values of the scalors at all times, (iii) Write down the initial conditions of the state variables Verity equation (2.37) using equation (2.25). ‘The simplified equations of a d-c motor are a Py Motor Armature: i+ it = v~ x,t oe, pt Inertia Loads x’ = Es wt Obtain a matrix state equation relating the input voltage V to the output shaft angle # using a ate velar ‘ a ara) ‘he equations deserting the time behavior of the neutrons ina nuclear reactor are Prompt Neutrons: Th = Gl) + 3 Ci Delayed Neutrons: where = (and ol is the time-varying reactivity, perhaps induced hy control rod motion Write the matrix state equations Bin — NO; Assume the simplified equations of motion for a missile are given by Lateral Translation: 2+ Kys-+ Kya + Kyp = 0 Rotation: $+ Kut Kp = 0 Angle of Attack: a = 9— Keb Rigid Body Bending Moment: M(l) = Kalle + Kyi? 
where <= lateral translation, @ = attitude, a = angle of attack, @ = engine deflection, M()) = bending moment at statfon 1. | Obtain a state space equation in the form ax . = axt Bu, y = Cx +Du a in tor Xx \ wee() wee / @ x= Seas CHAP. 2 METHODS FOR OBTAINING THE STATE EQUATIONS 35 219, Consider the time-varying electrical network of Fig. 2-22. The voltages across the inductors and the eurrent in the capacitors ean be expressed by the relations _a@ dig | dy 4 = Fad = G+ age =A dey dey ana = Fe = agit age _4 = nifty «te ae = fay = ogre? It is more convenient to take as states the inductor fluxes r= S wrenat + nity a= fi temendae + alto) a = f a Obtain a state space equation for the network in the form and the capacitor charge it + ant) ax = Aux + Bow x) = Cx + Diu where the state vector x, input vector u, and output vector y are 6 «-(8) a@) o-(2) Ps, 7 -G) -@ -() 220, Given the quadratic Hamiltonian H = 4q™Vq+4p"p where q is a vector of » generalized co- ordinates, p is a vector of corresponding conjugate momenta, V and T are "Xn matrices corre: sponding to the kinetic and potential energy, and the superscript T on a vector denotes transpos ‘Write a sot of matrix state equations to describe the state, Answers to Supplementary Problems so Son +(2)aims vod = a Ont ® 2)am + (8a v0) = @ 9x09 oi ai) + (2a vod = C2 ae 36 240, aan. 212, 243, 214 METHODS FOR OBTAINING THE STATE EQUATIONS [cHAP. 2 Similar to first canonical form am 1-1 0 a ° zs = [72 O-nm 0 Py ee (os aje + (0c) non 8, \n / _ fo 0 oF u 0 0 1 0/* or similar to second canonieal form 0 1 0 0 0 0 (nu -n\ (a a) \ o 0 OL 4 nn on 1 o 00 7 oo 1 0)* Je 1 ais We) Xe AS) * NG) y= G00(% conditions result immediately and 100 1 1 0 ~100 ~2, ° ‘The time-varying flow diagram cannot be turned backwards to obtain another eanonical form be- cause the order of multiplication by a time-varying coefficient and an integration cannot he inter- changed. @ CHAP. 2] METHODS FOR OBTAINING THE STATE EQUATIONS 37 (i) The values of yo, 7; and ve are given by equations (2.88), (2.84) and (2.35). Gi) 2400) = Yo ~ yol0) (0) 29(0) = Ho — CFel0) + r4(0)) ul0) — yol0) 4(0) afi RL) 0 ~K,L-* i L i 2. ${ 6 } = o 0 4 o)+{o jv 6 = @10f o alae, Kd-1 9 —Ba-1 / \aalat, ° ofa, . )-A-A dy ne /n afa\ | ao! o\fe aut. 2(%) = : G, os ot o 0 ae ae [0 Hike ~U+K) 0 o 0 on o KK KO. 0 -K 1 0 ° (0 mk #8) = (e) 0 -K, “i 1) B o ° HK, 0 >= (x) ° ° ‘i Kk, Kiky 1 Kk, made (tie ele weet de 7 o-K 0 “Ky 218. @) ® ° 1) 220, ‘The equations of motion are + pa)?, the matrix state equations are ee Using the state vector x = (qy -. Chapter 3 Elementary Matrix Theory 81 INTRODUCTION This and the following chapter present the minimum amount of matrix theory needed to comprehend the material in the rest of the book, It is recommended that even those well versed in matrix theory glance over the text of these next two chapters, if for no other reason than to become familiar with the notation and philosophy. For those not so well versed, the presentation is terse and oriented towards later use. For a more comprehensive presentation of matrix theory, we suggest study of textbooks solely concerned with the subject. 3.2 BASIC DEFINITIONS Definition 3.1: A matriz, denoted by a capital boldfaced letter, such as A or ®, or by the notation (a) or (a), is a rectangular array of elements that are members of a field or ring, such as real numbers, complex numbers, polynomials, functions, ete. The kind of element will usually be clear from the context. 
o2 3 an ay a) (2b ais) = (SE) = oe Definition 32: A row of a matrix is the set of all elements through which one horizontal line can be drawn. Example 3.1. An example of a matrix is A Definition 3.3: A column of a matrix is the set of all elements through which one vertical line can be drawn. Example 3.2. ‘The rows of the matrix of Example 3.1 are (0 2 j) and (r 2% sin#). The columns of this matrix are 0) /2\ af 5 ) (3) ana . () (ae) 9° (sine Definition 34: A square matriz has the same number of rows and columns. Definition 3.5: ‘The order of a matrix is mn, read m by n, if it has m rows and n columns. Definition 3.6: A scalar, denoted by a letter that is not in boldface type, is a 1x 1 matrix. In other words, it is one element. When part of a matrix A, the notation ay, means the particular element in the ith row and jth column. Definition 3.7: A vector, denoted by a lower case boldfaced letter, such as a, or with its contents displayed in braces, such as {a:), is a matrix with only one row or only one column, Usually a denotes a column vector, and a” a row vector. 38 CHAP. 3] ELEMENTARY MATRIX THEORY 39 Definition 3.8: ‘The diagonal of a square matrix is the set of all elements ay of the matrix in which i In other words, it is the set of all elements of a square matrix through which can pass a diagonal line drawn from the upper left hand corner to the lower right hand corner. Example 33. Given the matrix Sty ae B= 0) = | ba dby boy bx ba boy > jonal is the set of elements through which the solid line is drawn, biz, bao, bys, and not those sets determined by the dashed lines, Definition 3.9: The trace of square matrix A, denoted trA, is the sum of all elements on the diagonal of A. tra 83 BASIC OPERATIONS Definition 3.10: Two matrices are equal if their corresponding elements are equal. A=B means ay = by for all i and j. The matrices must be of the same order. Definition 3.11: Matrix addition is performed by adding corresponding elements. A+B= C means a; +by = cy, for all i and j. Matrix subtraction is analogously defined, The matrices must be of the same order. Definition 3.12: Matrix differentiation or integration means differentiate or integrate each element, since differentiation and integration are merely limiting opera- tions on sums, which have been defined by Definition 3.11, Addition of matrices is associative and commutative, ie. A+(B+C)=(A+B)+C and A+B=B-+A. In this and all the foregoing respects, the operations on matrices have been natural extensions of the corresponding operations on scalars and vectors. However, recall there is no multiplication of two vectors, and the closest approximations are the dot (scalar) product and the cross product. Matrix multiplication is an extension of the dot product a+b = dibi+dsb2+ +++ +aqbs, a scalar. Definition 3.13: To perform matrix multiplication, the element cy of the product matrix C is found by taking the dot product of the ith row of the left matrix A and the jth column of the right matrix B, where C= AB, so that ey = > and. Note that this definition requires the left matrix (A) to have the same number of columns ‘as the right matrix (B) has rows. In this case the matrices A and B are said to be compatible. It is undefined in other cases, excepting when one of the matrices is 11, i.e. a sealar. In this case each of the elements is multiplied by the scalar, e.g. aB = {aby} for all i and j. Example 34. ‘The vector equation y = Ax, when y and x are 2X1 matrices, ie, column vectors, is, Ge) = Gi S)@) 40 ELEMENTARY MATRIX THEORY (cHaP. 
3 where u= Baus, for i= 1end2 Butsuppore x= By sothat a, = 3 hye, Then” n= Zaa(Z hs) = Zens so that y = A(Bz) = Cz, where AB = C. Example 8.4 can be extended to show (AB)C = A(BC), i.e. matrix multiplication is asso- ciative. But, in general, matrix multiplication is not commutative, AB~BA. Also, there is no matrix division. Example 35. To show AB BA, consider == (0909-69 m= G90) and note DC. Furthermore, to show there is no division, consider we Gas) = GQ - 6 Since BA =BF =, “division” by B would imply A=, which is certainly not true. Suppose we have the vector equation Ax = Bx, where A and B are nxn matrices. It can be concluded that A=B only if x is an arbitrary n-vector. For, if x is arbitrary, we may choose x successively as e:,e:,...,e, and find that the column vectors a= bi, a2= bs, ...,an= by. Here e; are unit vectors, defined after Definition 3.17, page 41. To partition a matrix, draw a vertical and/or horizontal line between two rows or columns and consider the subset of elements formed as individual matrices, called submatrices. Definition 8.1 As long as the submatrices are compatible, ie. have the correct order so that addition and multiplication are possible, the submatrices can be treated as elements in the basic operations. Example 26. ASX matrix A can be partitioned into a 2X2 matrix Ay, a 1X2 matrix Ayy, a 2X1 matrix Aye, and a 1X1 matrix Ap. eu au jew \ en A= [ an Qn | as | = } ) Rul An) A similacly partitioned 8% matrix B adds as Au +Bu | An + Ba APB OS (RoBi T An? Ba _ AyBy + AyBa | ArnBis + ArBro ABS Radi AaBa | AaB? ABs Facility with partitioned matrix operations will often save time and give insight. and multiplies as CHAP. 3] ELEMENTARY MATRIX THEORY a1 34 SPECIAL MATRICES Definition 3.15: The zero matrix, denoted 0, is a matrix whose elements are all zeros. Definition 3.16: A diagonal matrix is a square matrix whose off-diagonal elements are all zeros. Definition 3.17: The unit matrix, denoted I, is a diagonal matrix whose diagonal elements are all ones. Sometimes the order of I is indicated by a subscript, e.g. I, is ann Xn matrix. Unit vectors, denoted e, have a one as the ith element and all other elements zero, so that I (erler| -.- [en)- Note IA= AI=A, where A is any compatible matrix. Definition 3.18: An upper triangular matrix has all zeros below the diagonal, and a lower triangular matrix has all zeros above the diagonal. ‘The diagonal elements need not be zero. ‘An upper triangular matrix added to or multiplied by an upper triangular matrix results in an upper triangular matrix, and similarly for lower triangular matrices. Definition 3.19: A transpose matrix, denoted A‘, is the matrix resulting from an inter- change of rows and columns of a given matrix A. If A= (a), then AT = {a3}, so that the element in the ith row and jth column of A becomes the element in the jth row and ith column of A’. Definition 320: The complex conjugate transpose matrix, denoted At, is the matrix whose elements are complex conjugates of the elements of AT. Note (AB)’ = BTA? and (AB)t = BtAt. Definition 3.21: A matrix A is symmetric if A= A". Definition 3.22: A matrix A is Hermitian if A= At. Definition 3.23: A matrix A is normal if AtA = AA. Definition 3.24: A matrix A is skew-symmetric if A Note that for the different cases: Hermitian A = At, skew Hermitian A= —At, real symmetric AT=A, real skew-symmetric A=—A", unitary AtA=I, diagonal D, or orthogonal AtA =D, the matrix is normal. 
3.5 DETERMINANTS AND INVERSE MATRICES Definition 3.25: ‘The determinant of an nxn (square) matrix {a} is the sum of the signed products of all possible combinations of n elements, where each element is taken from a different row and column. det A = Y (-1)Paip,@2p,°- “tong (8.1) Here pi,Po,..-,Pe is a permutation of 1,2,...,, and the sum is taken over all possible permutations, A permutation is a rearrangement of 1,2, .. ..ninto some other order, such as 2,n, ...,1, that is obtained by successive transpositions. A transposition is the interchange of places of two (and only two) numbers in the list 1,2,...,n. ‘The exponent p of —1 in (3.1) is the number of transpositions it takes to go from the natural order of 1,2, ...»m to DscDs, ...jde. There are n! possible permutations of n numbers, so each determinant is the sum of n! products, 42 ELEMENTARY MATRIX THEORY [CHAP. 8 Example 37. To find the determinant of a 3X 8 matrix, all possible permutations of 1,2,3 must be found, Per forming one transposition at a time, the following table ean be formed. P| Py Py Ps of4 a3 a)aa_a 2)% 32 3/243 afaa 2 5113 2 This table is not unique in that for p= 1, possible entries are also 1,3,2 and 2,1,8, However, these entries can result only from an odd p, 20 that the sign of each product in a determinant is unique. Since there are 3!=6 terms, all possible permutations are given in the table. Notice at each step only two numbers are interchanged, Using the table and (3.1) gives det A = (—Waystaat + (I ays0aatas + (-APayatantay + (DMaigtaidas + (AQ ataye + (1 OrsOastae Theorem 3.1: Let A be an nXn matrix. ‘Then det (A") = det (A). Proof is given in Problem 3.3, page 59. Theorem 3.2: Given two nxn matrices A and B. Then det(AB) = (det A)(det B). Proof of this theorem is most easily given using exterior products, defined in Section 8.18, page 56. The proof is presented in Problem 3.15, page 65. Definition 3.26: Elementary row (or column) operations are: (i) Interchange of two rows (or columns). (ii) Multiplication of a row (or column) by a salar. (iii) Adding a scalar times a row (or column) to another row (column). To perform an elementary row (column) operation on an nx m matrix A, calculate the product EA (AE for columns), where E is an X n(m Xm) matrix found by performing the elementary operation on the n X n(m xm) unit matrix I. The matrix E is called an elemen- tary matrix. Example 3, Consider the 22 matrix A= {aj}. To interchange two rows, interchange the two rows in I to obtain E, = (9 (an an) _ (en en) ra = G olen am) ~ ay an) ‘To add 6 times the second column to the first, multiply _ fen an\(t 0 oust bo ee (in et) = (hiss Using ‘Theorem 3.2 on the product AE or EA, it can be found that (i) interchange of two rows o columns changes the sign of a determinant, i. det (AE) =—det A, (ji) multiplication of a row or column by a scalar « multiplies the determinant by a, i.e. det (AE) = « det A, and (iii) adding a sealar times a row to another row does not change the value of the de- terminant, ie, det (AE) = det A. CHAP. 3] ELEMENTARY MATRIX THEORY 43 Taking the value of a in (ii) to be zero, it can be seen that a matrix containing a row or column of zeros has a zero determinant. Furthermore, if two rows or columns are identical or multiples of one another, then use of (iii) will give a zero row or column, so that the determinant is zero. Each elementary matrix E always has an inverse E-', found by undoing the row or column operation of I. Of course an exception is «=0 in (ii). =(03). 
mime ea(2 Jawa (2%), Definition 3.27: The determinant of the matrix formed by deleting the ith row and the jth column of the matrix A is the minor of the element ay, denoted det My. ‘The cofactor e; = (—1)'*7 det My. Example 3 The inverse of B= (7 9) in Be 1 0, Example 3.10, ‘The minor of ag, of a 3X8 matrix A is det Mrz = ais0u,~Aists,. ‘The cofactor cas det. (1)! det Map = Theorem 3.3: ‘The Laplace expansion of the determinant of an » Xn matrix A is detA = 7 acy for any column j or detA = 2 yey for any row i. Proof of this theorem is presented as part of the proof of Theorem 3.21, page 57. Example 3.1, ‘The Laplace expansion of a 3X 3 matrix A about the second column is et = ayia + drat + Oren eal taytan — Maatas) + M5241 ~ 85204) — danl4yyA22 ~ Ayatr) Corollary 3.4: The determinant of a triangular »xn matrix equals the product of the diagonal elements. Proof is by induction. The corollary is obviously true for n= 1. For arbitrary n, the Laplace expansion about the nth row (or column) of an » x» upper (or lower) triangular matrix gives det A= dem. By assumption, ¢m = adn’ * “dy proving the corollary. Explanation of the induction method: First the hypothesis is shown true for m= mo, where no is a fixed number. (no=1 in the foregoing proof.) ‘Then assume the hypothesis is true for an arbitrary n and show it is true for n +1. Let =o, for which it is known true, so that it is true for wo+1. Then let n= 1o+1, so that it is true for mo +2, ete. In this manner the hypothesis is shown true for all > 7. Corollary 3.5: The determinant, of a diagonal matrix equals the product of the diagonal elements. This is true because a diagonal matrix is also a triangular matrix. ‘The Laplace expansion for a matrix whose kth row equals the ith row is detA = 2 tues for ki, and for a matrix whose kth column equals the jth column the Laplace expansion is detA = Svaxcy. But these determinants are zero since A has two identical rows or columns, = 44 ELEMENTARY MATRIX THEORY (onar. 3 Definition 3.28: The Kronecker delta 8;=1 if i=j and 3)=0 if ix}. Using the Kronecker notation, Saney = & aun = by deta. (82) Definition 3.29: The adjugate matrix of A is adj A= (cy)*, the transpose of the matrix of cofactors of A. The adjugate is sometimes called “adjoint”, but this term is saved for Definition 5.2. ‘Then (3.2) can be written in matrix notation as AadjA = IdetA (3.8) Definition 3.30: An m xn matrix B is a left inverse of the n xm matrix Aif BA=In, and an m Xn matrix C is a right inverse if AC Ais said to be nonsingular if it has both a left and a right inverse. If A has both a left inverse B and a right inverse C, C=1IC = BAC=BI=B. Since BA=I, and AC=I,, and if C=B, then A must be square. Furthermore suppose a non- singular matrix A has two inverses G and H. Then G=GI=GAH=IH=H so that a nonsingular matrix A must be square and have a unique inverse denoted A~* such that A“A=AA'=I. Finally, use of Theorem 3.2 gives det A det A~! = det = 1, so that if det A = 0, A can have no inverse. Theorem 3.6: Cramer's rule. Given an nxn (square) matrix A such that det A +0, Then _ deta. ‘The proof follows immediately from equation (3.3). ats adj A Example 312. The inverse of a 2X2 matrix A is ran ( Onn “) te — Grats en a, Another and usually faster means of obtaining the inverse of nonsingular matrix A is to use elementary row operations to reduce A in the partitioned matrix A|I to the unit matrix, To reduce A to a unit matrix, interchange rows until au+0. Denote the inter- change by E:. 
Divide the first row by ai, denoting this row operation by Es. Then EXE,A has a one in the upper left hand corner. Multiply the first row by au and subtract it from the ith row for i=2,8,...,n, denoting this operation by Es. The first column of E.E:E,A is then the unit vector e1. Next, interchange rows E:E:E,A until the element in the second row and column is nonzero. Then divide the second row by this element and subtract from all other rows until the unit vector e: is obtained. Continue in this manner until Em:--E,A= 1. Then En: +E; =A, and operating on I by the same row operations will produce A~!, Furthermore, det A~ = det E; det E,-+-detE, from Theorem 3.2, ample 8 ; Totten of (2 2, dnt oni matin to ain Oohe 6 1ilod CHAP. 3} ELEMENTARY MATRIX THEORY 45 Interchange rows (det E (o ili a) 4 pap enw oalio. Ie turns out the first column is already ¢,, and all that is necessary to reduce this matrix to Tis to add the second row to the first (det, = 1). @ ONE 7) fo rlio Divide the first row by —1 (det, = 1): ‘The matrix to the right of the partition line is the inverse of the original matrix, which has a determinant equal to {(—1)(—1)(1))-? = 1, 36 VECTOR SPACES Not all matrix equations have solutions, and some have more than one. Example 8.14, (@) ‘The matrix equation 4 0 ()® = G) has no solution because no ¢ exists that satisfies the two equations written oat as g=0 a= ()@ = () To find the necessary and sufficient conditions for existence and uniqueness of solutions of matrix equations, it is necessary to extend some geometrical ideas. ‘These ideas are apparent for vectors of 2 and 8 dimensions, and can be extended to include vectors having an arbitrary number of elements. () The matrix equation is satisfied for any ¢. Consider the vector (2 8). Since the elements are real, they can be represented as points ina plane. Let (é, g) = (2 3). Then this vector can be represented as the point in the &y plane shown in Fig. 8-1. "2 3) — ret Fig.8-1. Representation of (2 8) in the Plane 46 ELEMENTARY MATRIX THEORY (CHAP. 3 If the real vector were (1 2 8), it could be represented as a point in (é, &, &) space by drawing the é, axis out of the page. Higher dimensions, such as (1 2 8 4), are harder to draw but can be imagined. In fact some vectors have an infinite number of elements, This can be included in the discussion, as can the case where the elements of the vector are other than real numbers. Definition 3.31: Let U, be the set of all vectors with n components. Let a: and az be vectors having n components, i.e. a: and a are in Us. This is denoted a: © Us, ar€ Us. Given arbitrary scalars 8; and fs, it is seen that (81a; + Beas) © Ur, ile. an arbitrary linear combination of a: and az is in Us. ‘% is an infinite line and Us is an infinite plane. To represent diagrammatically these and, in general, ‘U, for any m, one uses the area CD e enclosed by a closed curve. Let U be a set of . vectors in Us. This can be represented as shown in Fig. 3-2. A Set of Vectors 1 in Uy Definition 3.32: A set of vectors U in Us is closed under addition if, given any a; €U and any aU, then (a: +a:)€U. Example 3.15. (a) Given 1 is the set of all 3-vectors whose elements are integers. 
‘This subset of Us is closed under addi tion because the sum of any two 3-vectors whose elements are integers is also a S-vector whose elements are integers, (®) Given 1U is the set of all 2-vectors whose first because the sum of two vectors in U must give jent is unity, This set is not closed under addition ‘vector whose first element is two. Definition 3.88: A set of vectors U in Us is closed wnder scalar multiplication if, given any vector aGU and any arbitrary scalar f, then fa€U. The scalar 8 can be a real or complex number. Example 3.16. sn U is the set of all 3-vectors whose second and third elements are zero, Any scalar times any vector in U gives another vector in U, s0 U is closed under sealar multiplication. Definition 3.34: A set of n-vectors Vin Us that contains at least one vector is called a vector space if it is (1) closed under addition and (2) closed under scalar multi- plication. If aU, where U is a vector space, then 0a=0E1U because U is closed under scalar multiplication. Hence the zero vector is in every vector space. Given ai,a:,...,a», then the set of all linear combinations of the a is a vector space (linear manifold). . us { aah for all B, 3.7 BASES Definition 3.35: A vector space U in Us is spanned by the vectors a1, a2...» (Fe need not equal n) if (1) a €U,aEU,...,a€U and (2) every vector in U is a linear combination of the a1, az, .. .» ax. CHAP. 3] ELEMENTARY MATRIX THEORY ar Example 317. Given a vector space U in Us to be the set of all S-vectors whose third element is zero, ‘Then (1 2 0), (1 10) and (0 1 0) span U because any veetor in U can be represented as (= 0), and (a 80) = (allt 2 0) + Be—AIL 1 0) + 00 1 0} Definition 3.36: Vectors a ao,...,a.€ Us, are linearly dependent if there exist scalars Ai, Bs, «+» fx not all zero such that iar+fone+--- + Arak = Example 3.18. ‘The three vectors of Example 8.17 are linearly dependent because HM 2 0) — 14 1 0)- 1010) = 000) Note that any set of vectors that contains the zero vector is a linearly dependent set. Definition 337: A set of vectors are linearly independent if they are not linearly dependent. Theorem 3. If and only if the column vectors a:,a:,...,a of a square matrix A are linearly dependent, then det A = Proof: If the column vectors of A are linearly dependent, from Definition 8.86 for some PupByy «5B, NOt all zero we get 0= f,a,+ 8,0, +---+/,a,.. Denote one nonzero f as f, Then on ° Since a matrix with a zero column has a zero determinant, use of the product rule of ‘Theorem 8.2 gives detAdetE=0. Because detE = 6,0, then detA=0. Next, suppose the column vectors of A are linearly independent. Then so are the column vectors of AE, for any elementary column operation E. Proceeding stepwise as on p. 44, but now adjoining I to the bottom of A, we find E:, ...,Ensuch that AE; ---E, =I. Each step can be carried out since the column under consideration is not a linear combination of the preceding columns. Hence (det A)(det E:) -- + (detE,) = 1, so that det A #0. Using this theorem it is possible to determine if ai,az...,a, + Amex 0 if a,j, ...,a% are linearly independent. Equation (8.8) says 4” is multilinear, and (8.7) says ¢° is alternating. Example 344, The case p=0 and p=1 are degenerate, in that AU is the space of all complex numbers and AM =, the original vector space of m-vectors having dimension n. The first nontrivial example is then ‘A®U. 
‘Then equations (8.6) and (9.7) become the bilinearity property (oat Ba) Am = ala; n ay) + Blas as) (3) and also Aa, = ana (3.9) Interchanging the veetors in (2.8) according to (8.9) and multiplying by —1 gives a, A (aay + Aa) = ala ra) + Alay Aa) (10) By (8.10) we see that aa, is linear in either a, or aj (bilinear) but not both because in general (aa) A (aa) % alana): Note that setting a; =a; in (3.9) gives a)Aa;=0, Furthermore if and only if a, is a linear com- bination of a, a, Aa Let by, bis + .yBy bea basis of UL. Then m ub, and ay = 3 iby 0 that 3 Zan avo ana = (3c) (3 mm) CHAP. 3] ELEMENTARY MATRIX THEORY 87 Ab, for k<1, this sum can be rearranged to Since bab, =0 and Ad, aaa = 31S can-andbeaby (eat) Because a, 0.4, is an arbitrary vector in \*U, and (811) Sa a linear combination of ty by then the vectors Beak for 12K <1 m form a basis for AU, Summing over all posible k and'T satisfying this rela- tn shove atte dimen of We nte—n12 = (*). Similar to the case \*U, if any a; is a linear combination of the other a’s, the exterior product is zero and otherwise not. Furthermore the exterior products bby +++ be for 1=i Zanen = (eu en. endas so that aza--- Aan = (Ci Cx ... Cai)’ and Theorem 3.21 is nothing more than a state- ment of the Laplace expansion of the determinant about the first column. The use of column interchanges generalizes the proof of the Laplace expansion about any column, and use of det A = det A” provides the proof of expansion about any row. Solved Problems 3.1, Multiply the following matrices, where a:,a;,b: and b, are column veetors with n elements. ® aa) aw Hom om (% ww ® Geo Go casing PE) w) le (3 Q) Using the rule for multiplication from Definition 8.13, and realizing that multiplication of a ken matrix times an m ym matrix results in a kX m matrix, we have RHR) © G3) w (252 99) 2 (22) ao wshrea 0 oon () axs+2xe = an wo CHAP. 3] ELEMENTARY MATRIX THEORY 59 3.2, Find the determinant of A (a) by direct computation, (b) by using elementary row 33, and column operations, and (c) by Laplace’s expansion, where 1002 a= {1206 13818 ooo08 (a) To facilitate direct computation, form the table Py Po Pos Pa ‘There are det 24 terms in this table. Then HL+B+1+2—1+2+8+04 160+8+0— 1606802 $1+6+8-0~1+6+1+040+6+1+0— 06641404 0000142 0-0-8604 O+1-8+0—O+t+1+2 + 0018-2 — 0-1 8-0 0124840 0+Be1 42+ 0664140 — 0464340 42-0840 2061104 2626140 — 2620160 + BeLe1 02619860 =4 (®) Using elementary row and column operations, subtract 1,8 and 4 times the bottom row from ‘the first, second and third rows respectively. This reduces A to a triangular matrix, whose determinant is the product of its diagonal elements 1+2+1+2= 4. (©) Using the Laplace expansion about the bottom row, 100 adet(1 20 algae deta, Prove that det(A’) = det A, where A is an nxn matrix. Sarena, Since a determinant i all possible combinations of products of elements where only one element is taken from each row and column, the individual torms in the sum are the same, Therefore the only question i the sign of each product. Consider a typical term from a 8X3 matrix: ayidy2A23, ile. P= 3, Py =i, pg = 2 Permute the elements through ajzts0z5 0 dz0zyAs1, 80 that the Tow numbers are in natural 1,2,3 order instead of the column numbers, From this, it ean be concluded in general that it takes exactly the same number of permutations to undo py, 2,-+-»Py t0 1,2, ...5% as it does to permute 1,2,...,n to obtain py, Pys-.-1Py- Therefore p must be the same for each product term in the series, and so the determinants are equal. 
Ia {jb then AT (ay) so that, det(A") 60 ELEMENTARY MATRIX THEORY [CHAP. 3 34, A Vandermonde matrix V has the form 35. a Prove V is nonsingular if and only if #70, for inj. This will be proven by showing det V = (2— 1102 — 62)(O5— 89)" * (y= Bn VO ~ By-2)°°*y~ 81) 0 rsicien For »=2, det V=,—;, which agrees with the hypothesis. By induction if the hypothesis ean be shown true for m given it is true for n—1, then the hypothesis holds for m= 2. Note each term of det V will contain one and only one element from the nth column, so that detV = ryt nee ton + Ye If 6,= 6, for i=1,2,.-.,.n—1, then detV=0 deca Gin6zy++-€a-1 ae the Toots of the polynomial, and ot rte to YO = Yun But 74-5 is the cofactor of #3” by the Laplace expansi ot act | as By assumption that the hypothesis holds for n—1, wer = TL oe Combining these relations gives “— detV = (0, ~ 6: On— 62) Show det($ G) = detAdetC, where A and C are nxn and mxm matrices respectively. Bither detA=0 or detAx0. If detA=O, then the column vectors of A are linearly Sependent, Hence the column vectors ot (9) ae linearly dependent, so that ea(S 2) =0 and the hypothesis holds. If detA0, then Av! exists and AB ‘A O/T 0)/1 AB oc o ilo clo 1 ‘The rightmost matrix is an upper triangular matrix, so its determinant is the product of the diagonal elements which is unity, Furthermore, repeated use of the Laplace expansion about the diagonal elements of 1 gives oe pee teer= Ce pote Use of the product rule of Theorem 3.2 then gives the proof. CHAP. 8] ELEMENTARY MATRIX THEORY 61 36. 37. 38, Show that if a1,a:,...,a: are a basis of U, then every vector in U is expressible uniquely as a linear combination of a:,a2,...,a. Let x be an arbitrary vector in 1/, Because x is in UU, and 1 is spanned by the basis vectors 1,82, -++)8e by definition, the question is one of uniqueness, If there are two or more linear combinations of ay, a, ...,% that represent x, they can be written ag x= Boa and Subtracting one expression from the other gives 0 = (Br—aylay + (Ba an)ag + ++ + (Pe enday Because the basis consists of linearly independent vectors, the only way this ean be an equality is for f; =a; Therefore all representations are the same, and the theorem is proved. Note that both properties of a basis were used here, If a sot of vectors did not span U, all vectors would not be expressible as a linear combination of the set. If the set did span WU but were linearly dependent, a representation of other vectors would not be unique, Given the set of nonzero vectors a:,a:,...,a in Us, Show that the set is linearly dependent if and only if some a, for 1. Then the linear combination is ZS pa, Show that if an mn matrix A has » columns with at most r linearly independent columns, then the null space of A has dimension n—r, Because A is mX~n, then a,€U,, x€ Uy. Renumber the a, so that the first rare the inde- pendent ones. The rest of the column vectors can be written as Byer = Barty b Biota +--+ Barty . (8.16) = Barty t Bymratle to + Bye Decause a,+1,+.++ay are linearly dependent and can therefore be expressed as a linear combination of the linearly independent column vectors, Construct the n—r vectors xj, %3,-..5Xqur Such that 62 39, ELEMENTARY MATRIX THEORY (CHAP. 8 en Prana Baa Pann me = | Ae pepe y (ie on) poe Ree = |G Note that Ax, =0 by the first equation of (8.16), Ax: s0 these are n—r solutions. Now it will be shown that these vectors are a basis for the mull space of A. 
First, they must be linearly independent because of the different positions of the —1 in the bottom part of the vectors, ‘To show they are a basis, then, it must be shown that all solutions can be expressed as a linear combination of the x, Le. it must be shown the x, span the null space of A. Consider an art folution x of Ax ='®. Then é o Been len tees tle fer fa Or, in vector notation, x= J -temts where s is a remainder if the x; do not span the null space of A. If s=0, then the x; do span the null space of A. Check that the last n—r elements of s are zero by writing the vector equality ‘as a set of scalar equations, Multiply both sides of the equation by A. Ax = 3 ~% iA + As and Ax=0, As ives 3 oa = 0. But these column vectors are nearly independent, so ‘Writing this out in terms of the column vectors of A Henee the Since Ax n—r x; are a basis, so the null space of A has dimension Show that the dimension of the vector space spanned by the row vectors of a matrix is equal to the dimension of the vector space spanned by the column vectors. Without loss of generality let the first r column vectors aj be linearly independent and let # of the row vectors a; be linearly independent. Partition A as follows Xret | @rtueet Xm | Geet CHAP. 3] ELEMENTARY MATRIX THEORY 6s 3.10. Bul. fo that m= (Oy to) ahd YP = (ty ty)... dy, Since the m are rvetors, "S bay = 0 for some nonzero b Let the vector BT = (by by -.. by) 80 that ria rey eat ray 0 = Som = ( (3 ben 'S baw... 3 bei) = Wy By... BY) Therefore by, =0 for j= 1,2 ...47 Since the last n—r column vectors aj are linearly dependent, aj = 3 ayay for i= r-+1,.. Then y= ZB ayy so that by, = B aybly) = 0 for i= r+! jn. Hence © = (Ty Bye BTY, BT ey 2 BTyy) = Byay + batg too + Betas Therefore r+1 of the row vectors a, are linearly dependent, so that #= 7. Now consider A. ‘The same argument leads to r=, 50 that r=. Show that the rank of the m xn matrix A equals the maximum number of linearly independent column vectors of A. Let there be r linearly independent column vectors, and without loss of generality let them be Ay Mey scone The a = 3 aye; for i= r+1,....2. Any y in the range space of A can be ex: Dressed in terms of an arbitrary x as y= Ae = San = Set 3, (Ru)a= 3 & («+,3, ene) This shows the a, for i=1,...,r span the range space of A, and since they are linearly inde- pendent they are a basis, so the rank of A=. For an m Xn matrix A, give necessary and sufficient conditions for the existence and uniqueness of the solutions of the matrix equations Ax=0, Ax=b and AX=B in terms of the column vectors of A. For Ax=0, the solution x=0 always exists, A necessary and sufficient condition for ‘uniqueness of the solution x=0 is that the column vectors aj,....a, are lineatly independent, To show the necessity: If a, are dependent, by definition of linearly dependent some nonzero (++ fq) 0. To show Ax = 0. exist such that 3 ag, = 0. ‘Then there exists another solution x suflcleney: If the a, are independent, only zero g; exist such that a, For Ax=b, rewrite as b= 3 aq; Then from Problem 3.10 a necessary and sufficient condition for existence of solution the column vectors, To find condi ‘that b lie in the range space of A, Le. the space spanned by son the uniqueness of solutions, write one solution as 2) and another as (fy fe-- fy). Then b = 3 ayy Zaui vo mat 0 3 (Ea ‘The solution is unique if and only if (ute 14, are nearly independent. 64 3.12. B13. 8.14, ELEMENTARY MATRIX THEORY (CHAP. 
8 Whether or not b=0, necessary and sufficient conditions for existence and uniqueness of solution to Ax = b are that b lie in the vector 9} snned by the column vectors of A and thet the column vectors are linearly independent. ince AX =B can be written es Axj=b, for each column vector x, of X and b; of B, by the preceding it is required that each by lie in the range space of A and that all the column vectors, form a basis for the range space of A. Given an mxn matrix A. Show rank A = rank AT = rank ATA = rank AA’, By Theorem 3.14, the rank of A equals the number r of linearly independent column vectors of A. Hence the dimension of the vector apace spanned by the column vectors equals r. By ‘Theorem 3.18, then the dimension of the vector space spanned by the row vectors equals r. But the row vectors of A are the column vectors of AT, so AT has rank r. To show rank A = rank ATA, note both A and ATA have n columns, Then consider any vector y in the mull space of A, ie. Ay=0. Then ATAy = 6, s0 that y is also in the null space of ATA, Now consider any veetor 7 in the nll space of ATA, ic. ATAz~0. Then z7ATAz = |[Az|3= 0, so that Az =0, ie, x is also in the mull space of A. Therefore the null space of A is equal to the null space of ATA, and has some dimension &. Use of Theorem 311 gives rank A= »—k rank ATA. Substitution of AT for A in this expression gives rank A? = rank AA", Given an mx~n matrix A and an »xk matrix B, show that rank AB = rank A, rank AB 5) sine (2 2\(4 6 0 2) Also verify 1 ala a. 2 8 0 2\-1 1 0V-t/2 ay 2 8 11) (a 24 8 Givn a = (1 1 1). Find An+ 35 8 If both A~t and Bt ex , does (A+B) exist in general? Given a matrix A(t) whose elements are functions of time. Show dA-'/dt = ~A-N GPA Leta nonsingular matrix Abe partitioned into Ayy, Ayz, Az; and Azz such that Ay, and have inverses. Show that ye (Enna) (at ° a ( 1 ya (a= Ang A) then a= ( and if An =0, CHAP. 3} ELEMENTARY MATRIX THEORY 67 aa, 335. 3.36, 337. 3.38, 3.39, 3.40, Bal. 3.42, 343, st, 3.45, 3.48, sat. 38, 3.4, Are the vectors (2 0 —1 3), (1-340) and (11 ~2 2) linearly independent? 720-0 » 20-6) Given the matrix equations @ tx an(S) = 0 & Using algebraic manipulations on the scalar equations aif -+aafs=0, find the conditions under which no solutions exist and the conditions under which many solutions exist, and thus verify Theorem 8.11 and the results of Problem 8.11. Let x be in the null space of A and y be in the range space of A?, Show x" 1238 4 Ginn mae (LB 2S ))- Pind a basa forthe nal space of A 1-1 22 sum of two vectors, 2=.x-+y, where x is in the range s} ‘transpose of A, and (6) this is true for any matrix A, ce of A and y is in the null space of the For A= (2 7p)s show that () an arbitrary vector = = (£2) can be oxorenea as te Given mx k matrices A and B and an mx matrix X such that XA= XB. Under what conditions ean we conclude A=B? Given x,y in U, where by I, ...,b, are an orthonormal basis of U. Show that a» = 3 ame Given real vectors x and y such that |lz = lll. Show (x+y) is orthogonal to (x—y). Show that rank (A+ rank A+rank B, Define 7 as the operator that multiplies every vector in Us by a constant « and then adds on a translation vector ty. Is Ta linear operator? Given the three vectors a, =(VII9 —4 9), a= (Vii9 17) and y= (vIid —10 ~5), use the Gram-Schmit procedure on a, @, and ay in the order given to find a set of orthonormal bas vectors. Show that the exterior product 4? =a, A+*+ Aap satisfies Definition 3.44, i. 
that is an element in a generalized veetor space /\PU, the space of all linear combinations of p-fold exterior products. Show (aye, + ayes t axes) 4 (Bier + Pat + Buea) = (iby — aaPsdes + (axBs—aa8)es + (axa a08)Oar illustrating that ab is the cross product in ‘Us. Given vectors xy); .-)%q and an nXn matrix A such that ¥y, Yo» where y;= Ax. Prove that x1,%,---,%q are linearly independent. Yq are linearly independent, mens at Prove that the dimension of A'U = G=">y Show that the remainder for the Schwartz inequality oe r (a,)(b,b) — Mabie = 3 3 3 lad ashi? ‘What is the remainder for the inner product defined as (a, b) Seovoar 68 318, aus, 324, 3.25, 3.30, 3a. 34, ast, 3.38, 339. 03, 34, 3.48, 3.49, ELEMENTARY MATRIX THEORY [cHaP. 8 Answers to Supplementary Problems = Mtat ajtsing 7 bat sin (3 3) tay ont in most easily seen by the Laplace expansion. 3/2 1 4 ats ( 52 1-2 a1 No (© -410)7 and (6-50 1)7 are a basis. to = (2) ny = (2) sate ean oe tn and nex dye nnd ‘ent they span Us. ‘The n column vectors of X must be linearly independent. No, this is an affine transformation, by = (0.8 4)/5 and ay is coplanar with a; and ay so that only by and by One method of proof uses induction. a Pdr. {ff ao wo ~ ato wo paca Chapter 4 Matrix Analysis 4.1 EIGENVALUES AND EIGENVECTORS Definition 4.1: An eigenvalue of the Xn (square) matrix A is one of those scalars A that permit a nontrivial (x0) solution to the equation Ax = dx (41) Note this equation can be rewritten as (A—Al)x=0. Nontrivial solution vectors x exist only if det (A—Al) = 0, as otherwise (A—AI)~! exists to determine x = 0. Example 4 .4 Tithe goats t(® {). ‘the demas ont @ 4\/a) _ \/) ( 3)(2) Ma/ ‘Then {@ a)-9@ DH) = 6) GG) = ©) : ion i (Bs 4 VL =a8—2)- The characteristic equation is adet(* |” y",) = 0, Then (@-2)8—1) fa second-order polynomial equation whose roots are the eigenvalues 2, 0 or M-HA45=0, de Definition 42: The characteristic polynomial of A is det(A—Al). Note the characteristic polynomial is an nth order polynomial. Then there are » eigenvalues Di, Aay «yn that are the roots of this polynomial, although some might be repeated roots. An eigenvalue of the square matrix A is said to be distinct if it is not a repeated root. Associated with each eigenvalue \ of the n xn matrix A there is a nonzero solution vector x; of the eigenvalue equation Ax:=acxi. This solution vector is called an eigenvector. Example 42. In the previous example, the eigenvector assocated with the eigenvalue 1 is found as follows GE) = of) «= & dE) = @) (1 8)\¢e/ (1 2/\z, ‘Then 21, +42» =0 and x; +2sq~0, from which ¢;=—2tp, Thus the eigenvector x, is x; = (~®) 25 where the scalar 2, can be any number. \a/ Note that eigenvectors have arbitrary length. This is true because for any scalar a, the equation Ax =x has a solution vector ax since A(ax) = «Ax = adx = A(ax). 69 70 MATRIX ANALYSIS (CHAP. 4 Definition 4.5: An eigenvector is normalized to unity if its length is unity, ie. |x|] =1. Sometimes it is easier to normalize x such that one of the elements is unity. Example 43, ‘The eigenvector normalized to unit length belonging to the eigenvalue 1 in the previous example is = AL(-*), whereas normalizing its first element to unity gives x; = ( aye m= el a) v2) 42 INTRODUCTION TO THE SIMILARITY TRANSFORMATION Consider the general time-invariant continuous-time state equation with no inputs, dx(t)/dt = Ax(t) (42) where A is a constant coefficient nx matrix of real numbers. 
The initial condition is given as x(0) = xo, Example 44, Written out, the state equations (4.2) are Ae Oat = aed) + ayaralt) + ++ + ayer y(0) esl tht = aye s(0) + ayye(t) + 00+ + a(t) dey(O/at = aye + aqaeslt + + and the initial conditions are given, such as [#100)' 7 fe) _ fe : te] \ 2.0, \v/ Now define a new variable, an n-vector y(t), by the one to one relationship y() = Mox(t) (4) It is required that M be an x n nonsingular constant coefficient matrix so that the solution x can be determined from the solution for the new variable y(t). Putting x(t) = My(t) into the system equation gives aygta(t) May(t)/dt = AMy(t) ‘Multiplying on the left by M~' gives dy(t)/dt = M-1AMy(t) a) Definition 46: The transformation T-‘AT, where 'T is a nonsingular matrix, is called a similarity transformation on A. It is called similarity because the problem is similar to the original one but with a change of variables from x to y. ‘Suppose M was chosen very cleverly to make M~1AM a diagonal matrix A. Then u(t) \ AO... 0) fait) ay) _ a | vat) Om .. 0 \f te a = dil : alee see f[ | = A¥O wo! \o 0 .. \ie/ CHAP. 4] MATRIX ANALYSIS n Writing this equation out gives dy/dt= Ay; for i=1,2,...,n. The solution can be ex- Pressed simply as yi(t) = y(0)e'. ‘Therefore if an M such that M-1AM = A can be found, solution of dx/dt = Ax becomes easy. Although not always, such an M can usually be found. In cases where it cannot, a T can always be found where T-1AT is almost diagonal. Physically it must be the case that not all differential equations can be reduced to this simple form, Some differential equations have as solutions te“, and there is no way to get this solution from the simple form. The transformation M is constructed upon solution of the eigenvalue problem for all the eigenvectors x, i=1,2,...,7. Because Axi=Ax, for i=1,2,...,, the equations can be “stacked up” using the rules of multiplication of partitioned matrices: Ali [x2]... [xe) = (Axi | AXs |... | Ax) = (hams [Aue |. | Anata) a rr) = (aise)... |x| 9 2 Oe Omeerteras = (xifae] [xa Therefore M = (xi |2|-.. [a) 5) When M is singular, A cannot be found. Under a number of different conditions, it can be shown M is nonsingular. One of these conditions is stated as the next theorem, and other conditions will be found later. Theorem 4.1: If all the eigenvalues of an nxn matrix are distinct, the eigenvectors are linearly independent. Note that if the eigenvectors are linearly independent, M is nonsingular. Proof: The proof is by contradiction. Let A have distinct eigenvalues. Let xi,x2, .. -»%n be the eigenvectors of A, with x1,xo...,x+ independent and x.+1,...,3 dependent, ‘Then x, = Dax, for j= k+1, 442, +m where not all g,=0. Since x, is an eigenvector, Also, Subtracting this equation from the Previous one gives © = FA a—alx, But the x, i= 1,2,...,k, were assumed to be linearly independent. Because not all By are zero, some A=. This contradicts the assumption that A had distinct eigenvalues, and so all the eigenvectors of A must be linearly independent. 43 PROPERTIES OF SIMILARITY TRANSFORMATIONS To determine when a T can be found such that T-1AT gives a diagonal matrix, the Properties of a similarity transformation must be examined. Define S= TUT (66) 72 MATRIX ANALYSIS: [CHAP. 4 Then the eigenvalues of S are found as the roots of det (S— But det (S—al) = det (T-1AT— Ar) = det (TAT — ATT) = det (T-A — ANT] Using the product rule for determinants, det (S—Al) = det'T-1 det (A—Al) det T Since detT-* = (detT)-! from Problem 8.12, det (S—I)=det(A~AI). 
Therefore we have proved Theorem 4.2: All similar matrices have the same eigenvalues. Corollary 4.3: All similar matrices have the same traces and determinants. Proof of this corollary is given in Problem 4.1. A useful fact to note here also is that all triangular matrices B display eigenvalues on the diagonal, because the determinant of the triangular matrix (B-l) is the product of its diagonal elements. A matrix A can be reduced to a diagonal matrix A by a similarity trans- formation if and only if a set of m linearly independent eigenvectors can be found. Theorem 4. Proof: By Theorem 4.2, the diagonal matrix A must have the eigenvalues of A appear- ing on the diagonal. If AT=TA, by partitioned matrix multiplication it is required that Atc= At, where t, are the column vectors of T. Therefore it is required that T have the eigenvectors of A as its column vectors, and T-? exists if and only if its column vectors are linearly independent. It has already been shown that when the eigenvalues are distinct, T is nonsingular. So consider what happens when the eigenvalues are not distinct. Theorem 4.4 says that the only way we can obtain a diagonal matrix is to find linearly independent eigenvectors. ‘Then there are two cases: Case 1. For each root that is repeated & times, the space of eigenvectors belonging to that root is k-dimensional. In this case the matrix can still be reduced to a diagonal form, Example 45, 100 Given the matrix 111), Then det(a—am “10 9, 0,1and 1. For the zero eigenvalue, solution of Ax=0 gives x= (01-1), For the unity eigenvalue, the eigenvalue problem is “«) -N1—))? and the eigenvalues sre ‘This gives the set of equations CHAP. 4] MATRIX ANALYSIS 3 ‘Therefore all eigenvectors belonging to the eigenvalue 1 have the form =O. where 2, and 2; are arbitrary. Hence any two linearly independent vectors in the space spanned by (@ 1 0) and (1 0 —1) will do. The transformation matrix is then oo 4 T m= (2110 104 and T-1AT=A, where A has 0,1 and 1 on the diagonal in that order. Note that the occurrence of distinct eigenvalues falls into Case 1. Every distinct eigen- value must have at least one eigenvector associated with it, and since there are n distinct eigenvalues there are n eigenvectors, By Theorem 4.1 these are linearly independent. Case 2. The conditions of Case 1 do not hold. ‘Then the matrix cannot be reduced to a diagonal form by a similarity transformation. Example 46. Given he mats a= (22). sine Ate nena, he devalues are payed as Xen 2 1 N/a) wg /At GG) =) which gives the set of equations #, = 0, 00, All eigenvectors belonging to 1 have the form (e: 0% Two tinerly independent eigenvectors are simply ot available to form M. ‘Then the eigenvalue problem is Because in Case 2 a diagonal matrix cannot be formed by a similarity transformation, there arises the question of what is the simplest matrix that is almost diagonal that can be formed by a similarity transformation. This is answered in the next section. 44 JORDAN FORM ‘The form closest to diagonal to which an arbitrary Xn matrix can be transformed by a similarity transformation is the Jordan form, denoted J. Proof of its existence in all cases can be found in standard texts. In the interest of brevity we omit the lengthy develop- ment needed to show this form can always be obtained, and merely show how to obtain it. ‘The Jordan form J is an upper triangular matrix and, as per the remarks of the preceding section, the eigenvalues of the A matrix must be displayed on the diagonal. 
If the A matrix has r linearly independent eigenvectors, the Jordan form has n—r ones above the diagonal, and all other elements are zero. The general form is Gn 4 MATRIX ANALYSIS [OHAP. 4 Each Ly(\) is an upper triangular square matrix, called a Jordan block, on the diagonal of the Jordan form J. Several Lj(\:) can be associated with each value of \, and may differ in dimension from one another. A general Ly(X) looks like M10... 0 Oul.. 0 lu) = | 0 0 x (4.8) where A, are on the diagonal and ones occur in all places just above the diagonal. Example 47. Seon _for oo Consider the Jordan form 3 =|) {9 |. Because all ones must occur above the 0 0 0 % diagonal in a Jordan block, wherever a zero above the diagonal occurs in J there must occur a boundary between two Jordan blocks, Therefore this J contains three Jordan blocks, (ud taon = (Ff), taoo = ay tah = 92 There is one and only one linearly independent eigenvector associated with each Jordan Dlock and vice versa. ‘This leads to the calculation procedure for the other column vectors t: of T called generalized eigenvectors associated with each Jordan block L;(A): Ax, = Axi At = At tx At = Ate+t (4.9) Diet tea Ati Note the number of t: equals the number of ones in the associated Ly(\). Then A(ui|ti [ta]... [ti] ...) = ace] Atta | Ade tty |... [ator (i) fata). te] La) ‘This procedure for calculating the t: works very well as long as x, is determined to within a multiplicative constant, because then each t; is determined to within a multiplicative constant, However, difficulty is encountered whenever there is more than one Jordan block associated with a single value of an eigenvalue. Considerable background in linear algebra is required to find a construction procedure for the t; in this case, which arises so seldom in practice that the general case will not be pursued here. If this case arises, a trial and error procedure along the lines of the next example can be used. Example 48. Find the transformation matrix that reduces the matrix A to Jordan form, where 244 A 034 oa 1 CHAP. 4] MATRIX ANALYSIS 6 ‘The characteristic equation is (2—-2)@—2(L—2)+(@—2) = 0. A factor 2—2 can be removed, and the remaining equation can be arranged so that the characteristic equation becomes (2—))! Solving for the eigenvectors belonging to the eigenvalue 2 results in G28) -@ ‘Therefore any eigenvector ean be expressed in a linear combination as “Oe () ‘What combination should be tried to start the procedure described by equations (4.8)? Trying the general ey zl i Then ntnusa atn=B -aon = 78 ‘These equations are satisfied if «= This gives the correct @=1 gives t= (7 rt. 1-7)". The transformation matrix pendent choice of x, say (01 —1)¥, and any choice of r, and 7, such that ¢ is linear! choices of all x, egy r,=0 and %y=1, This gives AT=J, or 21 1\/1 0 0 1 0 o\/2 1 0 omsret oe aia 2 ona 1/\-1 0-1, -1 0-1/\o 0 2, 4.5 QUADRATIC FORMS Definition 4.7: A quadratic form Q is a real polynomial in the real variables ¢, é ...»¢, — (1 1-1)", Normalizing x by setting completed by any other linearly inde independent of the containing only terms of the form ayf,¢,, such that Q where a, is real for all ¢ and j. Example 49. Some typical quadratic forms are Qe Qe = 8b — Zinta + ef + 5b — Teak Qs = ant} + asafiée + ensiats + anats Qe = BE +O Pea — ey Theorem 4.5: All quadratic forms Q can be expressed as the inner product (x,Qx) and vice versa, where Q is ann x n Hermitian matrix, ie. Q*= Proof: First 2 to (x, Qx) yids (4-10) 76 MATRIX ANALYSIS (CHAP. 4 Let Q= (0,) = Ha,+a,}. 
Then 9,=4,, so Q is real and symmetric, and Q=x"Qx. Next, (x,@x) to Q (the problem is to prove the coefficients are real ax) = PS ase, and wary = SE apes, ‘Then (x, Qx) AG, Qx) + Hx, Qtx) = 4 z 2 (ay + OEE, So (x, Qx) =z z Re (q,)éf,=Q and the coefficients are real. Theorem 46: The eigenvalues of an nxn Hermitian matrix Q=@Q are real, and the eigenvectors belonging to distinct eigenvalues are orthogonal. The most important case of real symmetric Q is included in Theorem 4.6 because the set of real symmetric matrices is included in the set of Hermitian matrices. Proof: The eigenvalue problems for specific A: and .; are Qx = do (4.11) x = Ax; aa Since Q is Hermitian, Qtx; = Ax) Taking the complex conjugate transpose gives x1Q = Nxt (4.12) Multiplying (4.12) on the right by x: and (4.17) on the left by x} gives xf Qx,—xfQx, = 0 = (Af —A)xtx, It j =i, then /xfx, is a norm on x, and cannot be zero, so that A, =A", meaning each eigen- value is real. Then if ji, X—A=a,—A,. But for distinct eigenvalues, a,—,* 0, 80 x}x,= 0 and the eigenvectors are orthogonal. Theorem 4.7: Even if the eigenvalues are not distinct, a set of n orthonormal eigenvectors can be found for an Xn normal matrix N. ‘The proof is left to the solved problems. Note both Hermitian and real symmetric matrices are normal so that Theorem 4.6 is a special case of this theorem. Corollary 48: A Hermitian (or real symmetric) matrix Q can always be reduced to a diagonal matrix by a unitary transformation, where U-'QU=A and U-= Ut, Proof: Since Q is Hermitian, it is also normal. Then by Theorem 4.7 there are orthonormal eigenvectors and they are all independent. By Theorem 4.4 this is a necessary and sufficient condition for diagonalization. To show a transformation matrix is unitary, construct U with the orthonormal eigenvectors as column vectors. Then xtx, xtx, ... atx, xix, xix, xt, is / (x: [xe xix, xix, CHAP. 4) MATRIX ANALYSIS 7 But x{x,=(x,x) = 8, because they are orthonormal. Then U'U=1, Since the column veetors of U are linearly independent, U~t exists, so multiplying on the right by U~* gives Ut =U", which was to be proven, ‘Therefore if a quadratic form Q=xtQx is given, rotating coordinates by defining Uy gives Q = ytUtQUy = ytdy. In other words, Q can be expressed as Q = Altaf + Aghia + +++ + Anlanl? where the A; are the real eigenvalues of Q. Note Q is always positive if the eigenvalues of Qare positive, unless y, and hence x, is identically the zero vector. Then the square root of Q is a norm of the x vector because an inner product can be defined as (x,y)g = xtQy. Definition 4.8: Ann xn Hermitian matrix Q is positive definite if its associated quadratic form Q is always positive except when x is identically the zero vector. ‘Then Q is positive definite if and only if all its eigenvalues are > 0. An nxn Hermitian matrix Q is nonnegative definite if its associated quadratic form Q is never negative. (It may be zero at times when x is, not zero.) ‘Then Q is nonnegative if and only if all its eigenvalues are = 0. Example 410. Q= G2 ted = 4 — Ga? can be zero when ‘The geometric solution of constant 2 when Q is positive definite is an ellipse in n-space. i» and a0 is nonnegative definite. Theorem 49: A unique positive definite Hermitian matrix R exists such that RR =Q, where Q is a Hermitian positive definite matrix. R is called the square root of. Proof: Let U be the unitary matrix that diagonalizes Q. Then Q=UAUS. Since Au is a positive diagonal element of A, defineA’” as the diagonal matrix of positive Aj/*. 
Q = UAwAUT = UAMUTUAM UF Now let R= UA™Ut and it is symmetric, real and positive definite because its eigenvalues are positive. Uniqueness is proved in Problem 4.5. One way to check if a Hermitian matrix is positive definite (or nonnegative definite) is to see if its elgenvalues are all positive (or nonnegative). Another way to check is to use Sylvester's criterion, Definition 4.10: The mth leading principal minor, denoted det Qu, of the nxn Hermitian matrix Q is the determinant of the matrix Q,, formed by deleting the last n—m rows and columns of Q. Theorem 4.10: A Hermitian matrix Q is positive definite if and only if all the leading principal minors of Q are positive. A proof is given in Problem 4.6, Example 4.11, Given Q= (qu). Then Q is positive definite if and only if 0 < det = en; 0 < det = ae (HEM): ea, If < is replaced by =, we cannot conclude @ is nonnegative definite, Rearrangement of the elements of Q sometimes leads to simpler algebraic inequalities.

You might also like