System Realization by Information Matrix (SRIM)#

For discrete-time systems, the correlations between inputs, outputs, and state carry information about the system’s state evolution and response. In fact, the state equations can be estimated by manipulating correlation matrices, a method known as System Realization by Information Matrix (SRIM).

Discrete-Time System Matrix Equation#

We begin with discrete-time state equations that correspond to the structure’s dynamics (see Discrete LTI State-Space Representation).

\[\begin{split}\begin{aligned} \boldsymbol{x}(k+1) &= \boldsymbol{Ax}(k) + \boldsymbol{Bu}(k) \\ \boldsymbol{y}(k) &= \boldsymbol{Cx}(k) + \boldsymbol{Du}(k) \end{aligned}\end{split}\]
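As a concrete illustration, these state equations can be propagated directly. The following sketch (in Python with NumPy; the system matrices `A`, `B`, `C`, `D` are hypothetical, chosen only for illustration) simulates a two-state system:

```python
import numpy as np

# Hypothetical 2-state discrete-time system, chosen only for illustration.
A = np.array([[0.9, 0.1], [-0.1, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

def simulate(A, B, C, D, u, x0=None):
    """Propagate x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k)."""
    n = A.shape[0]
    x = np.zeros(n) if x0 is None else x0
    ys = []
    for uk in u:
        ys.append(C @ x + D @ uk)  # output at the current step
        x = A @ x + B @ uk         # state update
    return np.array(ys)

u = np.ones((5, 1))          # five steps of unit input
y = simulate(A, B, C, D, u)  # y has shape (5, 1)
```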

By noting the state evolution

\[\begin{split}\begin{aligned} \boldsymbol{x}(k+1) &= \boldsymbol{Ax}(k)+\boldsymbol{Bu}(k)\\ \boldsymbol{x}(k+2) &= \boldsymbol{A}^2\boldsymbol{x}(k) + \boldsymbol{ABu}(k) + \boldsymbol{Bu}(k+1)\\ \boldsymbol{x}(k+3) &= \boldsymbol{A}^{3}\boldsymbol{x}(k) + \boldsymbol{A}^{2}\boldsymbol{Bu}(k) + \boldsymbol{ABu}(k+1) + \boldsymbol{Bu}(k+2), \end{aligned}\end{split}\]

we can generalize the state and the response at time step \(k+p-1\):

\[\begin{split}\begin{aligned} \boldsymbol{x}(k+p) &= \boldsymbol{A}^{p}\boldsymbol{x}(k) + \sum_{i=1}^{p}\boldsymbol{A}^{p-i}\boldsymbol{Bu}(k+i-1) \\ \boldsymbol{x}(k+p-1) &= \boldsymbol{A}^{p-1}\boldsymbol{x}(k) + \sum_{i=1}^{p-1}\boldsymbol{A}^{p-i-1}\boldsymbol{Bu}(k+i-1) \\ \boldsymbol{y}(k+p-1) &= \boldsymbol{CA}^{p-1}\boldsymbol{x}(k) + \sum_{i=1}^{p-1}\boldsymbol{CA}^{p-i-1}\boldsymbol{Bu}(k+i-1)+\boldsymbol{Du}(k+p-1)~. \end{aligned}\end{split}\]

Then, we vertically stack the outputs at \(p\) successive time steps into a column vector \(\boldsymbol{y}_{p}(k)\):

\[\begin{split}\begin{aligned} \boldsymbol{y}_{p}(k) &= \mathcal{O}_{p}\boldsymbol{x}(k) + \mathcal{T}_{p}\boldsymbol{u}_{p}(k) \\ \begin{bmatrix} \boldsymbol{y}(k) \\ \boldsymbol{y}(k+1) \\ \vdots \\ \boldsymbol{y}(k+p-1) \end{bmatrix} =& \begin{bmatrix} \boldsymbol{C} \\ \boldsymbol{CA} \\ \boldsymbol{CA}^{2} \\ \vdots \\ \boldsymbol{CA}^{p-1} \end{bmatrix} \boldsymbol{x}(k) ~+ \\ & \begin{bmatrix} \boldsymbol{D} \\ \boldsymbol{CB} & \boldsymbol{D} \\ \boldsymbol{CAB} & \boldsymbol{CB} & \boldsymbol{D} \\ \vdots & \vdots & \vdots & \ddots \\ \boldsymbol{CA}^{p-2}\boldsymbol{B} & \boldsymbol{CA}^{p-3}\boldsymbol{B} & \boldsymbol{CA}^{p-4}\boldsymbol{B} & \cdots & \boldsymbol{D} \end{bmatrix} \begin{bmatrix} \boldsymbol{u}(k) \\ \boldsymbol{u}(k+1) \\ \vdots \\ \boldsymbol{u}(k+p-1) \end{bmatrix}~. \end{aligned}\end{split}\]

Finally, we horizontally stack \(N\) of these column vectors, at successive time steps, into a matrix to obtain the matrix equation

\[\boxed{\boldsymbol{Y}_{p}(k) = \mathcal{O}_{p}\boldsymbol{X}(k) + \mathcal{T}_{p}\boldsymbol{U}_{p}(k)} ~,\]

where

\[\begin{split}\begin{aligned} \boldsymbol{Y}_{p}(k) &= \begin{bmatrix} \boldsymbol{y}_{p}(k) & \boldsymbol{y}_{p}(k+1) & \cdots & \boldsymbol{y}_{p}(k+N-1) \end{bmatrix} \\ &= \begin{bmatrix} \boldsymbol{y}(k) & \boldsymbol{y}(k+1) & \cdots & \boldsymbol{y}(k+N-1)\\ \boldsymbol{y}(k+1) & \boldsymbol{y}(k+2) & \cdots & \boldsymbol{y}(k+N) \\ \vdots & \vdots & \ddots & \vdots \\ \boldsymbol{y}(k+p-1) & \boldsymbol{y}(k+p) & \cdots & \boldsymbol{y}(k+N+p-2) \end{bmatrix} \end{aligned}\end{split}\]
\[\boldsymbol{X}(k) = \begin{bmatrix} \boldsymbol{x}(k) & \boldsymbol{x}(k+1) & \cdots & \boldsymbol{x}(k+N-1) \end{bmatrix}\]
\[\begin{split}\begin{aligned} \boldsymbol{U}_{p}(k) &= \begin{bmatrix} \boldsymbol{u}_{p}(k) & \boldsymbol{u}_{p}(k+1) & \cdots & \boldsymbol{u}_{p}(k+N-1) \end{bmatrix} \\ &= \begin{bmatrix} \boldsymbol{u}(k) & \boldsymbol{u}(k+1) & \cdots & \boldsymbol{u}(k+N-1)\\ \boldsymbol{u}(k+1) & \boldsymbol{u}(k+2) & \cdots & \boldsymbol{u}(k+N) \\ \vdots & \vdots & \ddots & \vdots \\ \boldsymbol{u}(k+p-1) & \boldsymbol{u}(k+p) & \cdots & \boldsymbol{u}(k+N+p-2) \end{bmatrix}~. \end{aligned}\end{split}\]
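The data matrices \(\boldsymbol{Y}_{p}(k)\) and \(\boldsymbol{U}_{p}(k)\) are block Hankel matrices built from the measured records. A minimal sketch of their construction, assuming the records are stored as NumPy arrays with one row per time step and one column per channel (the data here are random placeholders):

```python
import numpy as np

def block_hankel(z, p, N):
    """Stack p block rows of the signal z (shape: samples x channels)
    into the (p*m) x N data matrix Y_p(k) or U_p(k), with k = 0."""
    m = z.shape[1]
    H = np.zeros((p * m, N))
    for i in range(p):
        H[i * m:(i + 1) * m, :] = z[i:i + N, :].T
    return H

# Placeholder data: 200 samples, 1 input channel, 2 output channels.
rng = np.random.default_rng(0)
u = rng.standard_normal((200, 1))
y = rng.standard_normal((200, 2))
p, N = 10, 180                  # requires p + N - 1 <= number of samples
Up = block_hankel(u, p, N)      # shape (p*1, N)
Yp = block_hankel(y, p, N)      # shape (p*2, N)
```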

Observability Matrix from Information Matrix#

By post-multiplying the matrix equation by \(\frac{1}{N}\boldsymbol{U}_{p}^{T}(k)\), \(\frac{1}{N}\boldsymbol{Y}_{p}^{T}(k)\), or \(\frac{1}{N}\boldsymbol{X}^{T}(k)\), we obtain relationships between the correlation matrices \(\boldsymbol{R}_{yy}\), \(\boldsymbol{R}_{yu}\), \(\boldsymbol{R}_{uu}\), and \(\boldsymbol{R}_{xx}\) (see Appendix).

\[\boldsymbol{R}_{yy} - \boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} = \mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T} ~,\]

where

\[\begin{split}\begin{aligned} \boldsymbol{R}_{yy} &= \frac{1}{N}\boldsymbol{Y}_{p}(k)\boldsymbol{Y}_{p}^{T}(k), \quad{} \boldsymbol{R}_{yu} = \frac{1}{N}\boldsymbol{Y}_{p}(k)\boldsymbol{U}_{p}^{T}(k) \\ \boldsymbol{R}_{uu} &= \frac{1}{N}\boldsymbol{U}_{p}(k)\boldsymbol{U}_{p}^{T}(k) , \quad{} \boldsymbol{R}_{xx} = \frac{1}{N}\boldsymbol{X}(k)\boldsymbol{X}^{T}(k) ~. \end{aligned}\end{split}\]

The left-hand side of this equation can be formed entirely from input and output measurements; it is called the information matrix of the data. Its singular value decomposition yields the observability matrix \(\mathcal{O}_{p}\): the number of significant singular values indicates the model order \(n\), and the first \(n\) columns of \(\boldsymbol{U}\) provide \(\mathcal{O}_{p}\) up to a similarity transformation.

\[\boldsymbol{R}_{yy} - \boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} = \boldsymbol{U} \Sigma \boldsymbol{U}^{T} = \mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T} ~.\]
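The computation above can be sketched as follows; `observability_from_data` is a hypothetical helper name, and the random matrices stand in for Hankel matrices built from real measurements (on measured data, the gap in the singular values would suggest the order \(n\)):

```python
import numpy as np

def observability_from_data(Yp, Up, n):
    """Form the information matrix from data matrices Yp and Up, then
    take the first n left singular vectors as an estimate of O_p."""
    N = Yp.shape[1]
    Ryy = Yp @ Yp.T / N
    Ryu = Yp @ Up.T / N
    Ruu = Up @ Up.T / N
    # Information matrix: Ryy - Ryu Ruu^{-1} Ryu^T (solve, don't invert).
    info = Ryy - Ryu @ np.linalg.solve(Ruu, Ryu.T)
    U, s, _ = np.linalg.svd(info)
    return U[:, :n], s

# Random placeholders for Yp (q*p x N) and Up (r*p x N):
rng = np.random.default_rng(1)
Yp = rng.standard_normal((20, 500))
Up = rng.standard_normal((10, 500))
Op, svals = observability_from_data(Yp, Up, n=2)
```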

State Equation Matrices from Observability Matrix#

Now, the state equation matrices \(\boldsymbol{A}\) and \(\boldsymbol{C}\) can be obtained from the observability matrix \(\mathcal{O}_p\). Here \(\mathcal{O}_{p}(:-1)\) denotes \(\mathcal{O}_{p}\) with its last block row removed, \(\mathcal{O}_{p}(1:)\) denotes \(\mathcal{O}_{p}\) with its first block row removed, \(\mathcal{O}_{p}(0)\) is its first block row, and \((\cdot)^{+}\) is the Moore-Penrose pseudoinverse.

\[\begin{split}\begin{aligned} \mathcal{O}_{p} = \begin{bmatrix} \boldsymbol{C} \\ \boldsymbol{CA} \\ \boldsymbol{CA}^{2} \\ \vdots \\ \boldsymbol{CA}^{p-1} \end{bmatrix} , \quad{} \mathcal{O}_{p}(:-1) = \begin{bmatrix} \boldsymbol{C} \\ \boldsymbol{CA} \\ \boldsymbol{CA}^{2} \\ \vdots \\ \boldsymbol{CA}^{p-2} \end{bmatrix} , \quad{} \mathcal{O}_{p}(1:) = \begin{bmatrix} \boldsymbol{CA} \\ \boldsymbol{CA}^{2} \\ \boldsymbol{CA}^{3} \\ \vdots \\ \boldsymbol{CA}^{p-1} \end{bmatrix} \end{aligned}\end{split}\]
\[\boldsymbol{A} = \mathcal{O}_{p}(:-1)^{+}\mathcal{O}_{p}(1:)\]
\[\boldsymbol{C} = \mathcal{O}_{p}(0)\]
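A sketch of this recovery step, using NumPy slicing for the block-row operations and `pinv` for the pseudoinverse; the model matrices are hypothetical, and the observability matrix is built exactly from them, so the recovery is exact here:

```python
import numpy as np

def realize_AC(Op, q):
    """Recover A and C from the observability matrix O_p, where q is
    the number of output channels (the block-row size of O_p)."""
    C = Op[:q, :]                               # C = O_p(0)
    # Shift invariance: O_p(:-1) A = O_p(1:), solved in least squares.
    A = np.linalg.pinv(Op[:-q, :]) @ Op[q:, :]
    return A, C

# Check against a known (hypothetical) model:
A_true = np.array([[0.9, 0.2], [-0.2, 0.9]])
C_true = np.array([[1.0, 0.0]])
p = 6
Op = np.vstack([C_true @ np.linalg.matrix_power(A_true, i) for i in range(p)])
A_est, C_est = realize_AC(Op, q=1)
# A_est reproduces A_true because O_p is exact and has full column rank.
```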

Appendix: Manipulation of discrete-time system matrix equation into correlation matrix relationships#

In (Juang 1997), the discrete-time system matrix equation is manipulated into a form describing the relationship between correlation matrices \(\boldsymbol{R}_{yy}\), \(\boldsymbol{R}_{yu}\), \(\boldsymbol{R}_{uu}\), and \(\boldsymbol{R}_{xx}\).

Post-multiplying the discrete-time system matrix equation by \(\frac{1}{N}\boldsymbol{U}_{p}^{T}(k)\):

\[\begin{split}\begin{aligned} \frac{1}{N}\boldsymbol{Y}_{p}(k)\boldsymbol{U}_{p}^{T}(k) &= \mathcal{O}_{p}\frac{1}{N}\boldsymbol{X}(k)\boldsymbol{U}_{p}^{T}(k) + \mathcal{T}_{p}\frac{1}{N}\boldsymbol{U}_{p}(k)\boldsymbol{U}_{p}^{T}(k) \\ \boldsymbol{R}_{yu} &= \mathcal{O}_{p}\boldsymbol{R}_{xu} + \mathcal{T}_{p}\boldsymbol{R}_{uu} \\ \mathcal{T}_{p} &= \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1} \end{aligned}\end{split}\]

Post-multiplying by \(\frac{1}{N}\boldsymbol{Y}_{p}^{T}(k)\):

\[\begin{split}\begin{aligned} \frac{1}{N}\boldsymbol{Y}_{p}(k)\boldsymbol{Y}_{p}^{T}(k) &= \mathcal{O}_{p}\frac{1}{N}\boldsymbol{X}(k)\boldsymbol{Y}_{p}^{T}(k) + \mathcal{T}_{p}\frac{1}{N}\boldsymbol{U}_{p}(k)\boldsymbol{Y}_{p}^{T}(k) \\ \boldsymbol{R}_{yy} &= \mathcal{O}_{p}\boldsymbol{R}_{yx}^{T} + \mathcal{T}_{p}\boldsymbol{R}_{yu}^{T} \\ \boldsymbol{R}_{yy} &= \mathcal{O}_{p}\boldsymbol{R}_{yx}^{T} + \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} \end{aligned}\end{split}\]

Post-multiplying by \(\frac{1}{N}\boldsymbol{X}^{T}(k)\):

\[\begin{split}\begin{aligned} \frac{1}{N}\boldsymbol{Y}_{p}(k)\boldsymbol{X}^{T}(k) &= \mathcal{O}_{p}\frac{1}{N}\boldsymbol{X}(k)\boldsymbol{X}^{T}(k) + \mathcal{T}_{p}\frac{1}{N}\boldsymbol{U}_{p}(k)\boldsymbol{X}^{T}(k) \\ \boldsymbol{R}_{yx} &= \mathcal{O}_{p}\boldsymbol{R}_{xx} + \mathcal{T}_{p}\boldsymbol{R}_{xu}^{T} \\ \boldsymbol{R}_{yx} &= \mathcal{O}_{p}\boldsymbol{R}_{xx} + \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{xu}^{T} \end{aligned}\end{split}\]

Substituting the equation for \(\boldsymbol{R}_{yx}\) into the equation for \(\boldsymbol{R}_{yy}\):

\[\begin{split}\begin{aligned} \boldsymbol{R}_{yy} =& ~\mathcal{O}_{p} \left(\mathcal{O}_{p}\boldsymbol{R}_{xx} + \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{xu}^{T}\right)^{T} \\ &+ \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} \\ =& ~\mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T} + \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \left( \boldsymbol{R}_{yu}^{T} - \boldsymbol{R}_{xu}^{T}\mathcal{O}_{p}^{T} \right) \\ &+ \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} \\ =& ~\mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T} + \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{yu}^{T} - \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{xu}^{T}\mathcal{O}_{p}^{T} \\ &+ \boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} \\ =& ~\mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T} - \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{xu}^{T}\mathcal{O}_{p}^{T} + \boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} \end{aligned}\end{split}\]

Moving all of the terms that can be computed from measured data to the left-hand side:

\[\begin{split}\begin{aligned} \boldsymbol{R}_{yy} - \boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} &= \mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T} - \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{xu}^{T}\mathcal{O}_{p}^{T} \\ &= \mathcal{O}_{p}\left( \boldsymbol{R}_{xx} - \boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{xu}^{T} \right) \mathcal{O}_{p}^{T} \end{aligned}\end{split}\]

We assume that current and future inputs are uncorrelated with the current state, meaning that the average of the products \(\boldsymbol{x}(k)\boldsymbol{u}^{T}(k+i), ~~ i \in \{0,1,2,\dots\}\), is zero. (This holds, for example, for a zero-mean white-noise input, since \(\boldsymbol{x}(k)\) then depends only on past inputs.) This gives:

\[\begin{split}\begin{aligned} \boldsymbol{R}_{xu} &= \frac{1}{N} \begin{bmatrix} \sum_{j=0}^{N-1}\boldsymbol{x}(k+j)\boldsymbol{u}^{T}(k+j) & \sum_{j=0}^{N-1}\boldsymbol{x}(k+j)\boldsymbol{u}^{T}(k+j+1) & \cdots & \sum_{j=0}^{N-1}\boldsymbol{x}(k+j)\boldsymbol{u}^{T}(k+j+p-1) \end{bmatrix} \\ &= \boldsymbol{0} \end{aligned}\end{split}\]

in order to yield:

\[\boldsymbol{R}_{yy} - \boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} = \mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T}~.\]
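This identity can be checked numerically. Before the \(\boldsymbol{R}_{xu} = \boldsymbol{0}\) assumption is invoked, the relationship \(\boldsymbol{R}_{yy} - \boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} = \mathcal{O}_{p}\left( \boldsymbol{R}_{xx} - \boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{xu}^{T} \right)\mathcal{O}_{p}^{T}\) is exact algebra whenever the data satisfy the matrix equation. The following sketch verifies it on simulated, noise-free data (the system matrices are hypothetical examples):

```python
import numpy as np

# Hypothetical 2-state model; any stable (A, B, C, D) would do.
A = np.array([[0.85, 0.25], [-0.25, 0.85]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.1]])

rng = np.random.default_rng(3)
p, N = 8, 400
T = p + N - 1                     # samples needed for Y_p(k) with k = 0
u = rng.standard_normal((T, 1))

# Simulate, logging states so R_xx and R_xu can be formed.
x = np.zeros((T, 2))
y = np.zeros((T, 1))
for k in range(T - 1):
    y[k] = C @ x[k] + D @ u[k]
    x[k + 1] = A @ x[k] + B @ u[k]
y[T - 1] = C @ x[T - 1] + D @ u[T - 1]

def hankel(z, p, N):
    return np.vstack([z[i:i + N, :].T for i in range(p)])

Yp, Up = hankel(y, p, N), hankel(u, p, N)
X = x[:N, :].T                    # x(k), ..., x(k+N-1)
Op = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(p)])

Ryy, Ryu = Yp @ Yp.T / N, Yp @ Up.T / N
Ruu, Rxx = Up @ Up.T / N, X @ X.T / N
Rxu = X @ Up.T / N

lhs = Ryy - Ryu @ np.linalg.solve(Ruu, Ryu.T)
rhs = Op @ (Rxx - Rxu @ np.linalg.solve(Ruu, Rxu.T)) @ Op.T
# lhs and rhs agree to round-off; with a white-noise input and large N,
# Rxu tends to zero and the right side tends to O_p R_xx O_p^T.
```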