System Realization by Information Matrix (SRIM)
For discrete-time systems, the correlation between inputs, outputs, and
state yield information about the system’s state evolution and response.
In fact, the state equations can be estimated by manipulating
correlation matrices through the method, System Realization by
Information Matrix (SRIM).
Discrete-Time System Matrix Equation
We begin with discrete-time state equations that correspond to the
structure’s dynamics (see Discrete LTI State-Space
Representation).
\[\begin{split}\begin{aligned}
\boldsymbol{x}(k+1) &= \boldsymbol{Ax}(k) + \boldsymbol{Bu}(k) \\
\boldsymbol{y}(k) &= \boldsymbol{Cx}(k) + \boldsymbol{Du}(k)
\end{aligned}\end{split}\]
By noting the state evolution
\[\begin{split}\begin{aligned}
\boldsymbol{x}(k+1) &= \boldsymbol{Ax}(k)+\boldsymbol{B}\boldsymbol{U}_{p}(k)\\
\boldsymbol{x}(k+2) &= \boldsymbol{A}^2\boldsymbol{X}(k) + \boldsymbol{AB}\boldsymbol{U}_{p}(k) + \boldsymbol{Bu}(k+1)\\
\boldsymbol{x}(k+3) &= \boldsymbol{A}^{3}\boldsymbol{X}(k) + \boldsymbol{A}^{2}\boldsymbol{Bu}(k) + \boldsymbol{ABu}(k+1) + \boldsymbol{Bu}(k+2),
\end{aligned}\end{split}\]
we can generalize the response for the timepoint \(k+p-1\):
\[\begin{split}\begin{aligned}
\boldsymbol{x}(k+p) &= \boldsymbol{A}^{p}\boldsymbol{x}(k) + \sum_{i=1}^{p}\boldsymbol{A}^{p-i}\boldsymbol{Bu}(k+i-1)
\\
\boldsymbol{x}(k+p-1) &= \boldsymbol{A}^{p-1}\boldsymbol{x}(k) + \sum_{i=1}^{p-1}\boldsymbol{A}^{p-i-1}\boldsymbol{Bu}(k+i-1)
\\
\boldsymbol{y}(k+p-1) &= \boldsymbol{CA}^{p-1}\boldsymbol{x}(k) + \sum_{i=1}^{p-1}\boldsymbol{CA}^{p-i-1}\boldsymbol{Bu}(k+i-1)+\boldsymbol{Du}(k+p-1)~.
\end{aligned}\end{split}\]
Then, we can vertically stack \(p\) successive time-points into a
column vector and express this vector as \(\boldsymbol{y}_{p}(k)\):
\[\begin{split}\begin{aligned}
\boldsymbol{y}_{p}(k) &= \mathcal{O}_{p}\boldsymbol{x}(k) + \mathcal{T}_{p}\boldsymbol{u}_{p}(k) \\
\begin{bmatrix}
\boldsymbol{y}(k) \\
\boldsymbol{y}(k+1) \\
\vdots \\
\boldsymbol{y}(k+p-1)
\end{bmatrix}
=&
\begin{bmatrix}
\boldsymbol{C} \\
\boldsymbol{CA} \\
\boldsymbol{CA}^{2} \\
\vdots \\
\boldsymbol{CA}^{p-1}
\end{bmatrix}
\boldsymbol{x}(k)
~+ \\
&
\begin{bmatrix}
\boldsymbol{D} \\
\boldsymbol{CB} & \boldsymbol{D} \\
\boldsymbol{CAB} & \boldsymbol{CB} & \boldsymbol{D} \\
\vdots & \vdots & \vdots & \ddots \\
\boldsymbol{CA}^{p-2}\boldsymbol{B} & \boldsymbol{CA}^{p-3}\boldsymbol{B} & \boldsymbol{CA}^{p-4}\boldsymbol{B} & \cdots & \boldsymbol{D}
\end{bmatrix}
\begin{bmatrix}
\boldsymbol{u}(k) \\
\boldsymbol{u}(k+1) \\
\vdots \\
\boldsymbol{u}(k+p-1)
\end{bmatrix}~.
\end{aligned}\end{split}\]
Finally, we horizontally stack \(N\) successive timepoints of these
column vectors in a matrix, to get the matrix equation
\[\boxed{\boldsymbol{Y}_{p}(k) = \mathcal{O}_{p}\boldsymbol{X}(k) + \mathcal{T}_{p}\boldsymbol{U}_{p}(k)} ~,\]
where
\[\begin{split}\begin{aligned}
\boldsymbol{Y}_{p}(k) &= \begin{bmatrix} \boldsymbol{y}_{p}(k) & \boldsymbol{y}_{p}(k+1) & \cdots & \boldsymbol{y}_{p}(k+N-1) \end{bmatrix} \\
&= \begin{bmatrix}
\boldsymbol{y}(k) & \boldsymbol{y}(k+1) & \cdots & \boldsymbol{y}(k+N-1)\\
\boldsymbol{y}(k+1) & \boldsymbol{y}(k+2) & \cdots & \boldsymbol{y}(k+N) \\
\vdots & \vdots & \ddots & \vdots \\
\boldsymbol{y}(k+p-1) & \boldsymbol{y}(k+p) & \cdots & \boldsymbol{y}(k+N+p-2)
\end{bmatrix}
\end{aligned}\end{split}\]
\[\boldsymbol{X}(k) = \begin{bmatrix} \boldsymbol{x}(k) & \boldsymbol{x}(k+1) & \cdots & \boldsymbol{x}(k+N-1) \end{bmatrix}\]
\[\begin{split}\begin{aligned}
\boldsymbol{U}_{p}(k) &= \begin{bmatrix} \boldsymbol{u}_{p}(k) & \boldsymbol{u}_{p}(k+1) & \cdots & \boldsymbol{u}_{p}(k+N-1) \end{bmatrix} \\
&= \begin{bmatrix}
\boldsymbol{u}(k) & \boldsymbol{u}(k+1) & \cdots & \boldsymbol{u}(k+N-1)\\
\boldsymbol{u}(k+1) & \boldsymbol{u}(k+2) & \cdots & \boldsymbol{u}(k+N) \\
\vdots & \vdots & \ddots & \vdots \\
\boldsymbol{u}(k+p-1) & \boldsymbol{u}(k+p) & \cdots & \boldsymbol{u}(k+N+p-2)
\end{bmatrix}~.
\end{aligned}\end{split}\]
State Equation Matrices from Observability Matrix
Now, the state equation matrices \(\boldsymbol{A}\) and
\(\boldsymbol{C}\) can be obtained from the observability matrix
\(\mathcal{O}_p\).
\[\begin{split}\begin{aligned}
\mathcal{O}_{p}
=
\begin{bmatrix}
\boldsymbol{C} \\
\boldsymbol{CA} \\
\boldsymbol{CA}^{2} \\
\vdots \\
\boldsymbol{CA}^{p-1}
\end{bmatrix}
, \quad{}
\mathcal{O}_{p}(:-1)
=
\begin{bmatrix}
\boldsymbol{C} \\
\boldsymbol{CA} \\
\boldsymbol{CA}^{2} \\
\vdots \\
\boldsymbol{CA}^{p-2}
\end{bmatrix}
, \quad{}
\mathcal{O}_{p}(1:)
=
\begin{bmatrix}
\boldsymbol{CA} \\
\boldsymbol{CA}^{2} \\
\boldsymbol{CA}^{3} \\
\vdots \\
\boldsymbol{CA}^{p-1}
\end{bmatrix}
\end{aligned}\end{split}\]
\[\boldsymbol{A} = \mathcal{O}_{p}(:-1)^{+}\mathcal{O}_{p}(1:)\]
\[\boldsymbol{C} = \mathcal{O}_{p}(0)\]
Appendix: Manipulation of discrete-time system matrix equation into correlation matrix relationships
In (Juang 1997), the discrete-time
system matrix equation is manipulated into a form describing the
relationship between correlation matrices \(\boldsymbol{R}_{yy}\),
\(\boldsymbol{R}_{yu}\), \(\boldsymbol{R}_{uu}\), and
\(\boldsymbol{R}_{xx}\).
Post-multiplying the discrete-time system matrix
equation by
\(\frac{1}{N}\boldsymbol{U}_{p}^{T}(k)\):
\[\begin{split}\begin{aligned}
\frac{1}{N}\boldsymbol{Y}_{p}(k)\boldsymbol{U}_{p}^{T}(k) &= \mathcal{O}_{p}\frac{1}{N}\boldsymbol{X}(k)\boldsymbol{U}_{p}^{T}(k) + \mathcal{T}_{p}\frac{1}{N}\boldsymbol{U}_{p}(k)\boldsymbol{U}_{p}^{T}(k)
\\
\boldsymbol{R}_{yu} &= \mathcal{O}_{p}\boldsymbol{R}_{xu} + \mathcal{T}_{p}\boldsymbol{R}_{uu}
\\
\mathcal{T}_{p} &= \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}
\end{aligned}\end{split}\]
Post-multiplying by \(\frac{1}{N}\boldsymbol{Y}_{p}^{T}(k)\):
\[\begin{split}\begin{aligned}
\frac{1}{N}\boldsymbol{Y}_{p}(k)\boldsymbol{Y}_{p}^{T}(k) &= \mathcal{O}_{p}\frac{1}{N}\boldsymbol{X}(k)\boldsymbol{Y}_{p}^{T}(k) + \mathcal{T}_{p}\frac{1}{N}\boldsymbol{U}_{p}(k)\boldsymbol{Y}_{p}^{T}(k)
\\
\boldsymbol{R}_{yy} &= \mathcal{O}_{p}\boldsymbol{R}_{yx}^{T} + \mathcal{T}_{p}\boldsymbol{R}_{yu}^{T}
\\
\boldsymbol{R}_{yy} &= \mathcal{O}_{p}\boldsymbol{R}_{yx}^{T} + \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T}
\end{aligned}\end{split}\]
Post-multiplying by \(\frac{1}{N}\boldsymbol{X}_{p}^{T}(k)\):
\[\begin{split}\begin{aligned}
\frac{1}{N}\boldsymbol{Y}_{p}(k)\boldsymbol{X}_{p}^{T}(k) &= \mathcal{O}_{p}\frac{1}{N}\boldsymbol{X}(k)\boldsymbol{X}_{p}^{T}(k) + \mathcal{T}_{p}\frac{1}{N}\boldsymbol{U}_{p}(k)\boldsymbol{X}_{p}^{T}(k)
\\
\boldsymbol{R}_{yx} &= \mathcal{O}_{p}\boldsymbol{R}_{xx} + \mathcal{T}_{p}\boldsymbol{R}_{xu}^{T}
\\
\boldsymbol{R}_{yx} &= \mathcal{O}_{p}\boldsymbol{R}_{xx} + \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{xu}^{T}
\end{aligned}\end{split}\]
Substituting the equation for \(\boldsymbol{R}_{yx}\) into the equation
for \(\boldsymbol{R}_{yy}\):
\[\begin{split}\begin{aligned}
\boldsymbol{R}_{yy} =& ~\mathcal{O}_{p}
\left(\mathcal{O}_{p}\boldsymbol{R}_{xx} + \left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{xu}^{T}\right)^{T}
\\
&+
\left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T}
\\
=& ~\mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T}
+ \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \left( \boldsymbol{R}_{yu}^{T} - \boldsymbol{R}_{xu}^{T}\mathcal{O}_{p}^{T} \right)
\\
&+
\left( \boldsymbol{R}_{yu} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \right)\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T}
\\
=& ~\mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T}
+ \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{yu}^{T} - \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{xu}^{T}\mathcal{O}_{p}^{T}
\\
&+
\boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} - \mathcal{O}_{p}\boldsymbol{R}_{xu} \boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T}
\\
=& ~\mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T}
- \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{xu}^{T}\mathcal{O}_{p}^{T} +
\boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T}
\end{aligned}\end{split}\]
Moving all of the terms that can be composed with measured data to the
left side:
\[\begin{split}\begin{aligned}
\boldsymbol{R}_{yy} - \boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T}
&= \mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T} - \mathcal{O}_{p}\boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{xu}^{T}\mathcal{O}_{p}^{T} \\
&= \mathcal{O}_{p}\left( \boldsymbol{R}_{xx} - \boldsymbol{R}_{xu}\boldsymbol{R}_{uu}^{-1} \boldsymbol{R}_{xu}^{T} \right) \mathcal{O}_{p}^{T}
\end{aligned}\end{split}\]
We make the assumption that all current and future input data is
uncorrelated with the current state, which means that the average of the
products \(\boldsymbol{x}(k)\boldsymbol{u}(k+i), ~~ i \in [0,1,2,\dots]\) is
zero. This gives:
\[\begin{split}\begin{aligned}
\boldsymbol{R}_{xu} &=
\frac{1}{N}
\begin{bmatrix}
\sum_{j=0}^{N-1}\boldsymbol{x}(k+j)\boldsymbol{u}(k+j) \\
\sum_{j=0}^{N-1}\boldsymbol{x}(k+j)\boldsymbol{u}(k+j+1) \\
\sum_{j=0}^{N-1}\boldsymbol{x}(k+j)\boldsymbol{u}(k+j+2) \\
\vdots \\
\sum_{j=0}^{N-1}\boldsymbol{x}(k+j)\boldsymbol{u}(k+j+p-1)
\end{bmatrix}^{T} \\
&=
\boldsymbol{0}
\end{aligned}\end{split}\]
in order to yield:
\[\boldsymbol{R}_{yy} - \boldsymbol{R}_{yu}\boldsymbol{R}_{uu}^{-1}\boldsymbol{R}_{yu}^{T} = \mathcal{O}_{p}\boldsymbol{R}_{xx}\mathcal{O}_{p}^{T}~.\]