Blockwise Matrix Inversion

I’m taking a statistics course on the theory of linear models, which covers Gauss-Markov models and various extensions of them. When working with partitioned matrices, and with multivariate normal distributions in particular, we often need to invert matrices blockwise. This has come up often enough during the course (it was even necessary knowledge for a midterm question) that I figured I should document some of the inversion lemmas.

Let’s define our partitioned matrix as

$$ R = \begin{bmatrix} A & B \\
C & D \end{bmatrix}$$

We are specifically interested in finding

$$ R^{-1} = \begin{bmatrix} W & X \\
Y & Z \end{bmatrix}$$

such that

$$ R R^{-1} = R^{-1}R = \begin{bmatrix} I & 0 \\
0 & I \end{bmatrix}$$

Part 1: $R R^{-1}$

For the right inverse ($R R^{-1} = I$), multiplying out the blocks gives

$$ \begin{aligned} AW + BY = I \\
AX + BZ = 0 \\
CW + DY = 0 \\
CX + DZ = I \\
\end{aligned} $$

and, assuming $A$ and $D$ are invertible, the second and third equations give

$$\begin{aligned} X = -A^{-1}BZ \\
Y = -D^{-1}CW \\
\end{aligned}$$

We can plug these identities back into the first and fourth equations to get

$$\begin{aligned} AW + B(-D^{-1}CW) &= (A - BD^{-1}C)W = I \\
C(-A^{-1}BZ) + DZ &= (D - CA^{-1}B)Z = I \\
\end{aligned}$$

so that

$$\begin{aligned} W = (A-BD^{-1}C)^{-1} \\
Z = (D-CA^{-1}B)^{-1} \\
\end{aligned}$$

and finally

$$ R^{-1} = \begin{bmatrix} W & X \\
Y & Z \end{bmatrix} = \begin{bmatrix} (A-BD^{-1}C)^{-1} & -A^{-1}B(D-CA^{-1}B)^{-1} \\
-D^{-1}C(A-BD^{-1}C)^{-1} & (D-CA^{-1}B)^{-1} \\
\end{bmatrix}$$

It is important to note that the above result only holds if $A$, $D$, $(D-CA^{-1}B)$, and $(A-BD^{-1}C)$ are invertible.
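
As a quick numerical sanity check, here’s a minimal NumPy sketch (the block sizes, seed, and variable names are my own choices, mirroring the notation above) that assembles $R^{-1}$ from these formulas and compares it against a direct inverse:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3
# Random Gaussian blocks are invertible with probability 1, as are
# both Schur complements, so the assumptions above hold almost surely.
A, B = rng.normal(size=(n, n)), rng.normal(size=(n, m))
C, D = rng.normal(size=(m, n)), rng.normal(size=(m, m))
A_inv, D_inv = np.linalg.inv(A), np.linalg.inv(D)

W = np.linalg.inv(A - B @ D_inv @ C)  # (A - B D^{-1} C)^{-1}
Z = np.linalg.inv(D - C @ A_inv @ B)  # (D - C A^{-1} B)^{-1}
X = -A_inv @ B @ Z                    # -A^{-1} B (D - C A^{-1} B)^{-1}
Y = -D_inv @ C @ W                    # -D^{-1} C (A - B D^{-1} C)^{-1}

R = np.block([[A, B], [C, D]])
R_inv = np.block([[W, X], [Y, Z]])

assert np.allclose(R @ R_inv, np.eye(n + m))
assert np.allclose(R_inv, np.linalg.inv(R))
```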

Part 2: $R^{-1} R$

Following the same logic as above, we have the following system of equations for the left inverse ($R^{-1} R = I$)

$$\begin{aligned} WA + XC = I \\
WB + XD = 0 \\
YA + ZC = 0 \\
YB + ZD = I \\
\end{aligned}$$

so that (using the expressions for $X$ and $Y$ from Part 1 for the second equalities)

$$\begin{aligned} X = -WBD^{-1} = -A^{-1}BZ \\
Y = -ZCA^{-1} = -D^{-1}CW \\
\end{aligned}$$

which indicates that

$$\begin{aligned} W = (A-BD^{-1}C)^{-1} = C^{-1}D(D-CA^{-1}B)^{-1}CA^{-1} \\
X = -(A-BD^{-1}C)^{-1}BD^{-1} = -A^{-1}B(D-CA^{-1}B)^{-1} \\
\end{aligned}$$

Note that the second expression for $W$ additionally requires $C$ to be invertible, which the general setup does not assume.
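
Both identities are easy to confirm numerically. Here’s a minimal sketch, assuming all four blocks are square and invertible (again, the alternative expression for $W$ needs $C^{-1}$ to exist):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# All four blocks are square here because the identity for W
# additionally requires C to be invertible.
A, B, C, D = (rng.normal(size=(n, n)) for _ in range(4))
A_inv, C_inv, D_inv = np.linalg.inv(A), np.linalg.inv(C), np.linalg.inv(D)

SA_inv = np.linalg.inv(A - B @ D_inv @ C)  # (A - B D^{-1} C)^{-1}
SD_inv = np.linalg.inv(D - C @ A_inv @ B)  # (D - C A^{-1} B)^{-1}

# W: (A - B D^{-1} C)^{-1} = C^{-1} D (D - C A^{-1} B)^{-1} C A^{-1}
assert np.allclose(SA_inv, C_inv @ D @ SD_inv @ C @ A_inv)
# X: (A - B D^{-1} C)^{-1} B D^{-1} = A^{-1} B (D - C A^{-1} B)^{-1}
assert np.allclose(SA_inv @ B @ D_inv, A_inv @ B @ SD_inv)
```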

Importantly, blockwise matrix inversion allows us to express the inverse of a larger matrix in terms of its blocks. From here, we can go on to derive the Sherman-Morrison formula and the Woodbury identity, which allow us to do all kinds of cool stuff, like rank-one matrix updates. In the next few posts, I’ll go over some examples of where blockwise matrix inversion is useful, and common scenarios where rank-one updates of matrices are applicable.
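
As a small teaser of where this is headed, here’s a sketch of the Sherman-Morrison formula in action (a toy example of my own, not a derivation), verifying a rank-one update $(A + uv^\top)^{-1}$ against a direct inverse:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.normal(size=(n, n))
u, v = rng.normal(size=(n, 1)), rng.normal(size=(n, 1))
A_inv = np.linalg.inv(A)

# Sherman-Morrison:
# (A + u v^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u)
correction = (A_inv @ u @ v.T @ A_inv) / (1.0 + v.T @ A_inv @ u)
assert np.allclose(A_inv - correction, np.linalg.inv(A + u @ v.T))
```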
