1 Overview

In this tutorial, we’ll cover the basics of matrix algebra through manual calculations as well as by using functions available in R for working with matrices. The code presented should enable you to manipulate and calculate many different types of matrices, such as adjugate matrices, standardized score matrices, and deviation score matrices.

Additionally, we’ll use some simple example data to cover how to create correlation matrices and variance/covariance matrices. We’ll also use this data to calculate a regression through matrix algebra.

1.1 Packages (and versions) used in this document

## [1] "R version 4.0.2 (2020-06-22)"
##  Package Version
##    psych   2.0.7

2 How to Make a Matrix

Making matrices is easy to do in R and doesn’t require any packages beyond base R. We’ll start off by making two separate 2x2 matrices, A and B; matrix names are traditionally uppercase letters. Notice that R infers the matrix’s order from the length of the data and whichever of nrow or ncol you supply - there’s no need to specify both. Can you tell what happens when we specify byrow = TRUE instead of the default byrow = FALSE for matrix B?

A <- matrix(data = 1:4, ncol = 2)
A
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
B <- matrix(data = 5:8, nrow = 2, byrow = TRUE)
B
##      [,1] [,2]
## [1,]    5    6
## [2,]    7    8

Now that we know how to make matrices, we’ll make three more: a 2x3, a 3x2, and a 3x3. We’ll use all five of these matrices later to explore how matrix algebra is done in R.

C <- matrix(data = 1:6, ncol = 3)
C
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
D <- matrix(data = c(rep(x = 1:3, times = 2)), nrow = 3, byrow = TRUE)
D
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    1
## [3,]    2    3
E <- matrix(data = c(rep(x = 1:4, times = 2), 5), ncol = 3)
E
##      [,1] [,2] [,3]
## [1,]    1    4    3
## [2,]    2    1    4
## [3,]    3    2    5

As an aside, never assign anything to “T” or “F” because these are shorthand for the logical values TRUE and FALSE.
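To see the pitfall, here’s a quick sketch:

```r
isTRUE(T)   # TRUE -- T is shorthand for TRUE
T <- 0      # legal, because T is not a reserved word
isTRUE(T)   # FALSE -- any code relying on T being TRUE will now misbehave
rm(T)       # removing our variable restores the base shorthand
isTRUE(T)   # TRUE again
# By contrast, TRUE <- 0 is an error: the full words are reserved.
```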

3 Matrix Addition (& Subtraction)

Adding and subtracting matrices in R is very straightforward. To perform element-wise addition and subtraction, you can simply add and subtract matrices like you would vectors or variables.

Z <- A + B
Z
##      [,1] [,2]
## [1,]    6    9
## [2,]    9   12
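Subtraction works the same way. As a quick sketch (re-creating A and B so the chunk stands alone), we can subtract B from A element-wise:

```r
A <- matrix(data = 1:4, ncol = 2)                # same A as above
B <- matrix(data = 5:8, nrow = 2, byrow = TRUE)  # same B as above
Q <- A - B   # element-wise subtraction
Q
##      [,1] [,2]
## [1,]   -4   -3
## [2,]   -5   -4
```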

4 Matrix Multiplication

Multiplying matrices in R isn’t as straightforward. If we try to multiply them like we would vectors or variables, we’ll perform element-wise multiplication, which probably isn’t what you’re intending to do. We can see this with matrix W below.

W <- A * B
W
##      [,1] [,2]
## [1,]    5   18
## [2,]   14   32

To perform matrix multiplication, we must use a special operator, %*%, that lets R know we don’t intend element-wise multiplication. When doing this, you need to ensure that your matrices conform to the multiplication rule:

The number of columns of the pre-multiplied matrix must be equal to the number of rows of the post-multiplied matrix.

V <- A %*% B
V
##      [,1] [,2]
## [1,]   26   30
## [2,]   38   44

Now let’s try a bit more complex of a matrix multiplication problem. Notice how the dimensions of these matrices conform to the multiplication rule:
[2x2] * [2x3] * [3x3] * [3x2] * [2x2] = [2x2]

U <- A %*% C %*% E %*% D %*% B
U
##       [,1]  [,2]
## [1,] 10048 11702
## [2,] 14752 17180

5 Column (& Row) Sums

5.1 Column sums manually

To calculate column sums for matrix E we first need to make a row vector of the appropriate length containing only ones. Remember that vector names are traditionally lowercase letters.

ir <- matrix(c(rep(x = 1, times = 3)), nrow = 1)
ir
##      [,1] [,2] [,3]
## [1,]    1    1    1

We’ll now pre-multiply E by ir to create a vector that contains the sums of E’s columns.

ecs <- ir %*% E
E
##      [,1] [,2] [,3]
## [1,]    1    4    3
## [2,]    2    1    4
## [3,]    3    2    5
ecs
##      [,1] [,2] [,3]
## [1,]    6    7   12

5.2 Row sums manually

How might we calculate the sums of the rows?
We’ll make a column vector of 1s and do the same process as we did for the column sums.

ic <- matrix(c(rep(x = 1, times = 3)), ncol = 1)
ic
##      [,1]
## [1,]    1
## [2,]    1
## [3,]    1
ers <- E %*% ic
E
##      [,1] [,2] [,3]
## [1,]    1    4    3
## [2,]    2    1    4
## [3,]    3    2    5
ers
##      [,1]
## [1,]    8
## [2,]    7
## [3,]   10

5.3 Column and row sums quickly

It’s good to know how to do things manually, but it’s not always necessary in practice to do things the long way. So now we’ll look at a couple of functions in R that will make our lives easier: colSums() and rowSums().

ecs2 <- colSums(E)
ecs2
## [1]  6  7 12
ers2 <- rowSums(E)
ers2
## [1]  8  7 10

Bear in mind that these functions return a numeric vector, not a matrix. You can always check the class of an object with class().

class(ecs2)
## [1] "numeric"

Let’s fix this by letting R know that we want them to be matrices.

ecs3 <- matrix(colSums(E), nrow = 1)
ers3 <- matrix(rowSums(E), ncol = 1)
class(ecs3)
## [1] "matrix" "array"

6 Column (& Row) Means

As in “regular algebra,” to get a mean we need to divide by n. To do this in matrix algebra, we multiply by (1/n).

6.1 Means manually

ecm <- (1/3)*ecs3
ecm
##      [,1]     [,2] [,3]
## [1,]    2 2.333333    4
erm <- (1/3)*ers3
erm
##          [,1]
## [1,] 2.666667
## [2,] 2.333333
## [3,] 3.333333

6.2 Means quickly

Or we can use a function in R to get our answer a bit more easily.

ecm2 <- colMeans(E)
ecm2
## [1] 2.000000 2.333333 4.000000
erm2 <- rowMeans(E)
erm2
## [1] 2.666667 2.333333 3.333333

7 Deviation Score Matrix

The deviation score matrix consists of our matrix E less the respective column means. To create it, we first need to make a matrix of column means with the same order as E. We’ll just use our column of 1s from earlier to accomplish this.

ECM <- ic %*% ecm
ECM
##      [,1]     [,2] [,3]
## [1,]    2 2.333333    4
## [2,]    2 2.333333    4
## [3,]    2 2.333333    4

Now we can subtract this new matrix of means, ECM, from the original matrix, E. One way to check that this is correct is to take the sum of each column, which should be equal to 0.

devE <- E - ECM
devE
##      [,1]       [,2] [,3]
## [1,]   -1  1.6666667   -1
## [2,]    0 -1.3333333    0
## [3,]    1 -0.3333333    1
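We can run that check directly with colSums(); a quick sketch, re-creating E and devE so the chunk stands alone:

```r
E <- matrix(data = c(rep(x = 1:4, times = 2), 5), ncol = 3)  # same E as above
ECM <- matrix(data = 1, nrow = 3, ncol = 1) %*% matrix(data = colMeans(E), nrow = 1)
devE <- E - ECM
colSums(devE)   # each column sums to (approximately) zero
```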

8 Calculating a Determinant

Before we get into the calculation, we should note that a determinant is a scalar (a single value) associated with a square matrix. It has a number of uses, but the most meaningful for us is that it helps in calculating the inverse of a matrix.

8.1 Determinant manually

8.1.1 Alphabetical example

Let’s first make two example matrices. Note that A is a 2x2 and B is a 3x3. Both of these are square matrices, and only square matrices have determinants.

A <- matrix(data = c("a", "b", "c", "d"), nrow = 2, byrow = TRUE)
A
##      [,1] [,2]
## [1,] "a"  "b" 
## [2,] "c"  "d"
B <- matrix(data = c("a", "b", "c", "d", "e", "f", "g", "h", "i"), nrow = 3, byrow = TRUE)
B
##      [,1] [,2] [,3]
## [1,] "a"  "b"  "c" 
## [2,] "d"  "e"  "f" 
## [3,] "g"  "h"  "i"

To calculate a determinant, we’re going to first calculate all of the left-to-right diagonal products and then subtract the right-to-left diagonal products from that. Calculating the determinant of a 2x2 is fairly straightforward. For matrix A, the determinant would be: (a*d)-(b*c).

Unfortunately, it gets a bit more complicated the larger your matrices get. For B, the determinant is: (a*e*i)+(b*f*g)+(c*d*h)-(c*e*g)-(b*d*i)-(a*f*h).

8.1.2 Numeric example

Now for an actual numeric example, we’ll create two matrices of the same dimensions as above.

D <- matrix(data = 1:4, nrow = 2, byrow = TRUE)
D
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
E <- matrix(data = (c((1:8), 10)), nrow = 3, byrow = TRUE)
E
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8   10

Thinking back to what we just learned, the determinant for matrix D should equal:

d <- (D[1,1]*D[2,2]) - (D[1,2]*D[2,1])
d
## [1] -2

Notice that this code is just calling on individual elements of matrix D. I’m sure you can see how this could quickly get out of hand for larger matrices.

e <- (E[1,1]*E[2,2]*E[3,3]) + (E[1,2]*E[2,3]*E[3,1]) + (E[1,3]*E[2,1]*E[3,2]) -
  (E[1,3]*E[2,2]*E[3,1]) - (E[1,2]*E[2,1]*E[3,3]) - (E[1,1]*E[2,3]*E[3,2])
e
## [1] -3

8.2 Determinant quickly

Thankfully, R makes this a bit easier for us with the det() function.

d2 <- det(x = D)
d2
## [1] -2
e2 <- det(x = E)
e2
## [1] -3

9 Calculating an Adjugate (adjoint) Matrix

Simply put, the adjugate is a matrix that, when multiplied by the original matrix, yields the determinant times the identity matrix.

9.1 Alphabetical example

Using our matrix A, we can demonstrate how this works. Notice that the elements on the main diagonal have swapped places, and the off-diagonal elements have been negated.

adjA <- matrix(data = c("d", "-b", "-c", "a"), nrow = 2, byrow = TRUE)
adjA
##      [,1] [,2]
## [1,] "d"  "-b"
## [2,] "-c" "a"

9.2 Numeric example

Let’s plug in the corresponding numbers from matrix D.

adjD <- matrix(data = c(4, -2, -3, 1), nrow = 2, byrow = TRUE)
adjD
##      [,1] [,2]
## [1,]    4   -2
## [2,]   -3    1

Multiplying our original D by adjD should yield a 2x2 matrix with our determinant on the diagonal and 0s elsewhere.

D %*% adjD
##      [,1] [,2]
## [1,]   -2    0
## [2,]    0   -2
det(x = D)
## [1] -2

10 Calculating a Matrix Inverse

10.1 Inverse manually

Now, if we take what we did in the previous step, we can get our inverse: we divide the adjugate matrix by the determinant.

Di <- (1/d2) * adjD
Di
##      [,1] [,2]
## [1,] -2.0  1.0
## [2,]  1.5 -0.5

10.2 Inverse quickly

It’s good to understand how all of this works together, but R can calculate an inverse for you (if one exists).

Di2 <- solve(D)
Di2
##      [,1] [,2]
## [1,] -2.0  1.0
## [2,]  1.5 -0.5

Just to double check that everything worked correctly, we should be able to multiply our inverse by our original matrix and get the identity matrix (a matrix with 1s on the diagonal and 0s elsewhere) back.

I <- Di2 %*% D
I
##      [,1]         [,2]
## [1,]    1 4.440892e-16
## [2,]    0 1.000000e+00

This worked, aside from small floating-point rounding errors.
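If those tiny values bother you, base R’s zapsmall() rounds values that are very close to zero back to zero; a quick sketch, re-creating D so the chunk stands alone:

```r
D <- matrix(data = 1:4, nrow = 2, byrow = TRUE)  # same D as above
zapsmall(solve(D) %*% D)                         # cleans up the near-zero entries
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    1
```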

11 Other Important Matrix Functions in R

11.1 t()

You will often need to transpose your matrices/vectors, which you can do with t(). I named the result “Ep” (for E-prime) in place of E’ because you can’t use that notation in R.

Ep <- t(x = E)
Ep
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6   10

11.2 diag()

The diag() function has multiple uses. First, we can use it to create a diagonal matrix with a value (the argument x) on the diagonal; x can be a scalar, a vector, or a matrix.

S <- diag(x = 5, nrow = 3, ncol = 3)
S
##      [,1] [,2] [,3]
## [1,]    5    0    0
## [2,]    0    5    0
## [3,]    0    0    5

Stemming from the first point, we can use diag() to create an identity matrix.

I <- diag(x = 1, nrow = 3, ncol = 3)
I
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1

Finally, we can use it to create a vector of the diagonal values in a matrix. As before, this returns a numeric vector; you can wrap the result in matrix() to change this.

s <- diag(x = S)
s
## [1] 5 5 5

11.3 solve()

Another handy function is solve(), which allows you to calculate an inverse.

Ei <- solve(E)
Ei
##            [,1]      [,2] [,3]
## [1,] -0.6666667 -1.333333    1
## [2,] -0.6666667  3.666667   -2
## [3,]  1.0000000 -2.000000    1

11.4 eigen()

Finally, we’ll talk about the eigen() function. This function returns a list that contains the eigenvalues as well as the eigenvectors of your specified matrix. To see the eigenvalues/eigenvectors, you can subset them out of the list.

Eeigen <- eigen(x = E)
Eeigen$values
## [1] 16.7074933 -0.9057402  0.1982469
Eeigen$vectors
##            [,1]       [,2]       [,3]
## [1,] -0.2235134 -0.8658458  0.2782965
## [2,] -0.5039456  0.0856512 -0.8318468
## [3,] -0.8343144  0.4929249  0.4801895
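As a sanity check, each eigenpair should satisfy E v = λ v. A quick sketch, re-creating E so the chunk stands alone:

```r
E <- matrix(data = c(1:8, 10), nrow = 3, byrow = TRUE)  # same E as above
Eeigen <- eigen(x = E)
v1 <- Eeigen$vectors[, 1]      # first eigenvector
lambda1 <- Eeigen$values[1]    # first (largest) eigenvalue
all.equal(as.vector(E %*% v1), lambda1 * v1)
## [1] TRUE
```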

12 Data Illustration

For this illustration, we’ll use the lunch data that is discussed by Wood (2019). This data represents ratings of lunch items over the course of 5 school days - higher numbers indicate higher satisfaction with the item.

To start things off we need to create our data matrix. We’ll also go ahead and create a column of 1s.

sandwich <- c(1:5)
beverage <- c(4, 7, 5, 8, 6)
dessert <- c(4, 2, 6, 10, 8)
L <- cbind(sandwich, beverage, dessert)
L
##      sandwich beverage dessert
## [1,]        1        4       4
## [2,]        2        7       2
## [3,]        3        5       6
## [4,]        4        8      10
## [5,]        5        6       8
ic <- matrix(c(rep(x = 1, times = 5)), ncol = 1)
ic
##      [,1]
## [1,]    1
## [2,]    1
## [3,]    1
## [4,]    1
## [5,]    1

12.1 Variance/covariance matrix manually

First, we need to create our deviation score matrix like we did earlier. Notice that in this one line of code, we’re calculating the column means, making a matrix of the appropriate size, and then subtracting that matrix from the original matrix.

X <- L - (ic %*% matrix(colMeans(L), nrow = 1))
X
##      sandwich beverage dessert
## [1,]       -2       -2      -2
## [2,]       -1        1      -4
## [3,]        0       -1       0
## [4,]        1        2       4
## [5,]        2        0       2

To calculate the variance/covariance matrix, we’re going to take our deviation matrix (X), pre-multiply its transpose, and divide by n-1: [1/(n-1)]*X'X

C <- 1/4*(t(X) %*% X)
C
##          sandwich beverage dessert
## sandwich     2.50     1.25       4
## beverage     1.25     2.50       2
## dessert      4.00     2.00      10

12.2 Correlation matrix manually

To calculate the correlation matrix, we first need to make a diagonal matrix to allow us to divide our deviation score matrix by the respective variable standard deviations.

S <- diag(x = c(1/sd(sandwich), 1/sd(beverage), 1/sd(dessert)), nrow = 3, ncol = 3)
S
##           [,1]      [,2]      [,3]
## [1,] 0.6324555 0.0000000 0.0000000
## [2,] 0.0000000 0.6324555 0.0000000
## [3,] 0.0000000 0.0000000 0.3162278

Next, we need to multiply our deviation matrix by this new matrix, S.

Z <- X %*% S
Z
##            [,1]       [,2]       [,3]
## [1,] -1.2649111 -1.2649111 -0.6324555
## [2,] -0.6324555  0.6324555 -1.2649111
## [3,]  0.0000000 -0.6324555  0.0000000
## [4,]  0.6324555  1.2649111  1.2649111
## [5,]  1.2649111  0.0000000  0.6324555

We’ll then pre-multiply this new matrix, Z, by its transpose and multiply by [1/(n-1)].

R <- (1/4)*(t(Z) %*% Z)
R
##      [,1] [,2] [,3]
## [1,]  1.0  0.5  0.8
## [2,]  0.5  1.0  0.4
## [3,]  0.8  0.4  1.0

12.3 Squared multiple correlation (SMC) manually

Sometimes you’ll need to calculate the SMC. This can be done by hand easily. We’ll use our previous correlation matrix, R.

SMC <- 1 - 1/diag(solve(R)) 
SMC
## [1] 0.6785714 0.2500000 0.6400000

12.4 Letting R make all of this easier

Now that we’ve had our fun, we can verify our matrix skills.

12.4.1 Variance/covariance matrix quickly

Compare your variance/covariance matrix, C, with the result of either var() or cov() - both return the variance/covariance matrix.

C2 <- var(x = X)
C3 <- cov(x = X)
all.equal(C, C2)
## [1] TRUE
all.equal(C, C3)
## [1] TRUE

12.4.2 Correlation matrix quickly

Similarly, we can check our correlation matrix with cor().

R2 <- cor(x = Z, method = "pearson")
all.equal(R, R2)
## [1] TRUE

12.4.3 Squared multiple correlation (SMC) quickly

There is a function, smc(), in the psych package that will do this for you.

SMC2 <- smc(R)
SMC2
##        V1        V2        V3 
## 0.6785714 0.2500000 0.6400000
SMC
## [1] 0.6785714 0.2500000 0.6400000

13 Regression through Matrix Algebra

To calculate a regression in matrix algebra, you will get very familiar with the following equation:

\((X'X)^{-1} * X'Y\) (read as: x-prime x inverse x-prime y)

13.1 Data preparation

For this example, we’ll use the lunch data as before. First, we’ll set up our data for our friend lm() - we’re making a copy of the same data in matrix L just in a data frame format.

dat <- as.data.frame(cbind(sandwich, beverage, dessert))
dat
##   sandwich beverage dessert
## 1        1        4       4
## 2        2        7       2
## 3        3        5       6
## 4        4        8      10
## 5        5        6       8

Now, we’ll set it up for matrix algebra. Here we’re creating an X matrix that includes the predictor variables, sandwich and beverage, and a Y matrix that includes the outcome variable, dessert. Don’t forget that we also need a column of 1s in X.

X <- cbind(c(rep(x = 1, times = 5)), sandwich, beverage)
X
##        sandwich beverage
## [1,] 1        1        4
## [2,] 1        2        7
## [3,] 1        3        5
## [4,] 1        4        8
## [5,] 1        5        6
Y <- matrix(data = dessert, ncol = 1)
Y
##      [,1]
## [1,]    4
## [2,]    2
## [3,]    6
## [4,]   10
## [5,]    8

13.2 Running the models

Now let’s put everything together and run our regression in matrix form. Remember: \((X'X)^{-1} * X'Y\).
Note: Orders equal [3x5] * [5x3] * [3x5] * [5x1]
What should the order of b equal?

b <- solve((t(X) %*% X)) %*% t(X) %*% Y
b
##                   [,1]
##           1.200000e+00
## sandwich  1.600000e+00
## beverage -1.110223e-15
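To answer the order question: (X'X)^-1 is [3x3] and X'Y is [3x1], so b should be a [3x1] column - one intercept plus two slopes. A quick sketch, re-creating the pieces so the chunk stands alone:

```r
sandwich <- c(1:5)
beverage <- c(4, 7, 5, 8, 6)
dessert <- c(4, 2, 6, 10, 8)
X <- cbind(c(rep(x = 1, times = 5)), sandwich, beverage)  # design matrix with a column of 1s
Y <- matrix(data = dessert, ncol = 1)                     # outcome as a column vector
b <- solve((t(X) %*% X)) %*% t(X) %*% Y
dim(b)   # rows, then columns
## [1] 3 1
```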

Now we’ll run the same model (sandwich and beverage predicting dessert) in lm().

m1 <- lm(dessert ~ sandwich + beverage, data = dat)
summary(m1)
## 
## Call:
## lm(formula = dessert ~ sandwich + beverage, data = dat)
## 
## Residuals:
##          1          2          3          4          5 
##  1.200e+00 -2.400e+00 -6.106e-16  2.400e+00 -1.200e+00 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.200e+00  5.231e+00   0.229    0.840
## sandwich     1.600e+00  9.798e-01   1.633    0.244
## beverage    -1.622e-16  9.798e-01   0.000    1.000
## 
## Residual standard error: 2.683 on 2 degrees of freedom
## Multiple R-squared:   0.64,  Adjusted R-squared:   0.28 
## F-statistic: 1.778 on 2 and 2 DF,  p-value: 0.36

Take a moment and compare the results. What do you notice?

As you can see, other than rounding errors, we get the exact same regression coefficients.