
Section 6.3 Similarity

Objectives
  1. Learn to interpret similar matrices geometrically.
  2. Understand the relationship between the eigenvalues, eigenvectors, and characteristic polynomials of similar matrices.
  3. Recipe: compute $Ax$ in terms of $B, C$ for $A = CBC^{-1}$.
  4. Picture: the geometry of similar matrices.
  5. Vocabulary: similarity.

Some matrices are easy to understand. For instance, a diagonal matrix

$$D = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix}$$

just scales the coordinates of a vector: $D\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2x \\ y/2 \end{pmatrix}$. The purpose of most of the rest of this chapter is to understand complicated-looking matrices by analyzing to what extent they “behave like” simple matrices. For instance, the matrix

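$$A = \frac{1}{10}\begin{pmatrix} 11 & 6 \\ 9 & 14 \end{pmatrix}$$

has eigenvalues $2$ and $1/2$, with corresponding eigenvectors $v_1 = \begin{pmatrix} 2/3 \\ 1 \end{pmatrix}$ and $v_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$. Notice that

$$\begin{aligned} D(xe_1 + ye_2) &= xDe_1 + yDe_2 = 2xe_1 + \tfrac{1}{2}ye_2 \\ A(xv_1 + yv_2) &= xAv_1 + yAv_2 = 2xv_1 + \tfrac{1}{2}yv_2. \end{aligned}$$

Using $v_1, v_2$ instead of the usual coordinates makes $A$ “behave” like a diagonal matrix.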
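The eigenvector claims above can be checked with exact rational arithmetic. The following is a minimal sketch in plain Python; the helper names `matvec` and `scale`, and the sample coefficients `x = 3`, `y = 4`, are illustrative choices and not part of the text.

```python
from fractions import Fraction as F

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def scale(c, v):
    """Multiply every entry of v by the scalar c."""
    return [c * x for x in v]

# The matrix A = (1/10) [[11, 6], [9, 14]] and its eigenvectors from the text.
A = [[F(11, 10), F(6, 10)],
     [F(9, 10), F(14, 10)]]
v1 = [F(2, 3), F(1)]
v2 = [F(-1), F(1)]

Av1 = matvec(A, v1)   # should equal 2 * v1
Av2 = matvec(A, v2)   # should equal (1/2) * v2

# A(x v1 + y v2) should equal 2x v1 + (1/2) y v2 (sample coefficients).
x, y = 3, 4
w = [x * a + y * b for a, b in zip(v1, v2)]
Aw = matvec(A, w)
expected = [2 * x * a + F(1, 2) * y * b for a, b in zip(v1, v2)]

print(Av1 == scale(2, v1), Av2 == scale(F(1, 2), v2), Aw == expected)
```

Using `Fraction` instead of floating point keeps every comparison exact, so equality tests are meaningful.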

Figure 1. The matrices $A$ and $D$ behave similarly. Click “multiply” to multiply the colored points by $D$ on the left and $A$ on the right. (We will see in Section 6.4 why the points follow hyperbolic paths.)

The other case of particular importance will be matrices that “behave” like a rotation matrix: indeed, this will be crucial for understanding Section 6.5 geometrically. See this important note.

In this section, we study in detail the situation when two matrices behave similarly with respect to different coordinate systems. In Section 6.4 and Section 6.5, we will show how to use eigenvalues and eigenvectors to find a simpler matrix that behaves like a given matrix.

Subsection 6.3.1 Similar Matrices

We begin with the algebraic definition of similarity.

Definition

Two $n\times n$ matrices $A$ and $B$ are similar if there exists an invertible $n\times n$ matrix $C$ such that $A = CBC^{-1}$.

One can show that $I_n$ is the only matrix that is similar to $I_n$: if $A = CI_nC^{-1}$, then $A = CC^{-1} = I_n$. The same argument applies to any scalar multiple of $I_n$.

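Similarity is unrelated to row equivalence. Any invertible matrix is row equivalent to $I_n$, but $I_n$ is the only matrix similar to $I_n$. For instance,

$$\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$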

are row equivalent but not similar.

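As suggested by its name, similarity is what is called an equivalence relation. This means that it satisfies the following properties.

  1. Reflexivity: $A$ is similar to itself, since $A = I_nAI_n^{-1}$.
  2. Symmetry: if $A$ is similar to $B$, then $B$ is similar to $A$, since $A = CBC^{-1}$ implies $B = C^{-1}A(C^{-1})^{-1}$.
  3. Transitivity: if $A$ is similar to $B$ and $B$ is similar to $M$, then $A$ is similar to $M$, since $A = CBC^{-1}$ and $B = PMP^{-1}$ imply $A = (CP)M(CP)^{-1}$.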

We conclude with an observation about similarity and powers of matrices.

Fact

Let $A = CBC^{-1}$. Then $A^n = CB^nC^{-1}$ for every $n \geq 1$.

Proof

First note that

$$A^2 = AA = (CBC^{-1})(CBC^{-1}) = CB(C^{-1}C)BC^{-1} = CBI_nBC^{-1} = CB^2C^{-1}.$$

Next we have

$$A^3 = A^2A = (CB^2C^{-1})(CBC^{-1}) = CB^2(C^{-1}C)BC^{-1} = CB^3C^{-1}.$$

The pattern is clear.
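As a numerical sanity check of this pattern, the sketch below compares $A^n$ with $CB^nC^{-1}$ for small $n$. It reuses the matrices from the introduction, with $B$ the diagonal matrix with entries $2$ and $1/2$ and $C$ the matrix with columns $v_1 = (2/3, 1)$ and $v_2 = (-1, 1)$; the helpers `matmul`, `inv2`, and `power` are hypothetical names, and `inv2` handles only the $2\times 2$ case.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = F(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def power(M, n):
    """n-th power of a 2x2 matrix by repeated multiplication."""
    result = [[F(1), F(0)], [F(0), F(1)]]
    for _ in range(n):
        result = matmul(result, M)
    return result

B = [[F(2), F(0)], [F(0), F(1, 2)]]
C = [[F(2, 3), F(-1)], [F(1), F(1)]]
C_inv = inv2(C)
A = matmul(matmul(C, B), C_inv)

# Compare A^n with C B^n C^{-1} for n = 1, ..., 5.
checks = [power(A, n) == matmul(matmul(C, power(B, n)), C_inv)
          for n in range(1, 6)]
print(checks)
```

Because $B$ is diagonal, `power(B, n)` only raises the diagonal entries to the $n$-th power, which is why this identity makes powers of $A$ cheap to compute.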

Subsection 6.3.2 Geometry of Similar Matrices

Similarity is a very interesting construction when viewed geometrically. We will see that, roughly, similar matrices do the same thing in different coordinate systems. The reader might want to review $\mathcal{B}$-coordinates and nonstandard coordinate grids in Section 3.5, as well as $(\mathcal{B},\mathcal{C})$-matrices in Section 4.7, before reading this subsection.

Recall that (by conditions 4 and 5 of the invertible matrix theorem in Section 6.1) an $n\times n$ matrix $C$ is invertible if and only if its columns $v_1, v_2, \ldots, v_n$ form a basis for $\mathbb{R}^n$. This means we can speak of the $\mathcal{C}$-coordinates of a vector in $\mathbb{R}^n$, where $\mathcal{C}$ is the basis of columns of $C$. Recall from Section 4.7 that this means

$$C\,{}_{\mathcal{C}}[x] = x \quad\text{and}\quad C^{-1}x = {}_{\mathcal{C}}[x].$$

Observation

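If the linear map $T\colon \mathbb{R}^n \to \mathbb{R}^n$ has standard matrix $A$, and $C$ is the $n\times n$ matrix with columns given by the basis $\mathcal{C}$, then the $(\mathcal{C},\mathcal{C})$-matrix of $T$ is

$$B := {}_{\mathcal{C}}[T]_{\mathcal{C}} = {}_{\mathcal{C}}[\mathrm{Id}]_{\mathcal{E}}\;{}_{\mathcal{E}}[T]_{\mathcal{E}}\;{}_{\mathcal{E}}[\mathrm{Id}]_{\mathcal{C}} = C^{-1}AC,$$

so $A = CBC^{-1}$.

In other words, two $n\times n$ matrices $A$ and $B$ are similar if and only if they represent the same linear map $T\colon \mathbb{R}^n \to \mathbb{R}^n$, expressed in different bases.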

Let's now illustrate this more concretely. Suppose that $A = CBC^{-1}$. The above observation gives us another way of computing $Ax$ for a vector $x$ in $\mathbb{R}^n$. Recall that $CBC^{-1}x = C(B(C^{-1}x))$, so that multiplying $CBC^{-1}$ by $x$ means first multiplying by $C^{-1}$, then by $B$, then by $C$. See this example in Section 4.4.

Recipe: Computing Ax in terms of B

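Suppose that $A = CBC^{-1}$, where $C$ is an invertible matrix with columns $v_1, v_2, \ldots, v_n$, and let $\mathcal{C} = (v_1, v_2, \ldots, v_n)$ be the corresponding basis for $\mathbb{R}^n$. Let $x$ be a vector in $\mathbb{R}^n$. To compute $Ax$, one does the following:

  1. Multiply $x$ by $C^{-1}$, which changes to the $\mathcal{C}$-coordinates: ${}_{\mathcal{C}}[x] = C^{-1}x$.
  2. Multiply this by $B$: $B\,{}_{\mathcal{C}}[x] = BC^{-1}x$.
  3. Interpreting this vector as a $\mathcal{C}$-coordinate vector, we multiply it by $C$ to change back to the usual coordinates: $Ax = CBC^{-1}x = CB\,{}_{\mathcal{C}}[x]$.

[Diagram: starting from $x$ in the usual coordinates, multiply by $C^{-1}$ to reach ${}_{\mathcal{C}}[x]$ in the $\mathcal{C}$-coordinates, multiply by $B$ to reach $B\,{}_{\mathcal{C}}[x]$, then multiply by $C$ to return to $Ax$ in the usual coordinates.]

To summarize: if $A = CBC^{-1}$, then $A$ and $B$ do the same thing, only in different coordinate systems.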

The following example is the heart of this section.

Example

Consider the matrices

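$$A = \begin{pmatrix} 1/2 & 3/2 \\ 3/2 & 1/2 \end{pmatrix} \qquad B = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix} \qquad C = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$

One can verify that $A = CBC^{-1}$: see this example in Section 6.4. Let $v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $v_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$, the columns of $C$, and let $\mathcal{C} = (v_1, v_2)$, a basis of $\mathbb{R}^2$.

The matrix $B$ is diagonal: it scales the $x$-direction by a factor of $2$ and the $y$-direction by a factor of $-1$.

[Figure: the standard basis vectors $e_1, e_2$ and their images $Be_1, Be_2$ under $B$.]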

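To compute $Ax$, first we multiply by $C^{-1}$ to find the $\mathcal{C}$-coordinates of $x$, then we multiply by $B$, then we multiply by $C$ again. For instance, let $x = \begin{pmatrix} 0 \\ -2 \end{pmatrix}$.

  1. We see from the $\mathcal{C}$-coordinate grid below that $x = -v_1 + v_2$. Therefore, $C^{-1}x = {}_{\mathcal{C}}[x] = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$.
  2. Multiplying by $B$ scales the coordinates: $B\,{}_{\mathcal{C}}[x] = \begin{pmatrix} -2 \\ -1 \end{pmatrix}$.
  3. Interpreting $\begin{pmatrix} -2 \\ -1 \end{pmatrix}$ as a $\mathcal{C}$-coordinate vector, we multiply by $C$ to get
    $$Ax = C\begin{pmatrix} -2 \\ -1 \end{pmatrix} = -2v_1 - v_2 = \begin{pmatrix} -3 \\ -1 \end{pmatrix}.$$
    Of course, this vector lies at $(-2,-1)$ on the $\mathcal{C}$-coordinate grid.

[Diagram: in the $\mathcal{C}$-coordinates, ${}_{\mathcal{C}}[x]$ is sent to $B\,{}_{\mathcal{C}}[x]$ by scaling $x$ by $2$ and $y$ by $-1$; multiplying by $C^{-1}$ and by $C$ passes between these and the vectors $x$ and $Ax$ in the usual coordinates.]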

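Now let $x = \frac{1}{2}\begin{pmatrix} 5 \\ -3 \end{pmatrix}$.

  1. We see from the $\mathcal{C}$-coordinate grid that $x = \frac{1}{2}v_1 + 2v_2$. Therefore, $C^{-1}x = {}_{\mathcal{C}}[x] = \begin{pmatrix} 1/2 \\ 2 \end{pmatrix}$.
  2. Multiplying by $B$ scales the coordinates: $B\,{}_{\mathcal{C}}[x] = \begin{pmatrix} 1 \\ -2 \end{pmatrix}$.
  3. Interpreting $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$ as a $\mathcal{C}$-coordinate vector, we multiply by $C$ to get
    $$Ax = C\begin{pmatrix} 1 \\ -2 \end{pmatrix} = v_1 - 2v_2 = \begin{pmatrix} -1 \\ 3 \end{pmatrix}.$$
    This vector lies at $(1,-2)$ on the $\mathcal{C}$-coordinate grid.

[Diagram: in the $\mathcal{C}$-coordinates, ${}_{\mathcal{C}}[x]$ is sent to $B\,{}_{\mathcal{C}}[x]$ by scaling $x$ by $2$ and $y$ by $-1$; multiplying by $C^{-1}$ and by $C$ passes between these and the vectors $x$ and $Ax$ in the usual coordinates.]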
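The two computations above follow the three-step recipe, which can be sketched in a few lines of Python with exact fractions. The function name `apply_in_steps` and the helpers `matmul`, `matvec`, and `inv2` are hypothetical names introduced here for illustration; `inv2` handles only $2\times 2$ matrices.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inv2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = F(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(1, 2), F(3, 2)], [F(3, 2), F(1, 2)]]
B = [[F(2), F(0)], [F(0), F(-1)]]
C = [[F(1), F(1)], [F(1), F(-1)]]   # columns are v1 = (1, 1), v2 = (1, -1)
C_inv = inv2(C)

def apply_in_steps(x):
    """Return (C-coordinates of x, B times those coordinates, A x)."""
    coords = matvec(C_inv, x)        # step 1: change to C-coordinates
    scaled = matvec(B, coords)       # step 2: apply B
    result = matvec(C, scaled)       # step 3: change back to usual coordinates
    return coords, scaled, result

first = apply_in_steps([F(0), F(-2)])
second = apply_in_steps([F(5, 2), F(-3, 2)])
print(first)
print(second)
```

The last entry of each returned tuple agrees with multiplying by $A$ directly, which is the content of $A = CBC^{-1}$.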

To summarize:

  • $B$ scales the $e_1$-direction by $2$ and the $e_2$-direction by $-1$.
  • $A$ scales the $v_1$-direction by $2$ and the $v_2$-direction by $-1$.

[Figure: $e_1, e_2$ and their images $Be_1, Be_2$ on one side; $v_1, v_2$ and their images $Av_1, Av_2$ on the other; the two pictures are related by $C$ and $C^{-1}$.]

Figure 13. The geometric relationship between the similar matrices $A$ and $B$ acting on $\mathbb{R}^2$. Click and drag the heads of $x$ and ${}_{\mathcal{C}}[x]$. Study this picture until you can reliably predict where the other three vectors will be after moving one of them: this is the essence of the geometry of similar matrices.

To summarize and generalize the previous example:

A Matrix Similar to a Rotation Matrix

Let

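$$B = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad C = \begin{pmatrix} | & | \\ v_1 & v_2 \\ | & | \end{pmatrix} \qquad A = CBC^{-1},$$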

where C is assumed invertible. Then:

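  • $B$ rotates the plane by an angle of $\theta$ around the circle centred at the origin and passing through $e_1$ and $e_2$, in the direction from $e_1$ to $e_2$.
  • $A$ rotates the plane by an angle of $\theta$ around the ellipse centred at the origin and passing through $v_1$ and $v_2$, in the direction from $v_1$ to $v_2$.

[Figure: $e_1, e_2$ and their images $Be_1, Be_2$ on a circle; $v_1, v_2$ and their images $Av_1, Av_2$ on an ellipse; the two pictures are related by $C$ and $C^{-1}$.]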
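The behaviour described above can be checked numerically. The sketch below is a floating-point illustration under stated assumptions: the matrix `C` (columns $v_1 = (2,0)$, $v_2 = (1,1)$) and the angle `0.7` are arbitrary choices, the helper names are hypothetical, and the ellipse is taken to be the image of the unit circle under $C$, that is, the set of $x$ whose $\mathcal{C}$-coordinates have length $1$.

```python
import math

def matmul(X, Y):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(M, v):
    """Multiply a 2x2 matrix by a vector of length 2."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def inv2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def conjugated_rotation(theta, C):
    """Return A = C B C^{-1}, where B is rotation by theta."""
    B = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta), math.cos(theta)]]
    return matmul(matmul(C, B), inv2(C))

C = [[2.0, 1.0], [0.0, 1.0]]          # assumed example; columns v1, v2
v1, v2 = [2.0, 0.0], [1.0, 1.0]
C_inv = inv2(C)

def ellipse_level(x):
    """Squared length of the C-coordinates of x; equals 1 on the ellipse."""
    u = matvec(C_inv, x)
    return u[0] ** 2 + u[1] ** 2

A = conjugated_rotation(0.7, C)
moved = matvec(A, v1)                                   # stays on the ellipse
quarter_turn = matvec(conjugated_rotation(math.pi / 2, C), v1)  # lands on v2

print(ellipse_level(moved), quarter_turn)
```

With $\theta = \pi/2$ the matrix $B$ sends $e_1$ to $e_2$, so $A$ sends $v_1$ to $v_2$, up to rounding error.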

Subsection 6.3.3 Eigenvalues of Similar Matrices

Since similar matrices behave in the same way with respect to different coordinate systems, we should expect their eigenvalues and eigenvectors to be closely related.

Fact

Similar matrices have the same characteristic polynomial.

Proof

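Suppose that $A = CBC^{-1}$, where $A, B, C$ are $n\times n$ matrices. We calculate

$$A - \lambda I_n = CBC^{-1} - \lambda CC^{-1} = CBC^{-1} - C\lambda C^{-1} = CBC^{-1} - C\lambda I_nC^{-1} = C(B - \lambda I_n)C^{-1}.$$

Therefore,

$$\det(A - \lambda I_n) = \det(C(B - \lambda I_n)C^{-1}) = \det(C)\det(B - \lambda I_n)\det(C)^{-1} = \det(B - \lambda I_n).$$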

Here we have used the multiplicativity property in Section 5.1 and its corollary in Section 5.1.

Since the eigenvalues of a matrix are the roots of its characteristic polynomial, we have shown:

Similar matrices have the same eigenvalues.

By this theorem in Section 6.2, similar matrices also have the same trace and determinant. Both of these observations also follow from the fact that similar matrices represent the same linear endomorphism considered with respect to different bases, and the determinant, trace, and characteristic polynomial don't depend on the choice of basis.
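For $2\times 2$ matrices these invariants are easy to compare directly, since $\det(M - \lambda I_2) = \lambda^2 - \operatorname{tr}(M)\lambda + \det(M)$. The sketch below applies this to the similar matrices $A$ and $B$ from the example in the previous subsection; the function name `char_poly_2x2` is a hypothetical helper.

```python
from fractions import Fraction as F

def char_poly_2x2(M):
    """Coefficients (1, -trace, det) of lambda^2 - tr(M) lambda + det(M)."""
    (a, b), (c, d) = M
    return (1, -(a + d), a * d - b * c)

def evaluate(poly, lam):
    """Evaluate a quadratic given by its coefficients (p2, p1, p0) at lam."""
    p2, p1, p0 = poly
    return p2 * lam * lam + p1 * lam + p0

A = [[F(1, 2), F(3, 2)], [F(3, 2), F(1, 2)]]
B = [[F(2), F(0)], [F(0), F(-1)]]

poly_A = char_poly_2x2(A)
poly_B = char_poly_2x2(B)

# The shared roots 2 and -1 are the common eigenvalues of A and B.
roots_ok = evaluate(poly_A, 2) == 0 and evaluate(poly_A, -1) == 0
print(poly_A == poly_B, roots_ok)
```

Equal coefficient tuples mean equal trace and determinant as well, since those are read off from the coefficients.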

Note

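The converse of the fact is false. Indeed, the matrices

$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

both have characteristic polynomial $f(\lambda) = (\lambda - 1)^2$, but they are not similar, because the only matrix that is similar to $I_2$ is $I_2$ itself.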
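A short sketch makes both halves of this note concrete: the two matrices share a characteristic polynomial, while conjugating $I_2$ by an invertible matrix always returns $I_2$. The three sample matrices in `samples` are arbitrary invertible choices, so the loop only illustrates the algebraic identity $CI_2C^{-1} = I_2$ and is not a proof; all helper names are hypothetical.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = F(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def char_poly_2x2(M):
    """Coefficients (1, -trace, det) of lambda^2 - tr(M) lambda + det(M)."""
    (a, b), (c, d) = M
    return (1, -(a + d), a * d - b * c)

I2 = [[F(1), F(0)], [F(0), F(1)]]
J = [[F(1), F(1)], [F(0), F(1)]]

same_poly = char_poly_2x2(J) == char_poly_2x2(I2)   # both give (lambda - 1)^2

# Arbitrary invertible matrices (determinants -2, 10, and 1).
samples = [
    [[F(1), F(2)], [F(3), F(4)]],
    [[F(2), F(1)], [F(0), F(5)]],
    [[F(0), F(1)], [F(-1), F(3)]],
]
conjugates = [matmul(matmul(C, I2), inv2(C)) for C in samples]

print(same_poly, all(M == I2 for M in conjugates), J == I2)
```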

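Given that similar matrices have the same eigenvalues, one might guess that they have the same eigenvectors as well. Upon reflection, this is not what one should expect: indeed, the eigenvectors should only match up after changing from one coordinate system to another. This is the content of the next fact, remembering that $C$ and $C^{-1}$ change between the usual coordinates and the $\mathcal{C}$-coordinates.

Fact

Suppose that $A = CBC^{-1}$. If $v$ is an eigenvector of $A$ with eigenvalue $\lambda$, then $C^{-1}v$ is an eigenvector of $B$ with eigenvalue $\lambda$. If $v$ is an eigenvector of $B$ with eigenvalue $\lambda$, then $Cv$ is an eigenvector of $A$ with eigenvalue $\lambda$.

Proof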

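Suppose that $v$ is an eigenvector of $A$ with eigenvalue $\lambda$, so that $Av = \lambda v$. Then

$$B(C^{-1}v) = C^{-1}(CBC^{-1}v) = C^{-1}(Av) = C^{-1}\lambda v = \lambda(C^{-1}v),$$

so that $C^{-1}v$ is an eigenvector of $B$ with eigenvalue $\lambda$ (it is nonzero because $C^{-1}$ is invertible). Likewise, if $v$ is an eigenvector of $B$ with eigenvalue $\lambda$, then $Bv = \lambda v$, and we have

$$A(Cv) = (CBC^{-1})Cv = CBv = C(\lambda v) = \lambda(Cv),$$

so that $Cv$ is an eigenvector of $A$ with eigenvalue $\lambda$.

If $A = CBC^{-1}$, then $C^{-1}$ takes the $\lambda$-eigenspace of $A$ to the $\lambda$-eigenspace of $B$, and $C$ takes the $\lambda$-eigenspace of $B$ to the $\lambda$-eigenspace of $A$.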
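To see this correspondence on concrete numbers, the sketch below uses the matrices $A$, $B$, $C$ from the example in the previous subsection (with $B$ diagonal with entries $2$ and $-1$, and $C$ having columns $(1,1)$ and $(1,-1)$). The helper `is_eigenvector` and the other function names are hypothetical.

```python
from fractions import Fraction as F

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inv2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = F(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def is_eigenvector(M, v, lam):
    """True when v is nonzero and M v equals lam times v."""
    return any(x != 0 for x in v) and matvec(M, v) == [lam * x for x in v]

A = [[F(1, 2), F(3, 2)], [F(3, 2), F(1, 2)]]
B = [[F(2), F(0)], [F(0), F(-1)]]
C = [[F(1), F(1)], [F(1), F(-1)]]
C_inv = inv2(C)

v = [F(1), F(1)]            # eigenvector of A with eigenvalue 2
to_B = matvec(C_inv, v)     # C^{-1} v, expected to be an eigenvector of B

w = [F(0), F(1)]            # e2, eigenvector of B with eigenvalue -1
to_A = matvec(C, w)         # C w, expected to be an eigenvector of A

print(to_B, is_eigenvector(B, to_B, 2))
print(to_A, is_eigenvector(A, to_A, -1))
```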

Example

We continue with the above example: let

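$$A = \begin{pmatrix} 1/2 & 3/2 \\ 3/2 & 1/2 \end{pmatrix} \qquad B = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix} \qquad C = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},$$

so $A = CBC^{-1}$. Let $v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $v_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$, the columns of $C$. Recall that:

  • $B$ scales the $e_1$-direction by $2$ and the $e_2$-direction by $-1$.
  • $A$ scales the $v_1$-direction by $2$ and the $v_2$-direction by $-1$.

This means that the $x$-axis is the $2$-eigenspace of $B$, and the $y$-axis is the $(-1)$-eigenspace of $B$; likewise, the “$v_1$-axis” is the $2$-eigenspace of $A$, and the “$v_2$-axis” is the $(-1)$-eigenspace of $A$. This is consistent with the fact, as multiplication by $C$ changes $e_1$ into $Ce_1 = v_1$ and $e_2$ into $Ce_2 = v_2$.

[Figure: the $2$-eigenspace and $(-1)$-eigenspace of $B$ (the coordinate axes) and the $2$-eigenspace and $(-1)$-eigenspace of $A$, related by multiplication by $C$.]

Figure 27. The eigenspaces of $A$ are the lines through $v_1$ and $v_2$. These are the images under $C$ of the coordinate axes, which are the eigenspaces of $B$.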