I gave the following problem to students:

Two $n\times n$ matrices $A$ and $B$ are similar if there exists a nonsingular matrix $P$ such that $A=P^{-1}BP$.

  1. Prove that if $A$ and $B$ are two similar $n\times n$ matrices, then they have the same determinant and the same trace.

  2. Give an example of two $2\times 2$ matrices $A$ and $B$ with the same determinant and the same trace that are not similar.

Most of the ~20 students got the first question right. However, almost none of them found a correct example for the second question. Most of them gave examples of matrices that have the same determinant and the same trace.

But computations show that their examples are similar matrices. They didn't bother to check that, though, so they just tried random matrices with the same trace and the same determinant, hoping one would be a correct example.

Question: how can one explain that none of the random trials gave non-similar matrices?

Any answer based on density or measure theory is fine. In particular, you can assume any reasonable distribution on the entries of the matrix. If it matters, the course is about matrices with real coefficients, but you can assume integer coefficients, since when choosing numbers at random, most people will choose integers.

Rodrigo de Azevedo
  • 1
    Choosing an integer uniformly at random isn't well defined. I would suggest a log distribution as being vaguely realistic – Stella Biderman Apr 10 '17 at 01:43
  • 1
    You can choose any reasonable distribution. That is what I meant by "any answer (..) based on measure theory", but I'll edit the question to make it clearer. – Taladris Apr 10 '17 at 01:51
  • 9
    I doubt any typical distribution comes near how your students chose their random matrices. Also, you shouldn't neglect the fact that "solutions to difficult problems" are copied by many. – user251257 Apr 10 '17 at 01:55
  • What did you think were the odds that a random $2\times2$ would be similar to a Jordan block? – J. M. ain't a mathematician Apr 10 '17 at 05:42
  • 4
    You could try asking on the math educators site, as well/instead, if you like. – Please stop being evil Apr 10 '17 at 05:58
  • Taladris, I would definitely say that those students are not a random sample, unless you want us to completely disregard some common factors, such as you (their professor), etc... – An old man in the sea. Apr 10 '17 at 07:49
  • 20
    Is this scenario real or just a fun way to ask the question? – Carsten S Apr 10 '17 at 12:42
  • 24
    You should be careful with "*random*", as the meaning in mathematics and in normal conversation is different, and certain people are liable to misinterpret you. Your question has really nothing to do with random (in the mathematical sense) matrices, as there isn't really a draw-from-a-defined-distribution going on here. In these sorts of situations I like "*arbitrary*" as it conveys the same normal-conversation sense as (non-mathematical) "random" does, but doesn't have the same baggage. (There's a formal mathematical sense, but it matches more closely the normal conversational sense.) – R.M. Apr 10 '17 at 15:15
  • "Random matrices" actually means stochastic. It seems to me that you don't literally mean the students tried to use techniques from random matrices..... or do you? – Thompson Apr 10 '17 at 15:20
  • If, by "random," you mean "arbitrary," I suggest changing the title to more accurately reflect the mathematical core of the question. – apnorton Apr 10 '17 at 15:25
  • 14
    In a certain sense, it should be the most natural starting point to look for an example to (2) with $A$ the zero matrix. It is easy to see that no non-zero matrix is similar to it, so then all you have to do is make a non-zero matrix with zero trace and determinant and then you're done. And it's really easy to arrange that. So in some sense they failed to recognize how to reduce the problem's complexity: first look for a matrix which is similar to very few matrices. Doing that should quickly lead you to either the identity or zero matrix, at which point finding B is easy. – zibadawa timmy Apr 10 '17 at 15:32
  • 1
    And, btw, your title asks a very different question from what's in your actual text. – zibadawa timmy Apr 10 '17 at 15:33
  • Why counter-example? Such is usually used to prove that a statement is false. – mvw Apr 13 '17 at 18:34
  • @zibadawatimmy If one didn't think of that, do you think there is another way to think about it that leads you to a solution? – Ovi Sep 28 '19 at 03:21

3 Answers


If $A$ is a $2\times 2$ matrix with determinant $d$ and trace $t$, then the characteristic polynomial of $A$ is $x^2-tx+d$. If this polynomial has distinct roots (over $\mathbb{C}$), then $A$ has distinct eigenvalues and hence is diagonalizable (over $\mathbb{C}$). In particular, if $d$ and $t$ are such that the characteristic polynomial has distinct roots, then any other $B$ with the same determinant and trace is similar to $A$, since they are diagonalizable with the same eigenvalues.

So to give a correct example in part (2), you need $x^2-tx+d$ to have a double root, which happens only when the discriminant $t^2-4d$ is $0$. If you choose the matrix $A$ (or the values of $t$ and $d$) "at random" in any reasonable way, then $t^2-4d$ will usually not be $0$. (For instance, if you choose $A$'s entries uniformly from some interval, then $t^2-4d$ will be nonzero with probability $1$, since the vanishing set in $\mathbb{R}^n$ of any nonzero polynomial in $n$ variables has Lebesgue measure $0$.) Assuming that students did something like pick $A$ "at random" and then built $B$ to have the same trace and determinant, this would explain why none of them found a correct example.
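To get a feel for how rare the double-root case is when the entries are small integers, here is a minimal sketch that enumerates every $2\times 2$ matrix with entries in $\{-k,\dots,k\}$ and counts those with $t^2-4d=0$. The uniform choice over a box of integers and the function names `discriminant` and `fraction_with_double_root` are assumptions made for illustration only; they are not a model of how the students actually picked matrices.

```python
from itertools import product


def discriminant(a, b, c, d):
    """Discriminant t^2 - 4*det of the characteristic polynomial of [[a, b], [c, d]]."""
    t = a + d
    det = a * d - b * c
    return t * t - 4 * det


def fraction_with_double_root(k):
    """Fraction of integer matrices with entries in [-k, k] whose
    characteristic polynomial has a double root (discriminant zero)."""
    entries = range(-k, k + 1)
    total = 0
    hits = 0
    for a, b, c, d in product(entries, repeat=4):
        total += 1
        if discriminant(a, b, c, d) == 0:
            hits += 1
    return hits / total


if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, fraction_with_double_root(k))
```

With entries restricted to $\{-1,0,1\}$, 19 of the 81 matrices have a double root, and the fraction shrinks as the range of entries grows. A matrix picked without the discriminant in mind is therefore unlikely to be usable for part (2).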

Note that this is very much special to $2\times 2$ matrices. In higher dimensions, the determinant and trace do not determine the characteristic polynomial (they just give two of the coefficients), and so if you pick two matrices with the same determinant and trace they will typically have different characteristic polynomials and not be similar.
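As a concrete illustration of the higher-dimensional remark, the sketch below compares two $3\times 3$ matrices with the same trace and determinant but different middle coefficients of the characteristic polynomial, so they cannot be similar. The two matrices (a diagonal matrix and the companion matrix of $x^3-6x^2+5x-6$) and the helper names are chosen here purely for illustration.

```python
def trace3(M):
    """Trace of a 3x3 matrix given as nested lists."""
    return M[0][0] + M[1][1] + M[2][2]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def sum_principal_minors(M):
    """Sum of the principal 2x2 minors: the coefficient of x in
    the characteristic polynomial x^3 - tr*x^2 + (this)*x - det."""
    total = 0
    for p, q in ((0, 1), (0, 2), (1, 2)):
        total += M[p][p] * M[q][q] - M[p][q] * M[q][p]
    return total


# Diagonal matrix with eigenvalues 1, 2, 3.
A = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
# Companion matrix of x^3 - 6x^2 + 5x - 6.
B = [[0, 0, 6], [1, 0, -5], [0, 1, 6]]

if __name__ == "__main__":
    print(trace3(A), trace3(B))
    print(det3(A), det3(B))
    print(sum_principal_minors(A), sum_principal_minors(B))
```

Both matrices have trace $6$ and determinant $6$, but the remaining coefficient is $11$ for the first and $5$ for the second, so their characteristic polynomials differ.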

Eric Wofsey
  • 11
    Even if the two matrices have the same trace and determinant, they are still almost surely similar. (If you bring both matrices to upper triangular form, there is only one free variable. Unless the free off-diagonal entry is zero for one matrix and non-zero for the other, the matrices are similar.) Perhaps I should add this to my answer... – Joonas Ilmavirta Apr 10 '17 at 06:06
  • @EricWofsey Could you prove your last sentence? "Typically" seems reasonable to me, but spontaneously, I do not see a way to prove your statement formally for higher dimensions. – user7427029 Jan 19 '22 at 23:29
  • @user7427029: Here is a sketch of one way you can formalize it. Consider the map $f:M_n(\mathbb{R})\to\mathbb{R}^n$ sending a matrix to the coefficients of its characteristic polynomial. The differential of $f$ is surjective at any companion matrix, and thus at any matrix conjugate to a companion matrix. In particular, $f$ is a submersion when restricted to the full measure dense open subset of $M_n(\mathbb{R})$ consisting of matrices with distinct eigenvalues. – Eric Wofsey Jan 20 '22 at 00:20
  • In particular, for instance, this means that if you restrict to matrices with distinct eigenvalues (which is "almost all" matrices in either a topological sense or a measure theoretical sense), the set of matrices with a given trace and determinant is a codimension 2 submanifold, whereas the set of matrices with an entire given characteristic polynomial is a codimension $n$ submanifold. – Eric Wofsey Jan 20 '22 at 00:21
  • Many thanks! What does "full measure" mean in "full measure dense open subset"? (I presume that the words "full measure" belong together.) In the end, if I understood it correctly, it's mainly a "submanifold of strictly less dimension has Haar measure zero" argument? – user7427029 Jan 20 '22 at 15:42
  • @user7427029: "Full measure" means its complement has measure $0$ (with respect to Lebesgue measure, in this case). I don't know what you mean by "Haar measure" here--these submanifolds do not have any natural group structure. But for instance, you could give the set of matrices with given trace and determinant the $(n^2-2)$-dimensional volume measure and then the subset of matrices with a given characteristic polynomial will have measure $0$. – Eric Wofsey Jan 20 '22 at 19:55

As Eric points out, such $2\times2$ matrices are special. In fact, there are only two such pairs of matrices. The number depends on how you count, but the point is that such matrices have a very special form.

Eric proved that the two matrices must have a double eigenvalue. Let the eigenvalue be $\lambda$. It is a little exercise¹ to show that $2\times2$ matrices with double eigenvalue $\lambda$ are similar to a matrix of the form $$ C_{\lambda,\mu} = \begin{pmatrix} \lambda&\mu\\ 0&\lambda \end{pmatrix}. $$ Using suitable diagonal matrices shows that $C_{\lambda,\mu}$ is similar to $C_{\lambda,1}$ if $\mu\neq0$. On the other hand, $C_{\lambda,0}$ and $C_{\lambda,1}$ are not similar; one is a scaling and the other one is not.

Therefore, up to similarity transformations, the only possible example is $A=C_{\lambda,0}$ and $B=C_{\lambda,1}$ (or vice versa). Since scaling doesn't really change anything, the only examples (up to similarity, scaling, and swapping the two matrices) are $$ A = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix}, \quad B = \begin{pmatrix} 1&1\\ 0&1 \end{pmatrix} $$ and $$ A = \begin{pmatrix} 0&0\\ 0&0 \end{pmatrix}, \quad B = \begin{pmatrix} 0&1\\ 0&0 \end{pmatrix}. $$ If adding multiples of the identity is added to the list of symmetries (then scaling can be removed), then there is only one matrix pair up to the symmetries.
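The two similarity facts used above can be checked with exact arithmetic. The sketch below assumes the conjugating matrix $P=\operatorname{diag}(1,\mu)$ for the first claim (one possible choice of "suitable diagonal matrix"), and the helper names `matmul`, `inverse`, `conjugate` and `upper` are made up for this illustration.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inverse(P):
    """Inverse of a nonsingular 2x2 matrix, computed exactly with Fractions."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("P must be nonsingular")
    return ((d / det, -b / det), (-c / det, a / det))


def conjugate(B, P):
    """Return P^{-1} B P, matching the convention A = P^{-1} B P."""
    return matmul(matmul(inverse(P), B), P)


def upper(lam, mu):
    """The matrix C_{lam, mu} = [[lam, mu], [0, lam]]."""
    return ((lam, mu), (0, lam))


if __name__ == "__main__":
    lam, mu = 3, Fraction(5, 2)
    P = ((1, 0), (0, mu))
    # Conjugating C_{lam,1} by diag(1, mu) gives C_{lam,mu}.
    print(conjugate(upper(lam, 1), P) == upper(lam, mu))
    # A scalar matrix is fixed by every conjugation.
    Q = ((2, 1), (1, 1))
    print(conjugate(upper(lam, 0), Q) == upper(lam, 0))
```

The second check only tests one particular $Q$, but the general statement is immediate: $P^{-1}(\lambda I)P=\lambda I$ for every nonsingular $P$, so $C_{\lambda,0}$ is similar only to itself.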

If you are familiar with the Jordan normal form, it gives a different way to see it. Once the eigenvalues are fixed to be equal, the only free property (up to similarity) is whether there are one or two blocks in the normal form. The Jordan normal form is invariant under similarity transformations, so it gives a very quick way to solve problems like this.

¹ You only need to show that any matrix is similar to an upper triangular matrix. The eigenvalues (which now coincide) are on the diagonal. You can skip this exercise if you have Jordan normal forms at your disposal.

Joonas Ilmavirta
  • 11
    I don't buy your "scaling does not change anything" argument. By the same token I could say adding a multiple of the identity matrix does not change anything, and your two examples would become a single example. Better be honest about it: there are infinitely many examples, but they are all of a very special form where one is $\lambda I_2$ for some scalar $\lambda$ and the other is a different matrix with the same characteristic polynomial (non-diagonalisable $2\times2$ matrix with double eigenvalue $\lambda$, of which there are a lot but all of which are similar). – Marc van Leeuwen Apr 10 '17 at 11:45
  • 6
    @MarcvanLeeuwen Whether or not scaling changes anything is a matter of opinion and the specific problem. But you are absolutely right; you can view the thing in several ways. If adding multiples of the identity is included in the list of symmetries, there is only one example. My only point really was that Eric showed that the OP's matrices must be of a special form, but in fact they have to be of a *very* special form. – Joonas Ilmavirta Apr 10 '17 at 17:38

I'm going to answer your title question (Why did no student get the correct answer?) rather than what's in your post, because while it's worthwhile to expound on why what they did failed, it completely overlooks the possibility that they were not taught how to handle such a problem in the first place.

Finding examples is a technique and art all of its own, and needs some instruction to reliably execute on exams (or anywhere else). So you may be passing the buck. Certainly sometimes you are a perfect teacher for a problem and the class still fails to get it, because sometimes by random (mis)fortune your class just happens to be full of students who have difficulties for reasons independent of you. But it's worthwhile to wonder if maybe there's something about how they were taught that needs work. You don't seem to address this possibility in your post, so here's a bunch of possibilities to mull over...

The basic line of thought that should ideally occur to them is "dang, figuring out whether two matrices are similar or not is usually a pretty involved process. I don't have the time for that. Maybe there are matrices which aren't similar to very many others...maybe ones that are only similar to themselves...well $PAP^{-1}$ is always easy to compute if $A$...", at which point hopefully they recognize they want the identity matrix, or a zero matrix, or more generally a scalar multiple of the identity (aka: the center of the matrix ring).

Students usually need to be taught the "usual suspects" for (counter)examples. In most fields of math, when wondering if something is true or not, there tends to be a certain object or set of objects which are known to often provide (counter)examples, or for which computations are relatively simple, and the first litmus test for a proposed theorem is to check it against the usual suspects to make sure it holds up. When introducing similar matrices, did you specifically point out (or assign to them as work in some fashion) what the identity and zero matrices are similar to? What scalar multiples of the identity are similar to?

When discussing similarity, did you make it clear that it is very hard to tell "at a glance" if two matrices are similar or not? That "very different looking" matrices can turn out to be similar? That determining if two fixed, small matrices are similar or not can be reasonable (for an exam), but with more matrices (in this case: literally every matrix they can think of) and more dimensions it becomes increasingly difficult to just brute force things?

Did you ever give examples of non-similar (2x2) matrices with identical trace and determinant in class? They might have been able to memorize such examples, and if they did then hopefully while doing so they incidentally picked up something about "why" these examples have the desired properties. Especially so if they were informed ahead of time that they are expected to show work, and not simply write down answers, to get full credit.

I'm assuming this question came from an exam or quiz. Is this question something you came up with long before the exam/quiz date came around? Many instructors I know find it beneficial to write up exams (very) far in advance of when the actual exam is held, so that they can clearly identify: "Obviously I want them to know X, so I better make it a point to teach them how to do X." An exam/quiz created too close to the exam date runs the risk of you asking something you never actually taught them how to do; and, quite importantly, runs the risk of you not noticing this until it's too late. It's easy to think, "Yeah, they know everything they need to know for this, it'll be an easy problem" soon after writing the problem, but a week or two later you might realize you're wrong. If you've written it well in advance, you can adjust the problem or lectures to fix this. If you haven't, well, then you've got an entire chunk of points on your exam that were exactly meaningless and a class that has a mistaken impression of their own abilities and a dissatisfaction with your ability to write a fair exam.

zibadawa timmy
  • 4
    Regarding your last paragraph: I usually include one question in my high-school tests to "separate the adults from the kids" (or "the men from the boys"). One question checks if they can apply their knowledge to a new situation that I have not directly covered in class, if they can use a spark of creativity, if they can combine multiple ideas that I have not combined for them. Getting an A is supposed to be difficult. – Rory Daulton Apr 10 '17 at 16:30
  • @RoryDaulton I think your way of thinking has merit. That said, the OP seems to be expecting that the students should have been able to solve this problem. If the OP included such a question, this was apparently not it (even if it would have been a good candidate for one). – jpmc26 Apr 11 '17 at 05:45
  • 3
    +1 for addressing the question the OP actually asked. Eloquently, too. – Ethan Bolker Apr 11 '17 at 13:35
  • 4
    @jpmc26: In particular, the OP did not in fact separate the grups from the onlies, because *no one* answered the question correctly. I think this answer has hit on a vital concern—not everyone has to answer the question correctly, but everyone should have gotten the training they needed to answer it correctly. Whether they execute or not is of course a separate matter. To be sure, what "the training they needed" is, exactly, is an interesting question. We each have different notions of what the student needs to bring. But the fact that none answered correctly is telling, I think. – Brian Tung Apr 11 '17 at 21:50
  • @jpmc26: (I'm agreeing with your assessment, by the way.) – Brian Tung Apr 11 '17 at 21:50