A is a matrix with m rows and n columns; each row is an object and each column is a feature (a dimension). Typically I compute PCA from the covariance matrix, that is, A'A, where A' is the transpose of A.

Today I read a book that presents a useful trick for computing PCA: if n >> m, we can instead compute the eigenvectors of the matrix AA', which can save a lot of memory. Here is the code from the book:

from numpy import dot, linalg, sqrt

def pca(X):
    """Principal Component Analysis
    input: X, matrix with training data stored as flattened arrays in rows
    return: projection matrix (with important dimensions first), variance
    and mean."""
    # get dimensions
    num_data,dim = X.shape
    # center data
    mean_X = X.mean(axis=0)
    X = X - mean_X

    # PCA - compact trick used
    M = dot(X,X.T)        # covariance matrix, AA', not the usual A'A
    e,EV = linalg.eigh(M) # compute eigenvalues and eigenvectors
    tmp = dot(X.T,EV).T   # this is the compact trick
    V = tmp[::-1]         # reverse since last eigenvectors are the ones we want
    S = sqrt(e)[::-1]     # reverse since eigenvalues are in increasing order
    for i in range(V.shape[1]):
        V[:,i] /= S       # What for?

    # return the projection matrix, the variance and the mean
    return V,S,mean_X
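
For concreteness, here is a minimal usage sketch of my own (not from the book; the shapes are made up), with n >> m so the trick applies:

import numpy as np

# 5 objects with 1000 features each (n >> m), e.g. flattened images
X = np.random.rand(5, 1000)

V, S, mean_X = pca(X)
print(V.shape)   # (5, 1000): rows are eigenvectors of A'A, most important first
print(S.shape)   # (5,): square roots of the eigenvalues, largest first
# note: centering makes the smallest eigenvalue of AA' exactly 0,
# so the last row of V is numerically meaningless

# project a centered sample onto the first two principal directions
y = np.dot(V[:2], X[0] - mean_X)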

Now I understand the algebra behind this useful trick, but one thing still confuses me: the for-loop. Why divide V by S? Is it to normalize V to unit length?

1 Answer


Yes, this is normalization. Recall that $V$ was obtained from the eigenvectors of $AA^T$. Let $v$ be a unit-norm eigenvector of $AA^T$. Since $AA^Tv=\lambda v$, multiplying on the left by $A^T$ gives $A^TA(A^Tv)=\lambda(A^T v)$. Thus $A^Tv$ is an eigenvector of $A^TA$. However, it is not a unit vector: the multiplication stretches it by $\lambda^{1/2}$, since $\|A^Tv\|^2 = v^TAA^Tv = \lambda v^Tv = \lambda$. Dividing it by $\lambda^{1/2}$, we get a unit eigenvector of $A^TA$.
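
Here is a quick numerical check of that computation, as a sketch using numpy (the matrix and its shape are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 12))          # m=5 objects, n=12 features

lam, U = np.linalg.eigh(np.dot(A, A.T))   # unit eigenvectors of AA^T in the columns of U
v = U[:, -1]                              # unit eigenvector for the largest eigenvalue
w = np.dot(A.T, v)                        # eigenvector of A^T A, but not unit norm

print(np.linalg.norm(w), np.sqrt(lam[-1]))                   # equal: ||A^T v|| = lambda^(1/2)
u = w / np.sqrt(lam[-1])                                     # rescale to unit length
print(np.allclose(np.dot(A.T, np.dot(A, u)), lam[-1] * u))   # True: unit eigenvector of A^T A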

  • Excuse me, I don't think I get it. Why does the multiplication stretch it by λ^(1/2)? – avocado Jun 02 '13 at 12:26
  • @loganecolss Because the eigenvectors of $AA^T$ are the [left singular vectors](http://en.wikipedia.org/wiki/Singular_value_decomposition#Singular_values.2C_singular_vectors.2C_and_their_relation_to_the_SVD) of $A$. (In Wikipedia notation, $\sigma=\lambda^{1/2}$, a singular value of $A$.) – ˈjuː.zɚ79365 Jun 02 '13 at 12:30
  • Finally I understand, thank you. – avocado Jun 02 '13 at 13:52
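
The relation σ = λ^(1/2) from the comment above is also easy to verify numerically; a small sketch, again assuming numpy:

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 12))

s = np.linalg.svd(A, compute_uv=False)    # singular values of A, in decreasing order
lam = np.linalg.eigh(np.dot(A, A.T))[0]   # eigenvalues of AA^T, in increasing order

print(np.allclose(s, np.sqrt(lam[::-1]))) # True: sigma = lambda^(1/2)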