Let $f: \mathbb{R}^n \to \mathbb{R}$ be a convex, continuously differentiable function.

What is the relation between the component Lipschitz constants, i.e. the smallest constants $L_i$ such that for all $x \in \mathbb{R}^n$ and $t \in \mathbb{R}$: \begin{equation} |[\nabla f(x+te_i)]_i-[\nabla f(x)]_i| \leq L_i|t|, \end{equation} and the regular Lipschitz constant, i.e. the smallest constant $L$ such that for all $d \in \mathbb{R}^n$: \begin{equation} \|\nabla f(x+d)-\nabla f(x)\|_2 \leq L\|d\|_2? \end{equation}

It is mentioned in this paper that $1 \leq \frac{L}{L_{\text{max}}} \leq n$, where $L_{\text{max}} = \max_{i=1,\dots,n}L_i$, which comes from the "relationships between norm and trace of a symmetric matrix" (p.12). However, I do not see where this result comes from, or which "symmetric matrix" the author refers to.

  • Notice that Lipschitz constants aren't unique. If $L$ is a Lipschitz constant then any $M> L$ is also a Lipschitz constant. So please be more specific about $L$ and $L_i$. – user251257 May 18 '18 at 19:09

2 Answers


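Assume $f$ is twice differentiable. Since $M = L\text{I} - \nabla^2 f(x) \succeq 0$ for every $x$, all diagonal entries of $M$, namely $L - \frac{\partial^2 f}{\partial x_i^2}(x)$, must be non-negative. Because $L_i$ is the supremum over $x$ of $\frac{\partial^2 f}{\partial x_i^2}(x)$, it follows that $L \geq L_{\max}$. Equality holds when, for example, $f(x)=\frac{1}{2} \sum_{i=1}^n x_i^2.$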

The second inequality comes from

$$ L = \max_i \lambda_i \leq \sum_{i=1}^n \lambda_i = \operatorname{tr}(\nabla^2 f) = \sum_{i=1}^n \frac{\partial^2 f}{\partial x_i^2} \leq n L_{\max}, $$

where $\lambda_i \geq 0$ is the $i$-th eigenvalue of the Hessian. Equality holds when $f(x)= \frac{1}{2} (\sum_{i=1}^n x_i)^2 = \frac{1}{2} (1^T x)^2$: the Hessian is then $11^T$, so $L = n$ while every $L_i = 1$.
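As a quick numerical sanity check of both bounds, here is a minimal sketch for the quadratic case $f(x) = \frac{1}{2} x^T A x$ with $A$ symmetric positive semidefinite, where $L = \lambda_{\max}(A)$ and $L_i = A_{ii}$. The helper names `lambda_max` and `lipschitz_ratio` are made up for this illustration, and the power iteration assumes the starting vector is not orthogonal to the top eigenvector.

```python
import math


def lambda_max(A, iters=500):
    """Estimate the largest eigenvalue of a symmetric PSD matrix by power iteration."""
    n = len(A)
    # Arbitrary nonzero start; assumed not orthogonal to the top eigenvector.
    v = [1.0 + 0.1 * i for i in range(n)]
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space of A")
        v = [x / norm for x in w]
    # Rayleigh quotient; v has unit norm at this point.
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * Av[i] for i in range(n))


def lipschitz_ratio(A):
    """Return L / L_max for f(x) = 0.5 * x^T A x."""
    L = lambda_max(A)                            # gradient Lipschitz constant
    L_max = max(A[i][i] for i in range(len(A)))  # largest diagonal entry
    return L / L_max


n = 4
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
ones = [[1.0] * n for _ in range(n)]

print(lipschitz_ratio(identity))  # lower-bound example from the answer
print(lipschitz_ratio(ones))      # upper-bound example from the answer
```

The two printed ratios should come out close to $1$ and $n$ respectively, matching the two equality cases above.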

Trung Vu
  • Why does the sum of the eigenvalues equal the trace of the hessian? The hessian is symmetric but does not have to be diagonal?!? – mathsstudent98 Jul 03 '20 at 17:03
  • https://math.stackexchange.com/questions/546155/proof-that-the-trace-of-a-matrix-is-the-sum-of-its-eigenvalues – Trung Vu Jul 04 '20 at 18:26
  • Many thanks for this link. I did not expect that you would answer since this thread is 2 years old. So again many thanks. Are you familiar with the above mentioned paper? – mathsstudent98 Jul 04 '20 at 18:34
  • Yes. I read it 2 years ago... – Trung Vu Jul 04 '20 at 18:55
  • I am reading it currently. There are some parts where I just can't follow him, like here: https://math.stackexchange.com/questions/3729932/accelerated-randomized-coordinate-descent. Any help is much appreciated. – mathsstudent98 Jul 04 '20 at 20:19
  • $f$ here doesn't need to be quadratic – Trung Vu Jul 05 '20 at 08:24

So does the above mentioned relationship $1 \leq \frac{L}{L_{\text{max}}} \leq n$ hold for every convex, twice differentiable function with Lipschitz-continuous gradient or only for quadratic convex functions?

On page 20 of this paper it says:

"We noted in Subsection 3.2 that the ratio $L/L_{max}$ lies in the interval $[1,n]$ when $f$ is a convex quadratic function and both parameters are set to their best values."

Many thanks again.

  • Is this an answer or a question (in which case it should be posted as a comment or as an independent post in a new thread)? – dohmatob Dec 23 '20 at 22:12