When you take the function
$$f(x, y) = 3x^2 + 3y^2 + 2xy$$
and start gradient descent at $x_0 = (6, 6)$ with learning rate $\eta = \frac{1}{2}$, it diverges.
Gradient descent
Gradient descent is an optimization method that starts at a point $x_0$ and then repeatedly applies the update rule
$$x_{k+1} = x_k + \eta d_k(x_k)$$
where $\eta$ is the step length (learning rate) and $d_k$ is the direction.
The direction is
$$d_k(x_k) = - \nabla f(x_k)$$
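As a minimal sketch of this rule in Python (the helper name `gradient_descent` and the fixed number of steps are illustrative choices, not something given above):

```python
import numpy as np

def gradient_descent(grad, x0, eta, steps):
    """Apply x_{k+1} = x_k + eta * d_k(x_k) with d_k(x_k) = -grad(x_k)."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(steps):
        x = x - eta * grad(x)  # equivalent to x + eta * (-grad(x))
        trajectory.append(x.copy())
    return trajectory
```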
Example
$$\nabla f(x, y) = \begin{pmatrix}6x + 2y\\ 6y + 2x\end{pmatrix}$$
Starting at $x_0 = (6, 6)$ with $\eta = \frac{1}{2}$, the step added in each iteration is $\eta\, d_k(x_k) = -\frac{1}{2} \nabla f(x_k)$:
\begin{align}
x_0 &= (6, 6) & \eta\, d_k(6, 6) &= (-24, -24)\\
x_1 &= (-18, -18) & \eta\, d_k(-18, -18) &= (72, 72)\\
x_2 &= (54, 54) & \eta\, d_k(54, 54) &= (-216, -216)\\
x_3 &= (-162, -162) & \eta\, d_k(-162, -162) &= (648, 648)
\end{align}
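These iterates can be reproduced numerically; the script below is only a sanity check of the table above, with the gradient `grad_f` written out by hand:

```python
import numpy as np

def grad_f(p):
    """Gradient of f(x, y) = 3x^2 + 3y^2 + 2xy."""
    x, y = p
    return np.array([6 * x + 2 * y, 6 * y + 2 * x])

x = np.array([6.0, 6.0])
eta = 0.5
for k in range(4):
    x = x - eta * grad_f(x)  # x_{k+1} = x_k + eta * d_k(x_k)
    print(k + 1, x)
# Prints the iterates (-18, -18), (54, 54), (-162, -162), (486, 486): growing without bound.
```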
In general, every iterate lies on the diagonal $x = y$, where $\nabla f(x, x) = (8x, 8x)$. Writing $x_{n-1}$ for the common value of both coordinates of the previous iterate:
\begin{align}
x_n &= \left(x_{n-1} - 8 \cdot \frac{1}{2} \cdot x_{n-1},\; x_{n-1} - 8 \cdot \frac{1}{2} \cdot x_{n-1}\right)\\
x_n &= (-3x_{n-1}, -3x_{n-1})
\end{align}
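Unrolling this recursion makes the divergence explicit: starting from the coordinate value $6$,
$$x_n = (-3)^n \cdot 6, \qquad |x_n| = 3^n \cdot 6 \to \infty \text{ as } n \to \infty.$$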
The same calculation with a general $\eta$ gives $x_n = (1 - 8\eta)\, x_{n-1}$, so you can clearly see that any learning rate $\eta > \frac{1}{4}$ will diverge, because the factor $|1 - 8\eta|$ is then greater than $1$. For this starting point, the learning rate $\eta = \frac{1}{8}$ finds the solution $(0, 0)$ in one step, and any $0 < \eta < \frac{1}{4}$ converges to the global optimum.
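A quick numerical check of the three regimes, assuming the same setup as above (the learning rate $\frac{3}{16}$ and the 20 steps are arbitrary illustrative choices):

```python
import numpy as np

def grad_f(p):
    x, y = p
    return np.array([6 * x + 2 * y, 6 * y + 2 * x])

for eta in (1 / 8, 3 / 16, 1 / 2):
    x = np.array([6.0, 6.0])
    for _ in range(20):
        x = x - eta * grad_f(x)
    print(eta, x)
# eta = 1/8  -> exactly (0, 0) after the first step
# eta = 3/16 -> very close to (0, 0); each step multiplies both coordinates by -1/2
# eta = 1/2  -> huge values; each step multiplies both coordinates by -3
```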