[ Machine Learning - Andrew Ng ] Linear regression with one variable | 2-5 Gradient descent intuition
repeat until convergence {
\(\theta_j := \theta_j - \alpha\frac{\partial}{\partial \theta_j}J(\theta_0,\theta_1)\)  (for \(j = 0\) and \(j = 1\))
}
\(\alpha\): learning rate
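Below is a minimal sketch of this update rule for one-variable linear regression with the squared-error cost, using the standard partial derivatives of \(J(\theta_0,\theta_1)\). The function name, the sample data interface (`xs`, `ys`), and the default values of `alpha` and `num_iters` are illustrative assumptions, not from the lecture.

```python
def gradient_descent(xs, ys, alpha=0.01, num_iters=1000):
    """Sketch of batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(num_iters):
        # Partial derivatives of the squared-error cost J(theta0, theta1)
        errors = [(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients are computed before either
        # parameter changes, matching the rule above for j = 0 and j = 1
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1
```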
If \(\alpha\) is too small, gradient descent can be slow.
If \(\alpha\) is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
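A hypothetical 1-D example makes both failure modes visible. Here the cost is \(J(\theta)=\theta^2\) (minimum at \(\theta = 0\)); the three values of `alpha` are chosen purely for illustration.

```python
def descend(alpha, theta=10.0, steps=20):
    """Run a few gradient steps on J(theta) = theta^2."""
    for _ in range(steps):
        grad = 2 * theta          # dJ/dtheta for J(theta) = theta^2
        theta = theta - alpha * grad
    return theta

print(descend(alpha=0.001))  # too small: still far from 0 after 20 steps (slow)
print(descend(alpha=0.1))    # reasonable: close to 0
print(descend(alpha=1.1))    # too large: overshoots each time and diverges
```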
Gradient descent can converge to a local minimum, even with the learning rate \(\alpha\) fixed.
This is because:
As we approach a local minimum, the derivative term gets smaller, so gradient descent automatically takes smaller steps. Hence there is no need to decrease \(\alpha\) over time. A short numeric sketch of this follows below.
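The following sketch (again on the illustrative cost \(J(\theta)=\theta^2\), with an assumed starting point and `alpha`) prints the step size \(\alpha\frac{d}{d\theta}J(\theta)\) at each iteration; it shrinks on its own even though \(\alpha\) never changes.

```python
theta, alpha = 10.0, 0.1
for i in range(5):
    grad = 2 * theta          # derivative shrinks as theta approaches 0
    step = alpha * grad       # step size = alpha * derivative
    theta -= step
    print(f"iter {i}: step size = {step:.4f}, theta = {theta:.4f}")
# Step sizes: 2.0000, 1.6000, 1.2800, 1.0240, 0.8192 -- decreasing with fixed alpha
```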