
Machine Learning [Week 01/11] Introduction

[mathjax]

Lecture notes from taking Machine Learning (Stanford University).

Introduction

  • Supervised Learning
    • each training example comes with the correct answer (label)
  • Unsupervised Learning
    • clustering, segmentation
    • cocktail party problem
    • singular value decomposition (see the sketch below)
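
As a rough sketch of the decomposition mentioned above (not shown in the lecture; the data matrix and values are made up for illustration), NumPy computes an SVD in one call:

```python
import numpy as np

# Illustrative data matrix: 4 samples x 2 features.
X = np.array([[2.0, 1.9],
              [1.0, 1.1],
              [3.0, 3.2],
              [0.5, 0.4]])

# Singular value decomposition: X = U @ np.diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)
print(s)  # singular values, largest first
```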

Linear Regression with One Variable

Model and Cost Function

  • Regression Problem
    • the prediction is a real-valued output
    • e.g. predicting house prices
    • e.g. predicting age from a photo
  • Classification Problem
    • y can take only a small number of discrete values
    • e.g. categorizing real estate
    • e.g. whether a tumor is benign or malignant
  • Representation
    • Training Set
    • Learning Algorithm
    • h: hypothesis
      • perhaps not the best name, but used by convention
      • predicts y from x
  • how to represent h?
    • \(h_\theta(x)=\theta_0+\theta_1x\)
    • Shorthand: \(h(x)\)
    • This is Linear regression
  • cost function (squared error function / mean squared error)
    • choose \(\theta_0,\theta_1\) so that \(h_\theta(x)\) is close to \(y\) for the training examples \((x,y)\) (see the sketch after this list)
    • goal: minimize \[
      J(\theta_0,\theta_1)=\frac{1}{2m}\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})^2
      \]
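
A minimal NumPy sketch of the cost function above (the function name `compute_cost` and the toy data are my own, not from the lecture):

```python
import numpy as np

def compute_cost(theta0, theta1, x, y):
    """Squared-error cost J(theta0, theta1) for one-variable linear regression."""
    m = len(y)
    h = theta0 + theta1 * x              # h_theta(x) for every training example
    return np.sum((h - y) ** 2) / (2 * m)

# Toy training set (illustrative values only).
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
print(compute_cost(0.0, 1.0, x, y))      # 0.0: this line fits the data exactly
```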

Parameter Learning

  • gradient descent
    • Have some function: \(J(\theta_0,\theta_1)\)
    • Want min \(J(\theta_0,\theta_1)\)
    • Outline:
      • Start with some \(\theta_0,\theta_1\)
      • Keep changing \(\theta_0,\theta_1\) to reduce \(J(\theta_0,\theta_1)\) until we end up at a minimum.
  • gradient descent algorithm
    • repeat until convergence
      • \(\theta_j:=\theta_j-\alpha \frac{\partial}{\partial \theta_j}J(\theta_0,\theta_1)\)
      • learning rate: \(\alpha\)
      • simultaneous update: for j=0 and j=1
      • if alpha is small, gradient descent can be slow.
      • if alpha is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge.
      • as we approach a minimum the derivative gets smaller, so the steps shrink automatically and there is no need to decrease alpha over time
    • “Batch” Gradient Descent
      • “batch” because every step looks at the entire training set
      • apply gradient descent to the linear regression model (see the sketch below)
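
A minimal sketch of batch gradient descent for one-variable linear regression, assuming the squared-error cost above (the function name, learning rate, and toy data are illustrative, not from the lecture):

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iterations=1000):
    """Batch gradient descent: every step uses the full training set,
    and theta0, theta1 are updated simultaneously."""
    m = len(y)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        h = theta0 + theta1 * x
        grad0 = np.sum(h - y) / m          # dJ/dtheta0
        grad1 = np.sum((h - y) * x) / m    # dJ/dtheta1
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])   # generated from y = 1 + 2x
print(gradient_descent(x, y))        # approximately (1.0, 2.0)
```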

Linear Algebra Review

Matrices and Vectors

Matrix Elements

\[
\begin{align}
A&=\begin{bmatrix}
1402 & 191\\
1371 & 821\\
949 & 1437\\
147 & 1448
\end{bmatrix}\\
A_{ij}&= i,j \text{ entry in the }i^{th}\text{ row, }j^{th}\text{ column.}
\end{align}
\]

Vector: An n x 1 matrix.\[
\begin{align}
y&=\begin{bmatrix}
460\\
232\\
315\\
178
\end{bmatrix}\dots \text{4-dimensional vector } \dots \mathbb{R}^4 \\
y_i&=i^{th}\text{ element }\\
y&=\begin{bmatrix}y_1\\ y_2\\ y_3\\ y_4\end{bmatrix}\dots \text{1-indexed} \dots y^{[1]}
\end{align}
\]
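
A small NumPy illustration of this notation (note that NumPy is 0-indexed, while the course notation is 1-indexed):

```python
import numpy as np

A = np.array([[1402,  191],
              [1371,  821],
              [ 949, 1437],
              [ 147, 1448]])
y = np.array([460, 232, 315, 178])   # a 4-dimensional vector

print(A.shape)   # (4, 2): 4 rows, 2 columns
print(A[0, 1])   # 191 -> A_{12} in the 1-indexed course notation
print(y[3])      # 178 -> y_4
```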

Addition and Scalar Multiplication

Matrix Addition

Scalar Multiplication

Combination of Operators
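
The slides for these three operations are not reproduced here, so as a hedged NumPy sketch (matrices made up for illustration): addition is element-wise between matrices of the same dimensions, scalar multiplication scales every entry, and the two can be combined:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [2.0, 5.0]])
B = np.array([[4.0, 0.5],
              [2.0, 5.0]])

print(A + B)      # matrix addition: element-wise, same dimensions required
print(3 * A)      # scalar multiplication: every entry times 3
print(A / 4 + B)  # combination of operators, still evaluated element-wise
```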

Matrix Vector Multiplication

\[
\begin{bmatrix}
1&3\\4&0\\2&1
\end{bmatrix}
\begin{bmatrix}
1\\5
\end{bmatrix}=
\begin{bmatrix}
16\\4\\7
\end{bmatrix}
\]

  • House size:
    • 2104
    • 1416
    • 1534
    • 852
  • \(h_\theta(x) = -40+0.25x\)
  • \[
    \begin{bmatrix}
    1&2104\\
    1&1416\\
    1&1534\\
    1&852
    \end{bmatrix}\times
    \begin{bmatrix}
    -40\\
    0.25
    \end{bmatrix}=
    \begin{bmatrix}
    h_\theta(2104)\\
    h_\theta(1416)\\
    h_\theta(1534)\\
    h_\theta(852)
    \end{bmatrix}
    \]
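
The same prediction as a NumPy sketch (values taken from the example above; the design-matrix layout with a leading column of ones is standard, but the variable names are mine):

```python
import numpy as np

# Column of ones for theta_0, then the four house sizes.
X = np.array([[1, 2104],
              [1, 1416],
              [1, 1534],
              [1,  852]], dtype=float)
theta = np.array([-40.0, 0.25])

print(X @ theta)  # h_theta(x) for all four houses in one matrix-vector product
```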

Matrix Matrix Multiplication

  • A x B = C
  • (m,n) x (n,o) = (m,o)
  • House size:
    • 2104
    • 1416
    • 1534
    • 852
  • \(h_\theta^{[1]}(x) = -40+0.25x \)
  • \(h_\theta^{[2]}(x) = 200+0.1x \)
  • \(h_\theta^{[3]}(x) = -150+0.4x \)
  • \[
    \begin{bmatrix}
    1&2104\\
    1&1416\\
    1&1534\\
    1&852
    \end{bmatrix}\times
    \begin{bmatrix}
    -40 & 200 & -150\\
    0.25 & 0.1 & 0.4
    \end{bmatrix}=
    \begin{bmatrix}
    h_\theta^{[1]}(2104)&h_\theta^{[2]}(2104)&h_\theta^{[3]}(2104)\\
    h_\theta^{[1]}(1416)&h_\theta^{[2]}(1416)&h_\theta^{[3]}(1416)\\
    h_\theta^{[1]}(1534)&h_\theta^{[2]}(1534)&h_\theta^{[3]}(1534)\\
    h_\theta^{[1]}(852)&h_\theta^{[2]}(852)&h_\theta^{[3]}(852)
    \end{bmatrix}
    \]
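
Extending the sketch above to all three hypotheses at once (one column of parameters per hypothesis):

```python
import numpy as np

X = np.array([[1, 2104],
              [1, 1416],
              [1, 1534],
              [1,  852]], dtype=float)
# Columns correspond to h^[1], h^[2], h^[3].
Theta = np.array([[-40.0, 200.0, -150.0],
                  [ 0.25,   0.1,    0.4]])

print(X @ Theta)  # (4, 2) x (2, 3) -> (4, 3): one column of predictions per hypothesis
```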

Matrix Multiplication Properties

  • A x B ≠ B x A (matrix multiplication is not commutative in general)
  • A x (B x C) = (A x B) x C (it is associative)
  • Identity Matrix
    • Denoted \(I\) or \(I_{n\times n}\)
    • For any matrix A
    • \( A \cdot I = I \cdot A = A\)
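
A quick numeric check of these properties (the matrices are arbitrary examples):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
C = np.array([[2.0, 0.0], [0.0, 3.0]])
I = np.eye(2)

print(np.allclose(A @ B, B @ A))                      # False: not commutative in general
print(np.allclose(A @ (B @ C), (A @ B) @ C))          # True: associative
print(np.allclose(A @ I, A), np.allclose(I @ A, A))   # True True: identity
```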

Inverse and Transpose

\[
\begin{align}
1=\text{“identity”}\\
3(3^{-1})=3 \times \frac13 =1\\
0^{-1}: \text{undefined}
\end{align}
\]

Matrix inverse

If A is an m x m square matrix and it has an inverse, then
\[
\begin{align}
A A^{-1}=A^{-1}A=I
\end{align}
\]
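
A NumPy sketch of the inverse (the example matrix is arbitrary; only square matrices can have an inverse, and not all of them do):

```python
import numpy as np

A = np.array([[3.0, 4.0],
              [2.0, 16.0]])
A_inv = np.linalg.inv(A)

print(np.allclose(A @ A_inv, np.eye(2)))  # True: A A^{-1} = I
print(np.allclose(A_inv @ A, np.eye(2)))  # True: A^{-1} A = I
```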

Matrix Transpose

\[
\displaylines{
A_{m \times n} \text{ and }B=A^T\\
\text{then}\\
B_{n \times m} \text{ and }
B_{ij}=A_{ji}
}
\]
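
And the transpose, where \(B_{ij}=A_{ji}\) (0-indexed in NumPy):

```python
import numpy as np

A = np.array([[1, 2, 0],
              [3, 5, 9]])
B = A.T                    # transpose: rows become columns

print(A.shape, B.shape)    # (2, 3) (3, 2)
print(B[2, 0] == A[0, 2])  # True: B_ij equals A_ji
```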
