
Machine Learning [Week 4/11] Neural Networks: Representation


Lecture notes from Machine Learning (Stanford University).

  • Contents
    • Neural network forward propagation
    • Multiple classification using logistic regression and the one-vs-all technique
  • Notes
    • In this course the bias \(b\) is introduced as the extra unit \(x_0\), and the dimensions of \(\Theta\) are worked out with it included (compare the two notations below).
    • \(z=wx+b\)
    • \( w.shape:(n^{[l]},n^{[l-1]})\)
    • \( \Theta.shape:(s_{j+1},s_j+1)\): the +1 accounts for the bias (worked example below)
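
As a quick check of the dimension rule above (the layer sizes here are chosen only for illustration): a layer with \(s_j = 2\) units feeding a layer with \(s_{j+1} = 4\) units gives
\[
\Theta^{(j)} \in \Bbb R^{4 \times 3}
\]
i.e. 4 rows (one per unit in layer \(j+1\)) and \(2+1=3\) columns (the two inputs plus the bias unit). In the \(w,b\) notation the same mapping would be a \(4 \times 2\) matrix \(w\) plus a separate bias vector \(b\).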

Motivations

Non-linear Hypotheses

Picture (slide example): when the input is raw image pixels, the number of features \(n\) becomes too large to handle with non-linear (polynomial) logistic regression.

Neurons and the Brain

Origins: Algorithms that try to mimic the brain. 80s-early 90s; popularity diminished in late 90s.
Recent resurgence: State-of-the-art technique for many applications.

  • The “one learning algorithm” hypothesis
    • Auditory cortex learns to see.
    • Somatosensory cortex learns to see.
  • Sensor representations in the brain
    • Seeing with your tongue.
    • Human echolocation (sonar).
    • Haptic belt: Direction sense.
    • Implanting a 3rd eye.
    • Plug a sensor into almost any part of the brain and the brain learns how to use it.
  • In this course, neural networks are treated as a machine learning technique, not as an attempt to build artificial intelligence.

Neural Networks

  • Neurons in the brain
  • Neuron model: Logistic unit
    • \(x_0=1\): bias unit
      • included because it is convenient
    • \(g(z)\): Sigmoid (logistic) activation function.
    • \(\theta\): weight
  • Notation
    • \[
      \begin{align*}& a_i^{(j)} = \text{“activation” of unit $i$ in layer $j$} \newline& \Theta^{(j)} = \text{matrix of weights controlling function mapping from layer $j$ to layer $j+1$}\end{align*}
      \]
  • one hidden layer
    • \[
      \begin{bmatrix}x_0 \newline x_1 \newline x_2 \newline x_3\end{bmatrix}\rightarrow\begin{bmatrix}a_1^{(2)} \newline a_2^{(2)} \newline a_3^{(2)} \newline \end{bmatrix}\rightarrow h_\theta(x)
      \]
  • The values of the “activation” nodes
    • \[
      \begin{align*} a_1^{(2)} = g(\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3) \newline a_2^{(2)} = g(\Theta_{20}^{(1)}x_0 + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3) \newline a_3^{(2)} = g(\Theta_{30}^{(1)}x_0 + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3) \newline h_\Theta(x) = a_1^{(3)} = g(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} + \Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)}) \newline \end{align*}
      \]
  • The dimensions of these matrices of weights
    • If layer \( j \) has \( s_j \) units and layer \( j+1 \) has \( s_{j+1} \) units, then \[
      \Theta^{(j)}\in\Bbb R^{( s_{j+1} \times (s_j + 1) )}
      \]
  • Forward propagation: Vectorized implementation (see the NumPy sketch after this list)
    • \[
      \begin{align*}a_1^{(2)} = g(z_1^{(2)}) \newline a_2^{(2)} = g(z_2^{(2)}) \newline a_3^{(2)} = g(z_3^{(2)}) \newline \end{align*}
      \]
    • For layer \(j=2\) and node \(k\), the variable \(z\) is: \[
      z_k^{(2)} = \Theta_{k,0}^{(1)}x_0 + \Theta_{k,1}^{(1)}x_1 + \cdots + \Theta_{k,n}^{(1)}x_n
      \]
    • Vector representation of \(x\) and \(z^{(j)}\):\[
      \begin{align*}x = \begin{bmatrix}x_0 \newline x_1 \newline\cdots \newline x_n\end{bmatrix} &z^{(j)} = \begin{bmatrix}z_1^{(j)} \newline z_2^{(j)} \newline\cdots \newline z_n^{(j)}\end{bmatrix}\end{align*}
      \]
    • \[
      z^{(j)} = \Theta^{(j-1)}a^{(j-1)}
      \]
    • \[
      a^{(j)} = g(z^{(j)})
      \]
    • \[
      z^{(j+1)} = \Theta^{(j)}a^{(j)}
      \]
    • \[
      h_\Theta(x) = a^{(j+1)} = g(z^{(j+1)})
      \]
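
As a concrete illustration of the vectorized forward propagation above, here is a minimal NumPy sketch. The layer sizes and weights are made up for illustration (the course itself uses Octave/MATLAB); it only transcribes \(z^{(j+1)} = \Theta^{(j)}a^{(j)}\) and \(a^{(j+1)} = g(z^{(j+1)})\).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagation(x, thetas):
    """h_Theta(x) for a list of weight matrices Theta^(1), Theta^(2), ...
    Each Theta^(j) has shape (s_{j+1}, s_j + 1); the extra column multiplies the bias unit."""
    a = x                                  # a^(1) = x (without the bias unit)
    for theta in thetas:
        a = np.concatenate(([1.0], a))     # prepend the bias unit a_0 = 1
        z = theta @ a                      # z^(j+1) = Theta^(j) a^(j)
        a = sigmoid(z)                     # a^(j+1) = g(z^(j+1))
    return a                               # h_Theta(x) = a^(L)

# Example: 3 inputs -> 3 hidden units -> 1 output (weights are arbitrary).
rng = np.random.default_rng(0)
theta1 = rng.normal(size=(3, 4))   # layer 1 (3 units + bias) -> layer 2 (3 units)
theta2 = rng.normal(size=(1, 4))   # layer 2 (3 units + bias) -> layer 3 (1 unit)
x = np.array([0.5, -1.2, 2.0])
print(forward_propagation(x, [theta1, theta2]))
```

The only bookkeeping to get right is the shapes: each \(\Theta^{(j)}\) has one more column than layer \(j\) has units, because of the bias entry prepended to \(a^{(j)}\).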

Applications

Examples and Intuitions I

  • Function: logical AND, i.e. the output is 1 only when both \(x_1\) and \(x_2\) are 1 (verified numerically in the sketch after this list).
  • The graph of our functions will look like:\[
    \begin{align*}\begin{bmatrix}x_0 \newline x_1 \newline x_2\end{bmatrix} \rightarrow\begin{bmatrix}g(z^{(2)})\end{bmatrix} \rightarrow h_\Theta(x)\end{align*}
    \]
  • first matrix:\[
    \Theta^{(1)} =\begin{bmatrix}-30 & 20 & 20\end{bmatrix}
    \]
  • \[
    \begin{align*}& h_\Theta(x) = g(-30 + 20x_1 + 20x_2) \newline \newline & x_1 = 0 \ \ and \ \ x_2 = 0 \ \ then \ \ g(-30) \approx 0 \newline & x_1 = 0 \ \ and \ \ x_2 = 1 \ \ then \ \ g(-10) \approx 0 \newline & x_1 = 1 \ \ and \ \ x_2 = 0 \ \ then \ \ g(-10) \approx 0 \newline & x_1 = 1 \ \ and \ \ x_2 = 1 \ \ then \ \ g(10) \approx 1\end{align*}
    \]
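
A quick numerical check of the AND unit above; only the weights \(-30, 20, 20\) come from the lecture, the code itself is just a sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-30.0, 20.0, 20.0])   # [bias, weight for x1, weight for x2]

for x1 in (0, 1):
    for x2 in (0, 1):
        h = sigmoid(theta @ np.array([1.0, x1, x2]))   # x0 = 1 is the bias unit
        print(f"x1={x1} x2={x2} -> h={h:.4f}")
# Prints ~0 for every row except x1=1, x2=1, where it prints ~1: the unit computes x1 AND x2.
```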

Examples and Intuitions II

  • Build intuition by taking the functions to be AND, NOR, and OR; combined in a two-layer network they compute XNOR (see the sketch below).
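
A minimal sketch of that composition, using the weight values from the lecture: an AND unit and a (NOT \(x_1\)) AND (NOT \(x_2\)) unit in the hidden layer, followed by an OR unit, together compute \(x_1\) XNOR \(x_2\).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta1 = np.array([[-30.0,  20.0,  20.0],   # hidden a1: x1 AND x2
                   [ 10.0, -20.0, -20.0]])  # hidden a2: (NOT x1) AND (NOT x2)
theta2 = np.array([[-10.0,  20.0,  20.0]])  # output:   a1 OR a2

for x1 in (0, 1):
    for x2 in (0, 1):
        a = sigmoid(theta1 @ np.array([1.0, x1, x2]))        # hidden layer
        h = sigmoid(theta2 @ np.concatenate(([1.0], a)))     # output layer
        print(f"x1={x1} x2={x2} -> h={h[0]:.4f}")
# ~1 when x1 == x2, ~0 otherwise, i.e. x1 XNOR x2.
```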

Multiclass Classification

  • We can define our set of resulting classes as y:\[
    y^{(i)}=\begin{bmatrix}1\\0\\0\\0\end{bmatrix}, \begin{bmatrix}0\\1\\0\\0\end{bmatrix}, \begin{bmatrix}0\\0\\1\\0\end{bmatrix}, \begin{bmatrix}0\\0\\0\\1\end{bmatrix}
    \]
  • Our final hypothesis function setup:\[
    \begin{bmatrix}x_0\\x_1\\x_2\\x_3\end{bmatrix}\rightarrow \begin{bmatrix}a_0^{(2)}\\ a_1^{(2)}\\a_2^{(2)}\\ a_3^{(2)}\end{bmatrix} \rightarrow \begin{bmatrix} a_0^{(3)}\\ a_1^{(3)} \\ a_2^{(3)}\\ a_3^{(3)}\end{bmatrix} \rightarrow\dots\rightarrow\begin{bmatrix}h_\Theta (x)_1\\ h_\Theta (x)_2 \\ h_\Theta (x)_3 \\ h_\Theta (x)_4 \end{bmatrix}
    \]
  • Our resulting hypothesis for one set of inputs may look like (a small prediction sketch follows this list):\[
    h_\Theta(x) =\begin{bmatrix}0 \newline 0 \newline 1 \newline 0 \newline\end{bmatrix}
    \]
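
To make the one-vs-all prediction concrete, here is a small sketch of how such an output vector is turned into a class label; the numeric values of \(h\) below are invented for illustration.

```python
import numpy as np

# Hypothetical outputs of the four "one-vs-all" units for one example x.
h = np.array([0.02, 0.10, 0.91, 0.05])   # h_Theta(x)

predicted = int(np.argmax(h))            # pick the unit with the largest activation

# The one-hot label y^{(i)} the network is trained to match for that class:
y = np.zeros(4)
y[predicted] = 1.0
print(predicted, y)                      # 2 [0. 0. 1. 0.]
```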
