Multiple Linear Regression

view jupyter notebook code

List of Tensorflow 2.0 Tutorials

We previously discussed the Simple Linear Regression Model. This time, we will look at the hypothesis, cost function, and cost minimization of the Multiple Linear Regression Model.

(Review) Simple Linear Regression Model

Hypothesis

\[H(x) = Wx + b\]

Cost Function

\[cost(W) = {1 \over 2m} {\sum_{i=1}^m} (Wx_i-y_i)^2\]

Gradient Descent Algorithm

\[W := W - \alpha{1 \over m} {\sum_{i=1}^m} (Wx_i-y_i) x_i\]
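Putting these pieces together, a minimal NumPy sketch of this update rule for the simple model might look like the following (the toy data, starting weight, and learning rate are made-up values for illustration; the bias is omitted, as in the cost above):

import numpy as np

# Toy data for H(x) = Wx (illustrative values only)
x = np.array([1., 2., 3.])
y = np.array([2., 4., 6.])

W = 5.0            # arbitrary starting weight
alpha = 0.1        # learning rate
m = len(x)

for step in range(10):
    cost = np.sum((W * x - y) ** 2) / (2 * m)   # cost(W) above
    grad = np.sum((W * x - y) * x) / m          # gradient term in the update rule
    W -= alpha * grad                           # W := W - alpha * grad
    print(step, round(W, 4), round(cost, 4))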

Multiple Linear Regression Model

Let’s take a look at the model with 3 features.

Hypothesis and Cost Function

\[H(x_1, x_2, x_3) = w_1x_1+w_2x_2+w_3x_3+b\] \[cost(W, b) = {1 \over m} {\sum_{i=1}^m} (H(x_{i1}, x_{i2}, x_{i3})-y_i)^2\]

Hypothesis Using Matrix

\[H(X)=XW= \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \cdot \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} = \begin{pmatrix} x_1w_1 + x_2w_2 + x_3w_3 \end{pmatrix}\]
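To see that the matrix product is just the weighted sum written compactly, here is a small NumPy check (the sample values and weights are arbitrary):

import numpy as np

X = np.array([[80., 75., 90.]])        # one sample (x1, x2, x3) as a (1, 3) row vector
W = np.array([[0.5], [0.3], [0.2]])    # (w1, w2, w3) as a (3, 1) column vector

print(X @ W)                           # [[80.5]] -- the matrix product XW
print(80.*0.5 + 75.*0.3 + 90.*0.2)     # 80.5     -- the same weighted sum, term by term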

Predicting exam score

Suppose we want to predict a final exam score from the quiz 1, quiz 2, and midterm scores. Five data points are given as follows.

| $x_1$ (Quiz1) | $x_2$ (Quiz2) | $x_3$ (Midterm) | $y$ (Final) |
|---------------|---------------|-----------------|-------------|
| 80            | 75            | 90              | 85          |
| 65            | 55            | 60              | 70          |
| 35            | 55            | 45              | 60          |
| 78            | 85            | 80              | 78          |
| 95            | 90            | 94              | 98          |

Hypothesis in this case

\[H(X) = XW\] \[\begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & x_{33} \\ x_{41} & x_{42} & x_{43} \\ x_{51} & x_{52} & x_{53} \end{pmatrix} \cdot \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} = \begin{pmatrix} x_{11}w_{1} + x_{12}w_{2} + x_{13}w_{3} \\ x_{21}w_{1} + x_{22}w_{2} + x_{23}w_{3} \\ x_{31}w_{1} + x_{32}w_{2} + x_{33}w_{3} \\ x_{41}w_{1} + x_{42}w_{2} + x_{43}w_{3} \\ x_{51}w_{1} + x_{52}w_{2} + x_{53}w_{3} \\ \end{pmatrix}\]
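With five samples, $X$ has shape $(5, 3)$ and $W$ has shape $(3, 1)$, so $XW$ yields one prediction per sample as a $(5, 1)$ matrix. A quick shape check in NumPy (with placeholder weights) might look like this:

import numpy as np

X = np.array([[80., 75., 90.],
              [65., 55., 60.],
              [35., 55., 45.],
              [78., 85., 80.],
              [95., 90., 94.]])        # the five samples from the table, shape (5, 3)
W = np.ones((3, 1))                    # placeholder weights, shape (3, 1)

print((X @ W).shape)                   # (5, 1): one prediction per sample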

Libraries

import numpy as np
import tensorflow as tf
print("TensorFlow Version: %s" % tf.__version__)
TensorFlow Version: 2.0.0

A. Computing without matrix operations

The problem can also be solved without matrix operations, but assigning each variable individually is cumbersome, as shown below.

# Data
x1 = [80., 65., 35., 78., 95.]
x2 = [75., 55., 55., 85., 90.]
x3 = [90., 60., 45., 80., 94.]
y  = [85., 70., 60., 78., 98.]

# Weights
tf.random.set_seed(2020)
w1 = tf.Variable(tf.random.normal([1], mean=0.0))
w2 = tf.Variable(tf.random.normal([1], mean=0.0))
w3 = tf.Variable(tf.random.normal([1], mean=0.0))
b  = tf.Variable(tf.random.normal([1], mean=0.0))

# Learning Rate
learning_rate = 0.0000001

# Training
for i in range(2000+1):

    with tf.GradientTape() as tape:
        # Record the forward pass so the tape can differentiate the cost
        hypothesis = w1*x1 + w2*x2 + w3*x3 + b
        cost = tf.reduce_mean(tf.square(hypothesis - y))

    # Compute gradients outside the recording context
    w1_grad, w2_grad, w3_grad, b_grad = tape.gradient(cost, [w1, w2, w3, b])

    # Gradient descent update for each parameter
    w1.assign_sub(learning_rate * w1_grad)
    w2.assign_sub(learning_rate * w2_grad)
    w3.assign_sub(learning_rate * w3_grad)
    b.assign_sub(learning_rate * b_grad)

    if i % 200 == 0:
        print("#%s \t cost: %s" % (i, cost.numpy()))
#0 	 cost: 19095.684
#200 	 cost: 5163.9473
#400 	 cost: 1452.9823
#600 	 cost: 464.43408
#800 	 cost: 201.04555
#1000 	 cost: 130.81319
#1200 	 cost: 112.03182
#1400 	 cost: 106.954956
#1600 	 cost: 105.52873
#1800 	 cost: 105.075134
#2000 	 cost: 104.88092
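Once training finishes, the learned w1, w2, w3, and b can be used to predict a final score for a new student. The quiz and midterm scores below are hypothetical values, shown only to illustrate the call:

# Predict a final score for a new student (hypothetical input scores)
new_x1, new_x2, new_x3 = 70., 80., 75.
prediction = w1 * new_x1 + w2 * new_x2 + w3 * new_x3 + b
print(prediction.numpy())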

B. Computing with matrix operations

Unlike the method above, matrix operations allow the problem to be solved much more concisely.

# Data
data = np.array([
    [80., 75., 90., 85.],
    [65., 55., 60., 70.],
    [35., 55., 45., 60.],
    [78., 85., 80., 78.],
    [95., 90., 94., 98.]
], dtype=np.float32)

# Slice Data: the first three columns are the features, the last column is the label
X = data[:, :-1]
Y = data[:, [-1]]
X
array([[80., 75., 90.],
       [65., 55., 60.],
       [35., 55., 45.],
       [78., 85., 80.],
       [95., 90., 94.]], dtype=float32)
Y
array([[85.],
       [70.],
       [60.],
       [78.],
       [98.]], dtype=float32)
# Weights
tf.random.set_seed(2020)
W = tf.Variable(tf.random.normal([3, 1], mean=0.0))
b = tf.Variable(tf.random.normal([1], mean=0.0))

# Learning Rate
learning_rate = 0.0000001

# Hypothesis and Prediction Function
def predict(X):
    return tf.matmul(X, W) + b

# Training
for i in range(2000+1):

    with tf.GradientTape() as tape:
        # Record the forward pass so the tape can differentiate the cost
        cost = tf.reduce_mean(tf.square(predict(X) - Y))

    # Compute gradients outside the recording context
    W_grad, b_grad = tape.gradient(cost, [W, b])

    # Gradient descent update
    W.assign_sub(learning_rate * W_grad)
    b.assign_sub(learning_rate * b_grad)

    if i % 200 == 0:
        print("#%s \t cost: %s" % (i, cost.numpy()))
#0 	 cost: 7769.522
#200 	 cost: 2113.2505
#400 	 cost: 606.6113
#600 	 cost: 205.28519
#800 	 cost: 98.37294
#1000 	 cost: 69.8826
#1200 	 cost: 62.28067
#1400 	 cost: 60.242634
#1600 	 cost: 59.686646
#1800 	 cost: 59.52544
#2000 	 cost: 59.46943
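Continuing from the training loop above, the learned parameters can be inspected directly, and predict() can score a new sample. The input scores below are hypothetical, shown only to illustrate the call:

# Inspect the learned parameters and predict for a new student (hypothetical scores)
print(W.numpy())                       # learned weights, shape (3, 1)
print(b.numpy())                       # learned bias, shape (1,)

new_X = np.array([[70., 80., 75.]], dtype=np.float32)   # hypothetical quiz1, quiz2, midterm
print(predict(new_X).numpy())          # predicted final score, shape (1, 1)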


List of Tensorflow 2.0 Tutorials
