Multiple Linear Regression

view jupyter notebook code

List of Tensorflow 2.0 Tutorials

We previously discussed the Simple Linear Regression Model. This time, we will look at the hypothesis, cost function, and cost minimization of the Multiple Linear Regression Model.

(Review) Simple Linear Regression Model

Hypothesis

\[H(x) = Wx + b\]

Cost Function

\[cost(W) = {1 \over 2m} {\sum_{i=1}^m} (Wx_i-y_i)^2\]

Gradient Descent Algorithm

\[W := W - \alpha{1 \over m} {\sum_{i=1}^m} (Wx_i-y_i) x_i\]
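Putting these pieces together, a minimal NumPy sketch of this update rule for the simple model might look like the following (the toy data, starting weight, and learning rate are made-up values for illustration; the bias is omitted, as in the cost above):

import numpy as np

# Toy data for H(x) = Wx (illustrative values only)
x = np.array([1., 2., 3.])
y = np.array([2., 4., 6.])

W = 5.0            # arbitrary starting weight
alpha = 0.1        # learning rate
m = len(x)

for step in range(10):
    cost = np.sum((W * x - y) ** 2) / (2 * m)   # cost(W) above
    grad = np.sum((W * x - y) * x) / m          # gradient term in the update rule
    W -= alpha * grad                           # W := W - alpha * grad
    print(step, round(W, 4), round(cost, 4))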

Multiple Linear Regression Model

Let’s take a look at the model with 3 features.

Hypothesis and Cost Function

\[H(x_1, x_2, x_3) = w_1x_1+w_2x_2+w_3x_3+b\] \[cost(W, b) = {1 \over m} {\sum_{i=1}^m} (H(x_{i1}, x_{i2}, x_{i3})-y_i)^2\]

Hypothesis Using Matrix

\[H(X)=XW= \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \cdot \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} = \begin{pmatrix} x_1w_1 + x_2w_2 + x_3w_3 \end{pmatrix}\]
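To see that the matrix product is just the weighted sum written compactly, here is a small NumPy check (the sample values and weights are arbitrary):

import numpy as np

X = np.array([[80., 75., 90.]])        # one sample (x1, x2, x3) as a (1, 3) row vector
W = np.array([[0.5], [0.3], [0.2]])    # (w1, w2, w3) as a (3, 1) column vector

print(X @ W)                           # [[80.5]] -- the matrix product XW
print(80.*0.5 + 75.*0.3 + 90.*0.2)     # 80.5     -- the same weighted sum, term by term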

Predicting exam score

Suppose we want to predict a final exam score from the quiz 1, quiz 2, and midterm scores. Five data points are given as follows.

| $x_1$ (Quiz1) | $x_2$ (Quiz2) | $x_3$ (Midterm) | $y$ (Final) |
|---------------|---------------|-----------------|-------------|
| 80            | 75            | 90              | 85          |
| 65            | 55            | 60              | 70          |
| 35            | 55            | 45              | 60          |
| 78            | 85            | 80              | 78          |
| 95            | 90            | 94              | 98          |

Hypothesis in this case

\[H(X) = XW\] \[\begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & x_{33} \\ x_{41} & x_{42} & x_{43} \\ x_{51} & x_{52} & x_{53} \end{pmatrix} \cdot \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} = \begin{pmatrix} x_{11}w_{1} + x_{12}w_{2} + x_{13}w_{3} \\ x_{21}w_{1} + x_{22}w_{2} + x_{23}w_{3} \\ x_{31}w_{1} + x_{32}w_{2} + x_{33}w_{3} \\ x_{41}w_{1} + x_{42}w_{2} + x_{43}w_{3} \\ x_{51}w_{1} + x_{52}w_{2} + x_{53}w_{3} \\ \end{pmatrix}\]
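With five samples, $X$ has shape $(5, 3)$ and $W$ has shape $(3, 1)$, so $XW$ yields one prediction per sample as a $(5, 1)$ matrix. A quick shape check in NumPy (with placeholder weights) might look like this:

import numpy as np

X = np.array([[80., 75., 90.],
              [65., 55., 60.],
              [35., 55., 45.],
              [78., 85., 80.],
              [95., 90., 94.]])        # the five samples from the table, shape (5, 3)
W = np.ones((3, 1))                    # placeholder weights, shape (3, 1)

print((X @ W).shape)                   # (5, 1): one prediction per sample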

Libraries

import numpy as np
import tensorflow as tf
print("TensorFlow Version: %s" % tf.__version__)
TensorFlow Version: 2.0.0

A. Computing without matrix operations

The problem can also be solved without matrix operations, but assigning each variable individually is cumbersome, as shown below.

# Data
x1 = [80., 65., 35., 78., 95.]
x2 = [75., 55., 55., 85., 90.]
x3 = [90., 60., 45., 80., 94.]
y  = [85., 70., 60., 78., 98.]

# Weights
tf.random.set_seed(2020)
w1 = tf.Variable(tf.random.normal([1], mean=0.0))
w2 = tf.Variable(tf.random.normal([1], mean=0.0))
w3 = tf.Variable(tf.random.normal([1], mean=0.0))
b  = tf.Variable(tf.random.normal([1], mean=0.0))

# Learning Rate
learning_rate = 0.0000001

# Training
for i in range(2000+1):

    with tf.GradientTape() as tape:
        # Record the forward pass so the tape can differentiate the cost
        hypothesis = w1*x1 + w2*x2 + w3*x3 + b
        cost = tf.reduce_mean(tf.square(hypothesis - y))

    # Compute gradients outside the recording context
    w1_grad, w2_grad, w3_grad, b_grad = tape.gradient(cost, [w1, w2, w3, b])

    # Gradient descent update for each parameter
    w1.assign_sub(learning_rate * w1_grad)
    w2.assign_sub(learning_rate * w2_grad)
    w3.assign_sub(learning_rate * w3_grad)
    b.assign_sub(learning_rate * b_grad)

    if i % 200 == 0:
        print("#%s \t cost: %s" % (i, cost.numpy()))
#0 	 cost: 19095.684
#200 	 cost: 5163.9473
#400 	 cost: 1452.9823
#600 	 cost: 464.43408
#800 	 cost: 201.04555
#1000 	 cost: 130.81319
#1200 	 cost: 112.03182
#1400 	 cost: 106.954956
#1600 	 cost: 105.52873
#1800 	 cost: 105.075134
#2000 	 cost: 104.88092
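Once training finishes, the learned w1, w2, w3, and b can be used to predict a final score for a new student. The quiz and midterm scores below are hypothetical values, shown only to illustrate the call:

# Predict a final score for a new student (hypothetical input scores)
new_x1, new_x2, new_x3 = 70., 80., 75.
prediction = w1 * new_x1 + w2 * new_x2 + w3 * new_x3 + b
print(prediction.numpy())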

B. Computing with matrix operations

Unlike the method above, matrix operations allow the problem to be solved much more concisely.

# Data
data = np.array([
    [80., 75., 90., 85.],
    [65., 55., 60., 70.],
    [35., 55., 45., 60.],
    [78., 85., 80., 78.],
    [95., 90., 94., 98.]
], dtype=np.float32)

# Slice Data: the first three columns are the features, the last column is the label
X = data[:, :-1]
Y = data[:, [-1]]
X
array([[80., 75., 90.],
       [65., 55., 60.],
       [35., 55., 45.],
       [78., 85., 80.],
       [95., 90., 94.]], dtype=float32)
Y
array([[85.],
       [70.],
       [60.],
       [78.],
       [98.]], dtype=float32)
# Weights
tf.random.set_seed(2020)
W = tf.Variable(tf.random.normal([3, 1], mean=0.0))
b = tf.Variable(tf.random.normal([1], mean=0.0))

# Learning Rate
learning_rate = 0.0000001

# Hypothesis and Prediction Function
def predict(X):
    return tf.matmul(X, W) + b

# Training
for i in range(2000+1):

    with tf.GradientTape() as tape:
        # Record the forward pass so the tape can differentiate the cost
        cost = tf.reduce_mean(tf.square(predict(X) - Y))

    # Compute gradients outside the recording context
    W_grad, b_grad = tape.gradient(cost, [W, b])

    # Gradient descent update
    W.assign_sub(learning_rate * W_grad)
    b.assign_sub(learning_rate * b_grad)

    if i % 200 == 0:
        print("#%s \t cost: %s" % (i, cost.numpy()))
#0 	 cost: 7769.522
#200 	 cost: 2113.2505
#400 	 cost: 606.6113
#600 	 cost: 205.28519
#800 	 cost: 98.37294
#1000 	 cost: 69.8826
#1200 	 cost: 62.28067
#1400 	 cost: 60.242634
#1600 	 cost: 59.686646
#1800 	 cost: 59.52544
#2000 	 cost: 59.46943
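Continuing from the training loop above, the learned parameters can be inspected directly, and predict() can score a new sample. The input scores below are hypothetical, shown only to illustrate the call:

# Inspect the learned parameters and predict for a new student (hypothetical scores)
print(W.numpy())                       # learned weights, shape (3, 1)
print(b.numpy())                       # learned bias, shape (1,)

new_X = np.array([[70., 80., 75.]], dtype=np.float32)   # hypothetical quiz1, quiz2, midterm
print(predict(new_X).numpy())          # predicted final score, shape (1, 1)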


List of Tensorflow 2.0 Tutorials
