Machine Learning Programming Workshop

2.1E Linear Regression in Machine Learning

Prepared By: Cheong Shiu Hong (FTFNCE)



In [1]:
import numpy as np # Linear Algebra
import pandas as pd # Data Frames
import matplotlib.pyplot as plt # Visualization
from mpl_toolkits.mplot3d import axes3d # 3D Visualization
import ipywidgets as widgets # Interactivity
from IPython.display import display # Display Widgets
In [2]:
%matplotlib notebook


5) Linear Regression with the Boston Housing Dataset


Import the Boston Housing Dataset from the Scikit-Learn Library

Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so running this notebook as-is assumes an older scikit-learn version.

In [3]:
import sklearn.datasets as datasets
import time # To Track Time
In [4]:
boston = datasets.load_boston()


Let's Check Out the Dataset

In [5]:
boston.keys()
Out[5]:
dict_keys(['data', 'target', 'feature_names', 'DESCR', 'filename'])
In [6]:
print("Feature Names:\n", boston['feature_names'])
Feature Names:
 ['CRIM' 'ZN' 'INDUS' 'CHAS' 'NOX' 'RM' 'AGE' 'DIS' 'RAD' 'TAX' 'PTRATIO'
 'B' 'LSTAT']

Shape of X and Y

In [7]:
boston['data'].shape, boston['target'].shape
Out[7]:
((506, 13), (506,))


Features

In [8]:
df = pd.DataFrame(boston['data'], columns=boston['feature_names'])
df.head()
Out[8]:
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT
0 0.00632 18.0 2.31 0.0 0.538 6.575 65.2 4.0900 1.0 296.0 15.3 396.90 4.98
1 0.02731 0.0 7.07 0.0 0.469 6.421 78.9 4.9671 2.0 242.0 17.8 396.90 9.14
2 0.02729 0.0 7.07 0.0 0.469 7.185 61.1 4.9671 2.0 242.0 17.8 392.83 4.03
3 0.03237 0.0 2.18 0.0 0.458 6.998 45.8 6.0622 3.0 222.0 18.7 394.63 2.94
4 0.06905 0.0 2.18 0.0 0.458 7.147 54.2 6.0622 3.0 222.0 18.7 396.90 5.33

Label

In [9]:
pd.Series(boston['target']).head()
Out[9]:
0    24.0
1    21.6
2    34.7
3    33.4
4    36.2
dtype: float64


Split the Data into Training and Validation Sets

In [10]:
X = boston['data'][:450]
Y = boston['target'][:450]
X_val = boston['data'][450:]
Y_val = boston['target'][450:]
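
Here the first 450 rows are used for training and the remaining 56 for validation. The split is sequential rather than shuffled; as an alternative (not part of the original workshop), scikit-learn's train_test_split produces a randomized split:

from sklearn.model_selection import train_test_split

# A minimal sketch, not from the original notebook: a shuffled split instead
# of taking the last 56 rows as the validation set.
X_tr, X_va, Y_tr, Y_va = train_test_split(
    boston['data'], boston['target'], test_size=56, random_state=0)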


Define the Model

In [11]:
def model(theta, X):
    return np.dot(theta, X) # Vectorized prediction: theta (n_params,) dot X (n_params, n_samples) -> (n_samples,)
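
As a quick shape check (the values below are illustrative, not from the workshop): theta holds one entry per parameter, the design matrix holds one row per parameter and one column per sample, so the dot product returns one prediction per sample.

# Illustrative shape check with hypothetical values
demo_theta = np.zeros(14)               # 13 feature weights + 1 intercept
demo_X = np.ones((14, 5))               # design matrix for 5 samples
print(model(demo_theta, demo_X).shape)  # -> (5,)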


Define the Training Algorithm

In [12]:
def train(x, y, learning_rate=3e-6, iterations=1, first=False):
    global theta, prev_theta
    prev_theta = theta

    X = np.vstack([np.ones(y.shape[0]), x]) # Prepend a row of ones for the intercept term

    for _ in range(iterations):
        # Model
        pred = model(theta, X)

        # Gradient computation
        error = pred - y
        cost = np.mean(error**2)
        dcost_dtheta = np.mean(X * error, 1) # Gradient of the MSE (the constant factor of 2 is absorbed into the learning rate)
        theta = theta - (dcost_dtheta * learning_rate) # Gradient descent step

    return cost, dcost_dtheta
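
Since this is ordinary least squares, a closed-form solution also exists. The sketch below is not part of the original workshop; it uses np.linalg.lstsq as an independent check on what gradient descent should converge towards.

# A minimal sketch, not in the original notebook: closed-form least squares
# on the same design matrix, for comparison with gradient descent.
Xb = np.vstack([np.ones(Y.shape[0]), X.T])           # (14, 450) design matrix with bias row
theta_ls, *_ = np.linalg.lstsq(Xb.T, Y, rcond=None)  # minimizes ||Xb.T @ theta - Y||^2
print(np.mean((model(theta_ls, Xb) - Y)**2))         # training MSE of the exact solution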


Initialize a Random Parameter Vector 'theta' to be Trained

Since there are 13 features, we need 14 trainable parameters (13 feature weights + 1 intercept).

In [13]:
theta = np.random.randn(X.shape[1]+1)
print(theta.shape)
(14,)


Train the Parameters for 25 Epochs of 30,000 Iterations Each

In [14]:
epochs = 25

total_time = time.time()
start = time.time()

for i in range(1, epochs+1):
    lr = 5e-6 if i <= 15 else 2e-6
    cost, dcost_dtheta = train(X.T, Y, learning_rate=lr, iterations=30000)
    print('Epoch {} - Cost: {:.3f}\nTime: {:.2f}s\n'.format(i, cost, time.time()-start))
    start = time.time()

print('Total Time Taken: {:.2f}s'.format(time.time()-total_time))
Epoch 1 - Cost: 46.683
Time: 1.37s

Epoch 2 - Cost: 41.318
Time: 1.39s

Epoch 3 - Cost: 38.236
Time: 1.38s

Epoch 4 - Cost: 35.830
Time: 1.40s

Epoch 5 - Cost: 33.889
Time: 1.40s

Epoch 6 - Cost: 32.318
Time: 1.42s

Epoch 7 - Cost: 31.044
Time: 1.39s

Epoch 8 - Cost: 30.010
Time: 1.36s

Epoch 9 - Cost: 29.172
Time: 1.36s

Epoch 10 - Cost: 28.491
Time: 1.35s

Epoch 11 - Cost: 27.939
Time: 1.37s

Epoch 12 - Cost: 27.490
Time: 1.37s

Epoch 13 - Cost: 27.126
Time: 1.37s

Epoch 14 - Cost: 26.829
Time: 1.36s

Epoch 15 - Cost: 26.589
Time: 1.35s

Epoch 16 - Cost: 26.505
Time: 1.35s

Epoch 17 - Cost: 26.429
Time: 1.35s

Epoch 18 - Cost: 26.358
Time: 1.36s

Epoch 19 - Cost: 26.293
Time: 1.37s

Epoch 20 - Cost: 26.234
Time: 1.36s

Epoch 21 - Cost: 26.179
Time: 1.36s

Epoch 22 - Cost: 26.128
Time: 1.37s

Epoch 23 - Cost: 26.081
Time: 1.37s

Epoch 24 - Cost: 26.038
Time: 1.37s

Epoch 25 - Cost: 25.999
Time: 1.37s

Total Time Taken: 34.29s


Evaluate the Performance of the Model

In [15]:
Xs = np.vstack([np.ones(Y.shape[0]), X.T])
modelpred = model(theta, Xs)
np.mean((modelpred-Y)**2) # Training Mean Squared Error
Out[15]:
25.9987258139818
In [16]:
Xs = np.vstack([np.ones(Y_val.shape[0]), X_val.T])
modelpred = model(theta, Xs)
np.mean((modelpred-Y_val)**2) # Validation Mean Squared Error
Out[16]:
15.756312117683022


6) Linear Regression with Sci-Kit Learn


Scikit-Learn is a Powerful Python Library with Many Built-In Machine Learning Algorithms

Import Scikit-Learn's LinearRegression Estimator from sklearn.linear_model

In [17]:
from sklearn.linear_model import LinearRegression


Instantiate Linear Regression Object

In [18]:
model = LinearRegression() # Note: this rebinds 'model', which previously referred to our hand-written model function


Fit Model to Data

In [19]:
model.fit(X, Y)
Out[19]:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
         normalize=False)


Evaluate the R² Score of the Fitted Model

In [20]:
model.score(X, Y)
Out[20]:
0.741527173293562
In [21]:
model.score(X_val, Y_val)
Out[21]:
0.37565166960619634
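
LinearRegression.score returns the coefficient of determination R², i.e. 1 − SS_res/SS_tot. A minimal sketch (not from the original notebook) reproducing the training score by hand:

# Reproduce model.score(X, Y) by hand: R^2 = 1 - SS_res / SS_tot
pred = model.predict(X)
ss_res = np.sum((Y - pred)**2)
ss_tot = np.sum((Y - Y.mean())**2)
print(1 - ss_res / ss_tot)  # should match model.score(X, Y)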


Evaluate the MSE of the Fitted Model

In [22]:
skpred = model.predict(X)
np.mean((skpred-Y)**2)
Out[22]:
23.33864094108222
In [23]:
skpred = model.predict(X_val)
np.mean((skpred-Y_val)**2)
Out[23]:
11.407003268828058


Bootstrap Aggregating (Bagging)

Bagging fits many models, each on a bootstrap resample of the training data, and averages their predictions to reduce variance.

In [24]:
num_bags = 250
bag_size = 150
In [25]:
bags = []
for i in range(num_bags):
    idx = np.random.choice(np.arange(X.shape[0]), bag_size) # Sample row indices with replacement (bootstrap sample)
    bags.append([X[idx], Y[idx]])
In [26]:
models = []
for bag in bags:
    models.append(LinearRegression())
    models[-1].fit(bag[0], bag[1])
In [27]:
skpreds = []
for model in models:
    skpreds.append(model.predict(X))
avg_preds = np.array(skpreds).mean(0)
np.mean((avg_preds-Y)**2)
Out[27]:
23.383261612426924
In [28]:
skpreds = []
for model in models:
    skpreds.append(model.predict(X_val))
avg_preds = np.array(skpreds).mean(0)
np.mean((avg_preds-Y_val)**2)
Out[28]:
11.053799301528954
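
Scikit-learn also bundles this resample-and-average procedure into a single estimator, BaggingRegressor. A minimal sketch (not part of the original workshop) with the same bag settings:

from sklearn.ensemble import BaggingRegressor

# Equivalent bagging in one estimator; bootstrap resampling is the default.
bagged = BaggingRegressor(LinearRegression(), n_estimators=num_bags, max_samples=bag_size)
bagged.fit(X, Y)
print(np.mean((bagged.predict(X_val) - Y_val)**2))  # validation MSE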


7) What other algorithms can we use?


Ridge Regression

Ridge regression adds an L2 penalty on the coefficient magnitudes to the least-squares objective, shrinking the weights towards zero.

In [29]:
from sklearn.linear_model import Ridge
In [30]:
ridge_model = Ridge()
In [31]:
ridge_model.fit(X, Y)
Out[31]:
Ridge(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)
In [32]:
ridge_pred = ridge_model.predict(X)
np.mean((ridge_pred - Y)**2)
Out[32]:
23.499611434235288
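
The alpha parameter sets the strength of the L2 penalty (the fit above uses the default alpha=1.0). A minimal sketch, with illustrative alpha values not taken from the workshop, comparing a few settings on the validation split:

# Compare a few illustrative penalty strengths on the held-out split.
for alpha in (0.1, 1.0, 10.0):
    m = Ridge(alpha=alpha).fit(X, Y)
    print(alpha, np.mean((m.predict(X_val) - Y_val)**2))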

Lasso Regression

Lasso regression uses an L1 penalty on the coefficients instead of an L2 penalty.

In [33]:
from sklearn.linear_model import Lasso
In [34]:
lasso_model = Lasso()
In [35]:
lasso_model.fit(X, Y)
Out[35]:
Lasso(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=1000,
   normalize=False, positive=False, precompute=False, random_state=None,
   selection='cyclic', tol=0.0001, warm_start=False)
In [36]:
lasso_pred = lasso_model.predict(X) 
np.mean((lasso_pred - Y)**2)
Out[36]:
27.85918052610639
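
Unlike the L2 penalty, the L1 penalty can drive coefficients exactly to zero, so lasso doubles as a rough feature selector. A minimal sketch (not part of the original workshop) inspecting the fitted coefficients:

# See which features the L1 penalty has zeroed out.
print(pd.Series(lasso_model.coef_, index=boston['feature_names']))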

Support Vector Machine

In [37]:
from sklearn.svm import SVR
In [38]:
SVM = SVR()
In [39]:
SVM.fit(X, Y)
C:\Users\cheon\Anaconda3\lib\site-packages\sklearn\svm\base.py:196: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.
  "avoid this warning.", FutureWarning)
Out[39]:
SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1,
  gamma='auto_deprecated', kernel='rbf', max_iter=-1, shrinking=True,
  tol=0.001, verbose=False)
In [40]:
SVM_pred = SVM.predict(X)
np.mean((SVM_pred - Y)**2)
Out[40]:
76.78542653583092
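
The RBF-kernel SVR is sensitive to the scale of the features (hence the gamma warning above), which is one reason its training MSE is so much higher than the linear models'. A minimal sketch (not part of the original workshop) standardizing the features first:

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Standardize the features before the RBF-kernel SVR; this usually helps
# substantially on unscaled data like this.
scaled_svm = make_pipeline(StandardScaler(), SVR(gamma='scale'))
scaled_svm.fit(X, Y)
print(np.mean((scaled_svm.predict(X) - Y)**2))  # training MSE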

Decision Trees

In [41]:
from sklearn.tree import DecisionTreeRegressor
In [42]:
dec_tree = DecisionTreeRegressor()
In [43]:
dec_tree.fit(X, Y)
Out[43]:
DecisionTreeRegressor(criterion='mse', max_depth=None, max_features=None,
           max_leaf_nodes=None, min_impurity_decrease=0.0,
           min_impurity_split=None, min_samples_leaf=1,
           min_samples_split=2, min_weight_fraction_leaf=0.0,
           presort=False, random_state=None, splitter='best')
In [44]:
dec_tree_pred = dec_tree.predict(X)
np.mean((dec_tree_pred - Y)**2)
Out[44]:
0.0
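
A training MSE of exactly 0 just means the unconstrained tree has memorized the training set; it says nothing about generalization. A minimal sketch (not part of the original workshop; max_depth=4 is an illustrative choice) checking the validation split and a depth-limited tree:

# Judge the tree on the held-out split, and compare a depth-limited tree.
print(np.mean((dec_tree.predict(X_val) - Y_val)**2))
shallow_tree = DecisionTreeRegressor(max_depth=4).fit(X, Y)
print(np.mean((shallow_tree.predict(X_val) - Y_val)**2))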

Ensemble Algorithms: Gradient Boosting

Gradient boosting builds an ensemble of shallow trees, each new tree fitted to the residual errors of the ensemble so far.

In [45]:
from sklearn.ensemble import GradientBoostingRegressor
In [46]:
GBR = GradientBoostingRegressor()
In [47]:
GBR.fit(X, Y)
Out[47]:
GradientBoostingRegressor(alpha=0.9, criterion='friedman_mse', init=None,
             learning_rate=0.1, loss='ls', max_depth=3, max_features=None,
             max_leaf_nodes=None, min_impurity_decrease=0.0,
             min_impurity_split=None, min_samples_leaf=1,
             min_samples_split=2, min_weight_fraction_leaf=0.0,
             n_estimators=100, n_iter_no_change=None, presort='auto',
             random_state=None, subsample=1.0, tol=0.0001,
             validation_fraction=0.1, verbose=0, warm_start=False)
In [48]:
GBR_pred = GBR.predict(X)
np.mean((GBR_pred - Y)**2)
Out[48]:
1.8658266399537913
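
As with the decision tree, the low training MSE should be read alongside the held-out split. A minimal sketch (not part of the original workshop):

# Validation MSE for the boosted ensemble.
print(np.mean((GBR.predict(X_val) - Y_val)**2))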