Hands-On Machine Learning, Lesson 28: Deep Regression Ensembles, a Fast Way to Build Deep Models

施威銘研究室

May 4, 2022

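The trouble with neural networks is that a good model is hard to train. Today we look at a paper that uses ensemble learning to keep the nonlinear capacity of a neural network while gaining the fast training of a linear model.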

1. Model Architecture

Figure 1 shows the architecture of Deep Regression Ensembles. Training such a network takes only the following 8 steps:

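Step 1: Choose the hyperparameters of a normal distribution and the hyperparameters of a uniform distribution.

Step 2: Draw D numbers from the normal distribution of step 1 (assuming the data from the previous layer has D features) to form a weight vector, and draw 1 number from the uniform distribution of step 1 to get a bias.

Step 3: For each sample, take the dot product with the weight vector, add the bias, and pass the result through the ReLU activation function to get a value z.

Step 4: Repeat steps 2 and 3 P times to get a z vector of length P. The input data, originally a matrix of n rows and D columns, becomes a matrix of n rows and P columns.

Step 5: Repeat steps 1 to 4 K times.

Step 6: Choose a regularization hyperparameter.

Step 7: Treat each n-by-P matrix (K matrices in total) as a feature matrix and fit a Ridge regression on it, using the training labels and the regularization parameter from step 6. This yields K prediction vectors, one per matrix. The fit only requires solving a linear system.

Step 8: Repeat steps 6 and 7 L times to get a matrix of n rows and L times K columns. This matrix is the input to the next layer.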
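
The eight steps above can be sketched as a single function. This is a minimal sketch, not the paper's reference implementation: the function name `dre_layer`, the synthetic data, and the alpha values are assumptions made for illustration, and the distribution parameters (standard normal weights, biases uniform on [-1, 1]) simply mirror the ones used later in this article.

```python
import numpy as np
from sklearn.linear_model import Ridge

def dre_layer(x, y, n_hidden, n_ensemble, alphas, rng):
    """Return an (n, len(alphas) * n_ensemble) matrix of Ridge predictions."""
    n, d = x.shape
    columns = []
    for _ in range(n_ensemble):                        # step 5: K random feature sets
        w = rng.normal(0.0, 1.0, size=(d, n_hidden))   # step 2: random weights
        b = rng.uniform(-1.0, 1.0, size=n_hidden)      # step 2: random biases
        z = np.maximum(x @ w + b, 0.0)                 # steps 3-4: ReLU features, n x P
        for alpha in alphas:                           # steps 6-8: L regularization values
            model = Ridge(alpha=alpha).fit(z, y)       # step 7: the only trained part
            columns.append(model.predict(z))
    return np.column_stack(columns)

# Hypothetical toy data, only to show the shapes involved.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(200, 5))
y_demo = np.sin(x_demo[:, 0]) + 0.1 * rng.normal(size=200)

layer_out = dre_layer(x_demo, y_demo, n_hidden=16, n_ensemble=3,
                      alphas=[0.1, 1.0], rng=rng)
print(layer_out.shape)  # (200, 6): n rows, L * K = 2 * 3 columns
```

The returned matrix could be passed to `dre_layer` again as `x` to stack a second layer.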

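Notice that step 3 applies a nonlinear transformation to the data. Of the 8 steps, only step 7 involves training a model, and a Ridge model trains very quickly. Deep Regression Ensembles therefore combine nonlinearity with fast training.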
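
To see why the Ridge step is fast, recall that Ridge regression without an intercept has the closed-form solution beta = (Z^T Z + lambda I)^(-1) Z^T y, so fitting amounts to solving one linear system. The sketch below checks this against scikit-learn on random data; the variable names and the data are assumptions for illustration, and `fit_intercept=False` is set only so that the two solutions are directly comparable.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
z = rng.normal(size=(100, 8))   # stand-in for one n x P feature matrix
y = rng.normal(size=100)        # stand-in for the labels
lam = 0.5                       # regularization strength (assumed value)

# Closed form: solve (Z^T Z + lam * I) beta = Z^T y
beta = np.linalg.solve(z.T @ z + lam * np.eye(z.shape[1]), z.T @ y)

# Same problem solved by scikit-learn, without an intercept term.
sk = Ridge(alpha=lam, fit_intercept=False, solver="cholesky").fit(z, y)

print(np.allclose(beta, sk.coef_, atol=1e-6))
```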

2. Python Implementation

First, we load the libraries and the dataset.

import pandas as pd
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error as mse

# YearPredictionMSD: column 0 is the label, the remaining columns are features.
data = pd.read_csv('YearPredictionMSD.txt',
                   sep=",",
                   header=None)
x = data.iloc[:, 1:]
x = (x - x.mean()) / x.std()  # standardize each feature
y = data.iloc[:, 0]

# Hold out the last 10,000 rows for validation.
x_train = x.iloc[:-10000].to_numpy()
y_train = y.iloc[:-10000].to_numpy()
x_valid = x.iloc[-10000:].to_numpy()
y_valid = y.iloc[-10000:].to_numpy()

Next, we set up the model architecture. P in Figure 1 corresponds to n_hidden in the code, K corresponds to n_ensemble, and L corresponds to n_regularisation.

n_train = len(x_train)
n_feature = len(x_train[0])
n_hidden = 90
n_ensemble = 10
n_regularisation = 10

As in Figure 1, we generate K datasets of size n by P.

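k = []
for _ in range(n_ensemble):
    # Random weights and biases are sampled once and never trained.
    w = np.random.normal(loc = 0, 
                         scale = 1, 
                         size = (n_feature, n_hidden))
    b = np.random.uniform(low = -1, 
                          high = 1, 
                          size = n_hidden)
    a = x_train @ w + b  # pre-activation, shape (n_train, n_hidden)
    z = a * (a > 0)      # ReLU; named a (not y) to avoid shadowing the labels
    k.append(z)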
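
The loop above discards each `w` and `b` after use, so the same random features cannot be recomputed for `x_valid`. One possible way around this, sketched below, is to keep the random parameters and apply them to any dataset. The helper names `make_random_layers` and `transform` and the toy shapes are hypothetical and not part of the original code.

```python
import numpy as np

def make_random_layers(n_feature, n_hidden, n_ensemble, rng):
    """Sample and keep one (w, b) pair per ensemble member."""
    return [(rng.normal(0, 1, size=(n_feature, n_hidden)),
             rng.uniform(-1, 1, size=n_hidden))
            for _ in range(n_ensemble)]

def transform(x, layers):
    """Apply every stored (w, b) pair followed by ReLU."""
    return [np.maximum(x @ w + b, 0.0) for w, b in layers]

# Toy shapes only; in the article these would be x_train and x_valid.
rng = np.random.default_rng(0)
layers = make_random_layers(n_feature=4, n_hidden=6, n_ensemble=3, rng=rng)
x_tr = rng.normal(size=(50, 4))
x_va = rng.normal(size=(10, 4))

k_train = transform(x_tr, layers)  # 3 matrices of shape (50, 6)
k_valid = transform(x_va, layers)  # 3 matrices of shape (10, 6), same random features
```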

For each n-by-P dataset, paired with each regularization parameter, we train one Ridge model.

# Ridge expects a non-negative alpha, so sample from [0, 1) rather than [-1, 1).
regularisation = np.random.uniform(low = 0, 
                                   high = 1, 
                                   size = n_regularisation)
p = []
for each_set in k:
    for each_alpha in regularisation:
        model = Ridge(alpha = each_alpha).fit(each_set, y_train)
        p.append(model.predict(each_set))

Finally, we average the predictions for each sample to get the final answer.

p = np.array(p).T
p = np.mean(p, axis = 1)
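
The code above predicts only on the training data, and the imported `mse` and the validation split go unused. The sketch below shows one way the held-out error could be measured: the same random weights are applied to the training and validation features, and the averaged prediction is scored with `mse`. The synthetic data, sizes, and alpha values are assumptions for illustration and say nothing about results on YearPredictionMSD.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error as mse

# Hypothetical toy regression problem standing in for the real dataset.
rng = np.random.default_rng(0)
x_all = rng.normal(size=(300, 5))
y_all = x_all[:, 0] ** 2 + 0.1 * rng.normal(size=300)
x_tr, y_tr = x_all[:250], y_all[:250]
x_va, y_va = x_all[250:], y_all[250:]

preds = []
for _ in range(5):                           # K = 5 random feature sets
    w = rng.normal(0, 1, size=(5, 32))
    b = rng.uniform(-1, 1, size=32)
    z_tr = np.maximum(x_tr @ w + b, 0.0)     # training features
    z_va = np.maximum(x_va @ w + b, 0.0)     # validation features, same w and b
    for alpha in (0.1, 1.0):                 # L = 2 regularization values
        model = Ridge(alpha=alpha).fit(z_tr, y_tr)
        preds.append(model.predict(z_va))

p_valid = np.mean(np.array(preds).T, axis=1)  # average over the L * K models
valid_mse = mse(y_va, p_valid)
print(p_valid.shape, valid_mse)
```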

And that completes the model!

References

Didisheim A., Kelly B., Malamud S. (2022). Deep Regression Ensembles.

About the Author

Chia-Hao Li received the M.S. degree in computer science from Durham University, United Kingdom. He works on computer algorithms, machine learning, and hardware/software codesign. He was formerly a senior engineer at Mediatek, Taiwan. His current research topic is the application of machine learning techniques to fault detection in high-performance computing systems.
