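To really dig into AI and open up the black box, we first need a thorough understanding of the underlying principles; only then can we go further and fine-tune. This article uses numpy and sympy for the calculations and matplotlib for data visualization. The model and data used here are available to everyone: download link
Without further ado, let's get straight to the implementation.
Integral table and data visualization
The examples below are presented as a Jupyter notebook.
First, install the required libraries.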
In [1]:
!pip install pandas matplotlib numpy sympy
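Next, the accompanying files include a CSV of random numbers, which we can use to practise computing variance. First, let's calculate the variance of those random numbers.
Variance
Create a variable rand_num and read the CSV file with pandas.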
In [2]:
import pandas as pd
rand_num = pd.read_csv('隨機亂數.csv')
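Compute the variance using the variance formula.
In [3]:
# Mean of the column (a one-element Series, since the CSV has a single column)
mean = rand_num.mean()
# Accumulate the squared deviations from the mean
# (named total rather than sum, so the built-in sum() is not shadowed)
total = 0
for i in rand_num['28']:
    total += (i - mean) ** 2
# Divide by 29, the count used in the original notebook, to get the variance
total /= 29
total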
Out[3]:
28 180.939358 dtype: float64
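As a cross-check on the loop above, the same quantity can be computed with built-in routines. The sketch below is only an illustration under stated assumptions: it uses a small made-up series instead of the CSV (whose contents are not shown here), and the names `values`, `manual`, `pop_var` and `sample_var` are hypothetical. It also shows the difference between dividing by n (population variance, `ddof=0`) and by n - 1 (sample variance, `ddof=1`, which is the pandas default).

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only; these are NOT the values in the CSV
values = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Manual population variance: mean of the squared deviations from the mean
mean = values.mean()
manual = ((values - mean) ** 2).sum() / len(values)

# numpy divides by n by default (ddof=0)
pop_var = np.var(values.to_numpy())
# pandas divides by n - 1 by default (ddof=1)
sample_var = values.var()

print('manual    ', manual)
print('population', pop_var)
print('sample    ', sample_var)
```

The column name '28' in the cell above suggests that the first number in the file may have been read as a header. If that is the case, `pd.read_csv(path, header=None)` would keep that value as data.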
Next, we take f(x) = x^3 - 4x^2 + 2 as an example and build an integral table for it. Its graph can be drawn with matplotlib.
Integral chart
In [4]:
# Import the matplotlib.pyplot and numpy modules
import matplotlib.pyplot as plt
import numpy as np
# Define the function f(x) = x^3 - 4x^2 + 2
def f(x):
    return x ** 3 - 4 * x ** 2 + 2
# Generate the range of x values
x = np.linspace(-6, 6, 1500)
# Plot the curve of f(x)
plt.plot(x, f(x), label='f(x)')
# Add a horizontal dashed line at y = 0
plt.axhline(color='black', linestyle='--')
# Add a vertical dashed line at x = 0
plt.axvline(color='black', linestyle='--')
# Add grid lines
plt.grid(color='gray', linewidth=1)
# Show the legend
plt.legend()
# Show the figure
plt.show()
Now we can build the integral table.
In [5]:
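# Import the sympy and pandas modules
import sympy as sy
import pandas as pd
# Build an 11 x 11 DataFrame indexed by the integer limits, pre-filled with the placeholder -1
# (dtype=object so the cells can hold exact sympy fractions)
limits = range(-5, 6)
df = pd.DataFrame(-1, index=limits, columns=limits, dtype=object)
# Define the function f(x) = x^3 - 4x^2 + 2
def f(x):
    return x ** 3 - 4 * x ** 2 + 2
# Create the symbol x
x = sy.Symbol('x')
# Use a nested loop to compute each definite integral and store it in the DataFrame
for i in limits:
    for j in limits:
        # Integrate only when the lower limit is <= the upper limit;
        # the remaining cells keep the placeholder -1
        if i <= j:
            df.loc[i, j] = sy.integrate(f(x), (x, i, j))
# Display the DataFrame
df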
Out[5]:
lower \ upper | -5 | -4 | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|---|---|---|---|---|
-5 | 0 | -2059/12 | -788/3 | -1209/4 | -940/3 | -3755/12 | -312 | -3787/12 | -968/3 | -1305/4 | -940/3 |
-4 | -1 | 0 | -1093/12 | -392/3 | -567/4 | -424/3 | -1685/12 | -144 | -1813/12 | -464/3 | -567/4 |
-3 | -1 | -1 | 0 | -475/12 | -152/3 | -201/4 | -148/3 | -635/12 | -60 | -763/12 | -152/3 |
-2 | -1 | -1 | -1 | 0 | -133/12 | -32/3 | -39/4 | -40/3 | -245/12 | -24 | -133/12 |
-1 | -1 | -1 | -1 | -1 | 0 | 5/12 | 4/3 | -9/4 | -28/3 | -155/12 | 0 |
0 | -1 | -1 | -1 | -1 | -1 | 0 | 11/12 | -8/3 | -39/4 | -40/3 | -5/12 |
1 | -1 | -1 | -1 | -1 | -1 | -1 | 0 | -43/12 | -32/3 | -57/4 | -4/3 |
2 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 0 | -85/12 | -32/3 | 9/4 |
3 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 0 | -43/12 | 28/3 |
4 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 0 | 155/12 |
5 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 0 |
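As you can see, the table shows the integral of this function between any two integer limits at a glance: each row is a lower limit, each column is an upper limit, and -1 is only a placeholder for cells where the lower limit is greater than the upper limit.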
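Individual entries of the table can be checked by hand with the fundamental theorem of calculus: the integral from a to b equals F(b) - F(a), where F is an antiderivative of f. The sketch below does this with sympy; the helper name `definite` is hypothetical and not part of the notebook. It also illustrates that swapping the limits only flips the sign, so the cells that hold the placeholder could, as one possible alternative, be filled with the negated value instead.

```python
import sympy as sy

x = sy.Symbol('x')
f = x**3 - 4*x**2 + 2

# Antiderivative F(x) = x**4/4 - 4*x**3/3 + 2*x (constant of integration omitted)
F = sy.integrate(f, x)

def definite(a, b):
    """Integral of f from a to b, computed as F(b) - F(a)."""
    return F.subs(x, b) - F.subs(x, a)

# Entry for lower limit -5, upper limit -4
print(definite(-5, -4))
# Entry for lower limit 0, upper limit 1
print(definite(0, 1))
# Swapping the limits gives the same value with the opposite sign
print(definite(1, 0))
```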
Gradient descent: learning rate and momentum
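Simple gradient descent
Below is simple gradient descent on the function x ** 2. We can see that it does not get all the way down to the lowest point.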
In [14]:
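# Import the numpy and matplotlib.pyplot modules
import numpy as np
import matplotlib.pyplot as plt
# Objective function
def func(x):
    return x ** 2
# Derivative of the objective function
def dfunc(x):
    return 2 * x
def GD(x_start, df, epochs, lr):
    # Array that stores the value of x after every iteration
    xs = np.zeros(epochs + 1)
    # Set the starting point
    x = x_start
    # Store the starting point in xs
    xs[0] = x
    # Start iterating
    for i in range(epochs):
        # Gradient descent update rule: step against the derivative
        x += -df(x) * lr
        # Store the updated x in xs
        xs[i + 1] = x
    # Return the value of x after every iteration
    return xs
# Set the starting point, number of iterations and learning rate
x_start = 2
epochs = 100
learning_rate = 0.01
# Run gradient descent
x = GD(x_start, dfunc, epochs, learning_rate)
# 'min_x' is the function value at the final x; 'gradient' is the negative derivative there
print('final_x', x[-1], 'min_x', func(x[-1]), 'gradient', -dfunc(x[-1]))
# Plot the function and the gradient descent path
t = np.arange(-3, 3, 0.01)
plt.plot(t, func(t), c='b') # objective function curve
plt.plot(x, func(x), c='r', label='lr{}'.format(learning_rate)) # gradient descent path
plt.scatter(x, func(x), c='r') # points visited by gradient descent
plt.scatter(x[-1], func(x[-1]), c='g') # final point
plt.legend() # show the legend
plt.show() # show the figure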
final_x 0.2652391117895063 min_x 0.07035178642288623 gradient -0.5304782235790126
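In general, the gradient should be close to 0 at the minimum, but here we have not reached the lowest point yet. In this situation we can usually adjust the learning rate (the step size) and the number of iterations (epochs). For example, changing the learning rate to 0.02 and epochs to 200 gives the following result.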
In [15]:
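# Same starting point, with twice the learning rate and twice the iterations
x_start = 2
epochs = 200
learning_rate = 0.02
# Run gradient descent again (GD, func and dfunc come from the previous cell)
x = GD(x_start, dfunc, epochs, learning_rate)
print('final_x', x[-1], 'min_x', func(x[-1]), 'gradient', -dfunc(x[-1]))
# Plot the function and the gradient descent path
t = np.arange(-3, 3, 0.01)
plt.plot(t, func(t), c='b') # objective function curve
plt.plot(x, func(x), c='r', label='lr{}'.format(learning_rate)) # gradient descent path
plt.scatter(x, func(x), c='r') # points visited by gradient descent
plt.scatter(x[-1], func(x[-1]), c='g') # final point
plt.legend() # show the legend
plt.show() # show the figure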
final_x 0.0005692153505391552 min_x 3.240061152894133e-07 gradient -0.0011384307010783104
This run is clearly much better: the green point is almost exactly at the lowest point.
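Why the two settings behave so differently can be worked out by hand. For f(x) = x^2 the update is x ← x - lr * 2x = (1 - 2 * lr) * x, so after n steps x_n = x_0 * (1 - 2 * lr)^n. The sketch below compares this closed form with an explicit loop; the function names `gd_closed_form` and `gd_loop` are hypothetical helpers, not part of the notebook.

```python
def gd_closed_form(x_start, lr, epochs):
    """Position after `epochs` gradient descent steps on f(x) = x**2."""
    return x_start * (1 - 2 * lr) ** epochs

def gd_loop(x_start, lr, epochs):
    """Same quantity, computed by applying the update rule step by step."""
    x = x_start
    for _ in range(epochs):
        x -= lr * 2 * x
    return x

# The two (learning rate, epochs) pairs used in the cells above
for lr, epochs in [(0.01, 100), (0.02, 200)]:
    print(lr, epochs, gd_closed_form(2, lr, epochs), gd_loop(2, lr, epochs))
```

Each step shrinks x by the factor (1 - 2 * lr), so a larger learning rate (as long as the factor stays between -1 and 1) and more iterations both bring the final point closer to 0.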
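Momentum and local minima
Sometimes we run into an awkward curve on which gradient descent gets stuck in a local minimum instead of reaching the lowest point of the whole graph. In that case we use momentum to keep it from getting stuck. An example is the curve x^4 + x^3 - 5x^2 + 2x + 1.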
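Before running gradient descent on this curve, it can help to locate its stationary points directly. The sketch below is one possible way to do that with `numpy.poly1d`: it differentiates the polynomial, finds the roots of the derivative, and classifies each root by the sign of the second derivative. The variable names are assumptions made for this illustration.

```python
import numpy as np

# f(x) = x**4 + x**3 - 5*x**2 + 2*x + 1, coefficients from the highest degree down
f = np.poly1d([1, 1, -5, 2, 1])
df = f.deriv()    # first derivative: 4x^3 + 3x^2 - 10x + 2
d2f = df.deriv()  # second derivative: 12x^2 + 6x - 10

# Stationary points are the real roots of the first derivative
critical = np.sort(df.roots.real)
for c in critical:
    kind = 'minimum' if d2f(c) > 0 else 'maximum'
    print(f'x = {c:.4f}  f(x) = {f(c):.4f}  local {kind}')
```

The curve has two minima separated by a small hump; the left one is lower, so a descent that starts to the right of the hump has to cross it to reach the lowest point.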
In [18]:
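# Import the numpy and matplotlib.pyplot modules
import numpy as np
import matplotlib.pyplot as plt
# Objective function
def func(x):
    return x ** 4 + x ** 3 - 5 * x ** 2 + 2 * x + 1
# Derivative of the objective function: 4x^3 + 3x^2 - 10x + 2
def dfunc(x):
    return 4 * x ** 3 + 3 * x ** 2 - 10 * x + 2
def GD(x_start, df, epochs, lr):
    # Array that stores the value of x after every iteration
    xs = np.zeros(epochs + 1)
    # Set the starting point
    x = x_start
    # Store the starting point in xs
    xs[0] = x
    # Start iterating
    for i in range(epochs):
        # Gradient descent update rule: step against the derivative
        x += -df(x) * lr
        # Store the updated x in xs
        xs[i + 1] = x
    # Return the value of x after every iteration
    return xs
# Set the starting point, number of iterations and learning rate
x_start = 2
epochs = 100
learning_rate = 0.01
# Run gradient descent
x = GD(x_start, dfunc, epochs, learning_rate)
# 'min_x' is the function value at the final x; 'gradient' is the negative derivative there
print('final_x', x[-1], 'min_x', func(x[-1]), 'gradient', -dfunc(x[-1]))
# Plot the function and the gradient descent path
t = np.arange(-3, 3, 0.01)
plt.plot(t, func(t), c='b') # objective function curve
plt.plot(x, func(x), c='r', label='lr{}'.format(learning_rate)) # gradient descent path
plt.scatter(x, func(x), c='r') # points visited by gradient descent
plt.scatter(x[-1], func(x[-1]), c='g') # final point
plt.legend() # show the legend
plt.show() # show the figure
With the derivative written as above, the run settles near x ≈ 1.10, where f(x) ≈ -0.055 and the gradient is close to 0.
As you can see, plain gradient descent gets stuck there: it is only a local minimum, not the lowest point of the curve, so we switch to gradient descent with momentum.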
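A derivative typed by hand is easy to get subtly wrong, and gradient descent will then quietly minimise a different curve. A common safeguard is to compare the hand-written derivative with a finite-difference estimate at a few points. The sketch below assumes the derivative 4x^3 + 3x^2 - 10x + 2, obtained by differentiating the polynomial term by term; `numerical_derivative` is a hypothetical helper, not part of the notebook.

```python
def numerical_derivative(func, x, h=1e-5):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

def func(x):
    return x ** 4 + x ** 3 - 5 * x ** 2 + 2 * x + 1

def dfunc(x):
    # Term-by-term derivative of func
    return 4 * x ** 3 + 3 * x ** 2 - 10 * x + 2

# Compare the analytic and numerical derivatives at a few sample points
for x0 in [-2.0, -0.5, 0.0, 1.0, 2.0]:
    exact = dfunc(x0)
    approx = numerical_derivative(func, x0)
    print(x0, exact, approx, abs(exact - approx))
```

If the two columns disagree by more than a tiny rounding error, the hand-written derivative is the first thing to re-check.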
In [42]:
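# Import the numpy and matplotlib.pyplot modules
import numpy as np
import matplotlib.pyplot as plt
# Objective function
def func(x):
    return x ** 4 + x ** 3 - 5 * x ** 2 + 2 * x + 1
# Derivative of the objective function: 4x^3 + 3x^2 - 10x + 2
def dfunc(x):
    return 4 * x ** 3 + 3 * x ** 2 - 10 * x + 2
def GD(x_start, df, epochs, lr, Momentum):
    # Array that stores the value of x after every iteration
    xs = np.zeros(epochs + 1)
    # Set the starting point
    x = x_start
    # Store the starting point in xs
    xs[0] = x
    # The previous step; it starts at 0 because nothing has moved yet
    change = 0
    # Start iterating
    for i in range(epochs):
        # New step = gradient step + a fraction (Momentum) of the previous step
        new_change = -df(x) * lr + Momentum * change
        # Update x with the combined step
        x += new_change
        # Remember this step for the next iteration
        change = new_change
        # Store the updated x in xs
        xs[i + 1] = x
    # Return the value of x after every iteration
    return xs
# Set the starting point, number of iterations, learning rate and momentum
x_start = 2
epochs = 100
learning_rate = 0.01
Momentum = 0.9
# Run gradient descent with momentum
x = GD(x_start, dfunc, epochs, learning_rate, Momentum)
# 'min_x' is the function value at the final x; 'gradient' is the negative derivative there
print('final_x', x[-1], 'min_x', func(x[-1]), 'gradient', -dfunc(x[-1]))
# Plot the function and the gradient descent path
t = np.arange(-3, 3, 0.01)
plt.plot(t, func(t), c='b') # objective function curve
plt.plot(x, func(x), c='r', label='lr{}'.format(learning_rate)) # gradient descent path
plt.scatter(x, func(x), c='r') # points visited by gradient descent
plt.scatter(x[-1], func(x[-1]), c='g') # final point
plt.legend() # show the legend
plt.show() # show the figure
With these settings the path rolls past the local minimum near x ≈ 1.10 and ends close to x ≈ -2.07, where f(x) ≈ -15.07, the lowest point of the curve.
The result is a success: momentum carried the descent across the local minimum.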
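To isolate the effect of the momentum term, the same descent can be run twice from the same starting point, once with momentum 0 (plain gradient descent) and once with momentum 0.9. The sketch below is a minimal, plot-free version under stated assumptions: it uses the term-by-term derivative 4x^3 + 3x^2 - 10x + 2, runs 200 iterations so that both runs have time to settle, and the function name `descend` is hypothetical.

```python
def dfunc(x):
    # Derivative of x**4 + x**3 - 5*x**2 + 2*x + 1
    return 4 * x ** 3 + 3 * x ** 2 - 10 * x + 2

def descend(x_start, lr, epochs, momentum):
    """Gradient descent with a velocity term; momentum=0 gives plain descent."""
    x, velocity = float(x_start), 0.0
    for _ in range(epochs):
        # Keep a fraction of the previous step and add the new gradient step
        velocity = momentum * velocity - lr * dfunc(x)
        x += velocity
    return x

# Same start, learning rate and iteration count; only the momentum differs
for momentum in (0.0, 0.9):
    print(momentum, descend(2, lr=0.01, epochs=200, momentum=momentum))
```

Without momentum the run stays in the right-hand basin; with momentum 0.9 the accumulated velocity is enough to carry it over the hump and into the deeper basin on the left.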
That's all for this chapter of the tutorial. I hope you found it useful!