
Expectation, Variance, and Covariance

news 2025/7/6 0:09:53

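Preface

This article is part of the column 《机器学习数学通关指南》 (a guide to the mathematics of machine learning), which is the author's original work. Please credit the source when quoting it, and point out any shortcomings or errors in the comments. Thank you!

For the column's table of contents and references, see 《机器学习数学通关指南》.

Main Text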


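Introduction 🌟

Probability and statistics sit at the core of the mathematics behind machine learning, and expectation, variance, and covariance are the key quantities for describing data distributions and model behavior. This article works through these concepts and their practical uses in machine learning, to help you understand the mathematics behind the models.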

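1. Mathematical Expectation 📊

1.1 Definition and Essence 📝

The mathematical expectation is the probability-weighted average of a random variable; it describes the central tendency of its distribution.

Discrete random variable:
$E(X) = \sum_{i} x_i P(X=x_i)$

Continuous random variable:
$E(X) = \int_{-\infty}^{\infty} x f(x) dx$

where $f(x)$ is the probability density function.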
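
The discrete definition can be checked numerically. The following is a minimal sketch, assuming a fair six-sided die as the random variable (an illustrative choice, not taken from the article): it evaluates the sum from the definition and compares it with the mean of simulated rolls.

```python
import numpy as np

# Discrete case: a fair six-sided die (illustrative assumption)
values = np.arange(1, 7)
probs = np.full(6, 1 / 6)
expectation = np.sum(values * probs)  # E(X) = sum_i x_i P(X = x_i)

# Monte Carlo check: the sample mean approaches E(X) as the sample grows
rng = np.random.default_rng(0)
samples = rng.choice(values, size=100_000, p=probs)
sample_mean = samples.mean()

print(f"E(X) from the definition: {expectation:.4f}")
print(f"Mean of 100000 simulated rolls: {sample_mean:.4f}")
```

The two numbers should agree to roughly two decimal places; the gap shrinks as the number of rolls grows.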

1.2 Key Properties 🔑

Linearity: $E(aX + bY) = aE(X) + bE(Y)$, even when $X$ and $Y$ are not independent
Product rule under independence: if $X$ and $Y$ are independent, then $E(XY) = E(X)E(Y)$
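
A small simulation can make the difference between the two properties concrete. This is only a sketch with made-up distributions: `y` is built from `x` (so the two are dependent) while `z` is drawn independently of `x`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(1.0, 2.0, size=n)
y = x + rng.normal(0.0, 1.0, size=n)   # Y depends on X
z = rng.normal(3.0, 1.0, size=n)       # Z is independent of X
a, b = 2.0, -3.0

# Linearity holds even for the dependent pair (X, Y)
lhs = np.mean(a * x + b * y)
rhs = a * np.mean(x) + b * np.mean(y)

# The product rule holds for the independent pair, but not for the dependent one
gap_independent = np.mean(x * z) - np.mean(x) * np.mean(z)
gap_dependent = np.mean(x * y) - np.mean(x) * np.mean(y)

print(f"E(aX+bY) = {lhs:.4f}, aE(X)+bE(Y) = {rhs:.4f}")
print(f"E(XZ) - E(X)E(Z) = {gap_independent:.4f}  (independent)")
print(f"E(XY) - E(X)E(Y) = {gap_dependent:.4f}  (dependent)")
```

The gap for the dependent pair is exactly the covariance of $X$ and $Y$, which is why it does not vanish.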

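1.3 Applications in Machine Learning 🤖

Loss function optimization: many loss functions are expectations in disguise. Mean squared error (MSE), for example, is the sample estimate of the expected squared difference between predictions and true values
Model evaluation metrics: accuracy is the expectation of the indicator "the prediction is correct"; precision and recall are expectations of the same kind of indicator, conditioned on the predicted class or the true class
Gradient descent: batch gradient descent (BGD) uses the gradient averaged over all training samples, while the gradient that stochastic gradient descent (SGD) computes on a uniformly sampled example is an unbiased estimate of that average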

# Expectation example - batch gradient descent for linear regression
import numpy as np

# Batch gradient descent - every step uses the gradient averaged over all samples
def batch_gradient_descent(X, y, theta, alpha, iterations):
    m = len(y)
    cost_history = np.zeros(iterations)
    
    for i in range(iterations):
        # Average gradient of the squared-error cost over all samples
        gradient = (1/m) * X.T.dot(X.dot(theta) - y)
        # Update the parameters
        theta = theta - alpha * gradient
        # Record the cost after the update
        cost_history[i] = (1/(2*m)) * np.sum(np.square(X.dot(theta) - y))
        
    return theta, cost_history
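
The claim that a stochastic gradient is an unbiased estimate of the full-batch gradient can be illustrated as follows. This is a sketch on synthetic linear-regression data; the helper names `full_gradient` and `minibatch_gradient`, the batch size of 32, and the data itself are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 1000, 3
X = rng.normal(size=(m, d))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta + rng.normal(0, 0.1, size=m)
theta = np.zeros(d)

def full_gradient(X, y, theta):
    # Gradient of (1/2m) * ||X theta - y||^2, averaged over all rows
    return X.T @ (X @ theta - y) / len(y)

def minibatch_gradient(X, y, theta, batch_size, rng):
    # Rows are drawn uniformly with replacement, so the expectation
    # of this gradient equals the full-batch gradient
    idx = rng.integers(0, len(y), size=batch_size)
    return full_gradient(X[idx], y[idx], theta)

g_full = full_gradient(X, y, theta)
g_avg = np.mean(
    [minibatch_gradient(X, y, theta, 32, rng) for _ in range(5000)], axis=0
)

print("full-batch gradient:        ", np.round(g_full, 3))
print("average of 5000 minibatches:", np.round(g_avg, 3))
```

A single minibatch gradient is noisy, but its average over many draws lands on the full-batch gradient, which is what "unbiased" means here.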

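2. Variance 📈

2.1 Definition and Essence 📝

Variance measures how widely the values of a random variable are spread around their mean:

$D(X) = E\left[ (X - E(X))^2 \right] = E(X^2) - [E(X)]^2$

The standard deviation is the square root of the variance, $\sigma(X) = \sqrt{D(X)}$, and has the same units as the original data.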

2.2 Key Properties 🔑

Translation invariance: $D(X + a) = D(X)$, where $a$ is a constant
Scaling: $D(aX) = a^2 D(X)$
Addition rule: if $X$ and $Y$ are independent, then $D(X + Y) = D(X) + D(Y)$
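
These properties, together with the shortcut $D(X) = E(X^2) - [E(X)]^2$, can be verified on simulated data. The sketch below uses arbitrary distributions chosen for illustration; the first three identities hold exactly for sample variances, while the addition rule holds only approximately because the sample covariance of two independent draws is small but not zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.exponential(2.0, size=n)
y = rng.normal(0.0, 3.0, size=n)   # drawn independently of x
a = 5.0

var_x = np.var(x)
shortcut = np.mean(x**2) - np.mean(x)**2   # E(X^2) - [E(X)]^2
var_shift = np.var(x + a)                  # D(X + a)
var_scale = np.var(a * x)                  # D(aX)
var_sum = np.var(x + y)                    # D(X + Y)

print(f"D(X) = {var_x:.4f}, shortcut = {shortcut:.4f}")
print(f"D(X + a) = {var_shift:.4f}")
print(f"D(aX) = {var_scale:.4f}, a^2 D(X) = {a**2 * var_x:.4f}")
print(f"D(X + Y) = {var_sum:.4f}, D(X) + D(Y) = {var_x + np.var(y):.4f}")
```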

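2.3 Applications in Machine Learning 🤖

Feature selection: low-variance features usually carry little information and can be filtered out with a variance threshold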

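# Feature selection with a variance threshold
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Illustrative data (an assumption; any numeric feature matrix works): the second column is constant
X = np.array([[0.0, 1.0, 2.1], [1.0, 1.0, 0.3], [2.0, 1.0, 1.7], [3.0, 1.0, 0.9]])

# Create the variance-threshold selector
selector = VarianceThreshold(threshold=0.1)  # remove features whose variance is below 0.1
X_selected = selector.fit_transform(X)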

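Bias-variance trade-off: balancing model complexity against generalization
- Low-complexity models: low variance but high bias (underfitting)
- High-complexity models: low bias but high variance (overfitting)
Ensemble learning: combining several models reduces variance (a random forest, for example, reduces the variance of individual decision trees)
Early stopping: halts training before the model starts to overfit, which keeps variance from growing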
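
Why averaging reduces variance can be seen in an idealized setting. The sketch below assumes the ensemble members are independent and unbiased, each with variance $\sigma^2$, so the average of $n$ members has variance $\sigma^2 / n$. Real ensemble members (the trees of a random forest, for instance) are correlated, so the reduction in practice is smaller than shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_models, sigma = 20_000, 10, 1.0

# Each row holds the predictions of n_models independent, unbiased "models"
# for the same target value 0 (idealized assumption)
predictions = rng.normal(loc=0.0, scale=sigma, size=(n_trials, n_models))

single_var = predictions[:, 0].var()           # variance of one model
ensemble_var = predictions.mean(axis=1).var()  # variance of the averaged prediction

print(f"single model variance:   {single_var:.4f}")
print(f"ensemble (n=10) variance: {ensemble_var:.4f}")
print(f"sigma^2 / n:              {sigma**2 / n_models:.4f}")
```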

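3. Covariance 🔄

3.1 Definition and Essence 📝

Covariance describes how two random variables vary together and measures the direction of their linear relationship:

$\text{Cov}(X, Y) = E\left[ (X - E(X))(Y - E(Y)) \right] = E(XY) - E(X)E(Y)$

What the sign of the covariance means:

$\text{Cov}(X, Y) > 0$: $X$ and $Y$ tend to move in the same direction
$\text{Cov}(X, Y) < 0$: $X$ and $Y$ tend to move in opposite directions
$\text{Cov}(X, Y) = 0$: $X$ and $Y$ are linearly uncorrelated (a nonlinear relationship may still exist)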
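
The last case deserves an example, since zero covariance is often mistaken for independence. In this sketch (an illustrative construction, not from the article) $Y = X^2$ is completely determined by $X$, yet for a symmetric $X$ the covariance is $E(X^3) = 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=200_000)
y = x**2   # Y is a deterministic (nonlinear) function of X

cov_xy = np.cov(x, y)[0, 1]
corr_xy = np.corrcoef(x, y)[0, 1]

# The dependence shows up once the relationship is made monotone
corr_abs = np.corrcoef(np.abs(x), y)[0, 1]

print(f"Cov(X, X^2)   = {cov_xy:.4f}")
print(f"corr(X, X^2)  = {corr_xy:.4f}")
print(f"corr(|X|, X^2) = {corr_abs:.4f}")
```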

3.2 Properties of Covariance 🔑

Symmetry: $\text{Cov}(X, Y) = \text{Cov}(Y, X)$
Linearity: $\text{Cov}(aX + b, cY + d) = ac \cdot \text{Cov}(X, Y)$
Relation to the variance of a sum: $D(X \pm Y) = D(X) + D(Y) \pm 2\text{Cov}(X, Y)$
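
The identities $\text{Cov}(aX + b, cY + d) = ac \cdot \text{Cov}(X, Y)$ and $D(X \pm Y) = D(X) + D(Y) \pm 2\text{Cov}(X, Y)$ hold exactly for sample statistics as long as the same normalization is used throughout. The sketch below passes `ddof=0` to `np.cov` so that it matches the default of `np.var`; the data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=10_000)
y = 0.6 * x + rng.normal(0.0, 0.5, size=10_000)   # correlated with x

C = np.cov(x, y, ddof=0)   # divide by n, matching np.var's default
var_x, var_y, cov_xy = C[0, 0], C[1, 1], C[0, 1]

var_sum = np.var(x + y)
var_diff = np.var(x - y)

a, b, c, d = 2.0, 1.0, -3.0, 4.0
cov_lin = np.cov(a * x + b, c * y + d, ddof=0)[0, 1]

print(f"D(X+Y) = {var_sum:.4f} vs {var_x + var_y + 2 * cov_xy:.4f}")
print(f"D(X-Y) = {var_diff:.4f} vs {var_x + var_y - 2 * cov_xy:.4f}")
print(f"Cov(aX+b, cY+d) = {cov_lin:.4f} vs {a * c * cov_xy:.4f}")
```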

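3.3 Correlation Coefficient 📏

The correlation coefficient is the standardized covariance and takes values in $[-1, 1]$:

$\rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sqrt{D(X)} \sqrt{D(Y)}}$

Characteristics:

$\rho_{X,Y} \in [-1, 1]$, with the effect of units removed
$|\rho_{X,Y}| = 1$ if and only if $X$ and $Y$ satisfy an exact linear relationship $Y = aX + b$ with $a \neq 0$ (with probability 1)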

# Covariance and correlation coefficient example
import numpy as np

# Generate correlated data
np.random.seed(42)
n = 100
# Positively correlated data
x1 = np.random.normal(0, 1, n)
y1 = 0.8 * x1 + np.random.normal(0, 0.5, n)
# Negatively correlated data
x2 = np.random.normal(0, 1, n)
y2 = -0.8 * x2 + np.random.normal(0, 0.5, n)
# Uncorrelated data
x3 = np.random.normal(0, 1, n)
y3 = np.random.normal(0, 1, n)

# Compute the covariance and the correlation coefficient
# (np.cov and np.corrcoef return 2x2 matrices; [0, 1] is the off-diagonal entry)
cov_pos = np.cov(x1, y1)[0, 1]
corr_pos = np.corrcoef(x1, y1)[0, 1]
cov_neg = np.cov(x2, y2)[0, 1]
corr_neg = np.corrcoef(x2, y2)[0, 1]
cov_none = np.cov(x3, y3)[0, 1]
corr_none = np.corrcoef(x3, y3)[0, 1]

print(f"Positive correlation - covariance: {cov_pos:.4f}, correlation: {corr_pos:.4f}")
print(f"Negative correlation - covariance: {cov_neg:.4f}, correlation: {corr_neg:.4f}")
print(f"No correlation - covariance: {cov_none:.4f}, correlation: {corr_none:.4f}")

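3.4 Applications in Machine Learning 🤖

Feature engineering:
- Identifying redundant features: highly correlated features may carry duplicate information
- Multicollinearity detection: strongly correlated predictors make the coefficients of linear models unstable
Dimensionality reduction: principal component analysis (PCA), for example, is built on the covariance matrix of the features
Cluster analysis: clustering algorithms based on the Mahalanobis distance, which accounts for the covariance between features
Portfolio optimization: minimizing risk by exploiting the covariances between assets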
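
As a sketch of the redundant-feature check, the snippet below builds three synthetic features, one of which is nearly a rescaled copy of another, and lists the feature pairs whose absolute correlation exceeds a cut-off. The cut-off of 0.9 and the feature construction are assumptions made for the example; a suitable threshold depends on the task.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
f0 = rng.normal(size=n)
f1 = 2.0 * f0 + rng.normal(0, 0.1, size=n)   # nearly a rescaled copy of f0
f2 = rng.normal(size=n)                      # unrelated feature
X = np.column_stack([f0, f1, f2])

corr = np.corrcoef(X, rowvar=False)   # columns are features
threshold = 0.9                       # hypothetical cut-off

# Look only above the diagonal so each pair is reported once
rows, cols = np.where(np.triu(np.abs(corr) > threshold, k=1))
redundant_pairs = [(int(i), int(j)) for i, j in zip(rows, cols)]

print(np.round(corr, 3))
print("highly correlated feature pairs:", redundant_pairs)
```

Dropping one feature from each reported pair is a common first step before fitting a linear model.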

4. Covariance Matrix 📊

4.1 Definition and Construction 📝

For a multivariate random variable $\mathbf{X} = (X_1, X_2, \dots, X_n)^T$, the covariance matrix is:

$\mathbf{\Sigma} = \begin{bmatrix} \text{Var}(X_1) & \text{Cov}(X_1, X_2) & \cdots & \text{Cov}(X_1, X_n) \\ \text{Cov}(X_2, X_1) & \text{Var}(X_2) & \cdots & \text{Cov}(X_2, X_n) \\ \vdots & \vdots & \ddots & \vdots \\ \text{Cov}(X_n, X_1) & \text{Cov}(X_n, X_2) & \cdots & \text{Var}(X_n) \end{bmatrix}$

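Main diagonal: the variance of each variable, $\text{Var}(X_i)$
Off-diagonal entries: the covariances between variables, $\text{Cov}(X_i, X_j)$

4.2 Properties and Characteristics 🔑

Symmetry: $\mathbf{\Sigma} = \mathbf{\Sigma}^T$
Positive semi-definiteness: for any vector $\mathbf{x}$, $\mathbf{x}^T\mathbf{\Sigma}\mathbf{x} \geq 0$
Eigenvalues: each eigenvalue is the variance of the data along the direction of the corresponding eigenvector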
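
All three properties can be checked on a sample covariance matrix. The following sketch uses synthetic three-dimensional data mixed by a random matrix `A` (an assumption for the example) and `np.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
X = rng.normal(size=(1000, 3)) @ A.T   # correlated samples, one row per sample

sigma = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(sigma)

is_symmetric = np.allclose(sigma, sigma.T)
min_eig = eigvals.min()   # non-negative up to rounding for a PSD matrix

# Variance of the data projected onto the leading eigenvector
v = eigvecs[:, -1]
proj_var = np.var(X @ v, ddof=1)

print("symmetric:", is_symmetric)
print("eigenvalues:", np.round(eigvals, 4))
print(f"variance along leading eigenvector: {proj_var:.4f}")
```

The projected variance equals the largest eigenvalue, which is exactly the quantity PCA maximizes when it picks the first principal component.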

4.3 Applications in Machine Learning 🤖

Principal component analysis (PCA):

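# PCA and covariance matrix example
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
import numpy as np

# Generate correlated data
np.random.seed(42)
n_samples = 300
# Create two-dimensional data with correlated features
mean = [0, 0]
# Covariance matrix with positively correlated features;
# it must be positive semi-definite (here 2 * 1 - 1.2**2 > 0)
cov = [[2, 1.2], [1.2, 1]]
X = np.random.multivariate_normal(mean, cov, n_samples)

# Standardize the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Covariance matrix of the standardized data (close to the correlation matrix)
cov_matrix = np.cov(X_scaled.T)
print("Covariance matrix of the data:")
print(cov_matrix)

# Run PCA
pca = PCA()
X_pca = pca.fit_transform(X_scaled)

# Fraction of the variance explained by each principal component
print(f"Explained variance ratio per component: {pca.explained_variance_ratio_}")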

Mahalanobis distance: a distance measure that accounts for the correlation between features

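# Mahalanobis distance computation
from scipy.spatial.distance import mahalanobis
import numpy as np

# Data: one row per observation
X = np.array([[1, 2], [3, 4], [5, 6], [2, 7]])
mean = np.mean(X, axis=0)
cov = np.cov(X.T)
inv_cov = np.linalg.inv(cov)  # scipy expects the inverse covariance matrix

# Mahalanobis distance from the first point to the mean
d = mahalanobis(X[0], mean, inv_cov)
print(f"Mahalanobis distance: {d}")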
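
For reference, the distance can also be written out directly from its formula $d(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^T \mathbf{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})}$. The sketch below uses the same four points as the listing above and compares the result with the plain Euclidean distance to the mean.

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [2, 7]], dtype=float)
mu = X.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(X, rowvar=False))

diff = X[0] - mu
d_mahal = np.sqrt(diff @ inv_cov @ diff)   # sqrt((x - mu)^T Sigma^{-1} (x - mu))
d_euclid = np.linalg.norm(diff)            # ignores scale and correlation

print(f"Mahalanobis distance: {d_mahal:.4f}")
print(f"Euclidean distance:   {d_euclid:.4f}")
```

The Mahalanobis distance is smaller here because the first point lies along the direction in which the data varies most.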

Multivariate Gaussian modeling: the covariance matrix determines the shape of the probability density function

# Visualize 2-D Gaussian densities under different covariance matrices
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

# Build the evaluation grid
x, y = np.mgrid[-3:3:.1, -3:3:.1]
pos = np.dstack((x, y))

# Three different covariance matrices
cov1 = [[1, 0], [0, 1]]  # identity covariance (circular contours)
cov2 = [[2, 0], [0, 0.5]]  # diagonal covariance (axis-aligned ellipse)
cov3 = [[1, 0.8], [0.8, 1]]  # non-diagonal covariance (rotated ellipse)

# Frozen distributions used to evaluate the density
rv1 = multivariate_normal([0, 0], cov1)
rv2 = multivariate_normal([0, 0], cov2)
rv3 = multivariate_normal([0, 0], cov3)

# Draw filled contour plots
# (titles use plain text for the matrices: matplotlib's mathtext has no bmatrix environment)
plt.figure(figsize=(15, 5))

plt.subplot(131)
plt.contourf(x, y, rv1.pdf(pos), cmap='viridis')
plt.title('Identity covariance\n$\\Sigma$ = [[1, 0], [0, 1]]')
plt.axis('equal')

plt.subplot(132)
plt.contourf(x, y, rv2.pdf(pos), cmap='viridis')
plt.title('Diagonal covariance\n$\\Sigma$ = [[2, 0], [0, 0.5]]')
plt.axis('equal')

plt.subplot(133)
plt.contourf(x, y, rv3.pdf(pos), cmap='viridis')
plt.title('Non-diagonal covariance\n$\\Sigma$ = [[1, 0.8], [0.8, 1]]')
plt.axis('equal')

plt.tight_layout()
plt.show()

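Second-moment statistics in deep learning:
- Adaptive optimizers such as Adam scale each parameter's learning rate by a running estimate of the element-wise second moment of its gradient (an uncentered, per-parameter statistic rather than a full covariance matrix)
- A change in the mean or covariance of the input features between training and deployment is a useful signal when checking for covariate shift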
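
To make the Adam remark concrete, here is a minimal NumPy sketch of a single Adam update following the standard published update rule. The function name `adam_step`, the toy quadratic objective, and the hyperparameter values are choices made for this illustration, not code from any particular library.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; m and v are running first and second raw moments of the gradient."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2   # element-wise, uncentered second moment
    m_hat = m / (1 - beta1**t)              # bias correction for the zero initialization
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy objective f(theta) = 0.5 * sum(scale * theta^2), with gradient scale * theta
scale = np.array([100.0, 1.0])

# The first step has size ~lr in every coordinate, whatever the gradient scale
start = np.array([1.0, 1.0])
first_theta, _, _ = adam_step(start, scale * start, np.zeros(2), np.zeros(2), t=1, lr=0.01)

# Run the optimizer for a while
theta, m, v = start.copy(), np.zeros(2), np.zeros(2)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, scale * theta, m, v, t, lr=0.01)

print("after one step:   ", first_theta)
print("after 2000 steps: ", theta)
```

Dividing by the square root of the second moment is what makes the step size insensitive to the 100x difference in gradient scale between the two coordinates.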

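5. Advanced Application Cases 🚀

5.1 Expectation-Maximization (EM) Algorithm 📈

The EM algorithm estimates model parameters in the presence of latent variables and is widely used for models such as Gaussian mixtures. It alternates between an E-step, which computes the expected values of the latent variables under the current parameters, and an M-step, which re-estimates the parameters given those expectations.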

# Gaussian mixture model (GMM) fitted with the EM algorithm
from sklearn.mixture import GaussianMixture
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt

# Generate multimodal data
np.random.seed(42)
n_samples = 500

# Mix samples from two Gaussian distributions
X1 = np.random.normal(-2, 0.8, int(0.4 * n_samples))
X2 = np.random.normal(2, 1.2, int(0.6 * n_samples))
X = np.concatenate([X1, X2]).reshape(-1, 1)

# Fit the Gaussian mixture model (scikit-learn runs EM internally)
gmm = GaussianMixture(n_components=2, random_state=42)
gmm.fit(X)

# Retrieve the fitted parameters
means = gmm.means_.flatten()
variances = gmm.covariances_.flatten()
weights = gmm.weights_

print(f"Component means: {means}")
print(f"Component variances: {variances}")
print(f"Component weights: {weights}")

# Evaluate the fitted density on a grid
x = np.linspace(-6, 6, 1000).reshape(-1, 1)
log_probs = gmm.score_samples(x)
probs = np.exp(log_probs)

# Plot the raw data and the fitted density
plt.figure(figsize=(10, 6))
plt.hist(X, bins=30, density=True, alpha=0.5, label='Data histogram')
plt.plot(x, probs, 'r-', label='GMM density estimate')

# Plot each weighted component
for i, (mean, var, w) in enumerate(zip(means, variances, weights)):
    plt.plot(x, w * norm.pdf(x, mean, np.sqrt(var)), 
             '--', label=f'Component {i+1}')

plt.title('Gaussian mixture model estimated with EM')
plt.legend()
plt.grid(True)
plt.show()
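
What happens inside such a fit can be sketched by hand for the one-dimensional, two-component case. The version below is a simplified illustration, not scikit-learn's implementation: the initial guesses, the fixed 100 iterations, and the absence of a convergence test or numerical safeguards are all simplifying assumptions.

```python
import numpy as np

def normal_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

rng = np.random.default_rng(42)
data = np.concatenate([rng.normal(-2, 0.8, 200), rng.normal(2, 1.2, 300)])

# Crude initial guesses (hypothetical choices)
means = np.array([-1.0, 1.0])
variances = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])

for _ in range(100):
    # E-step: responsibility of each component for each point,
    # i.e. the expected value of the latent component indicator
    dens = weights * normal_pdf(data[:, None], means, variances)   # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: responsibility-weighted means, variances, and mixing weights
    nk = resp.sum(axis=0)
    means = (resp * data[:, None]).sum(axis=0) / nk
    variances = (resp * (data[:, None] - means) ** 2).sum(axis=0) / nk
    weights = nk / len(data)

print("means:    ", np.round(means, 3))
print("variances:", np.round(variances, 3))
print("weights:  ", np.round(weights, 3))
```

The M-step formulas are weighted versions of the sample mean and sample variance, which is where the "expectation" in the algorithm's name meets the quantities discussed in this article.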

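5.2 Variance and Regularization 🛡️

Regularization is closely tied to the concept of variance: Ridge regression, for example, controls model variance by penalizing the sum of squared weights.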

# Controlling variance with Ridge regularization
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
import numpy as np
import matplotlib.pyplot as plt

# Generate nonlinear data
np.random.seed(42)
n_samples = 50
X = np.sort(np.random.uniform(0, 1, n_samples))
y = np.sin(2 * np.pi * X) + np.random.normal(0, 0.2, n_samples)
X = X.reshape(-1, 1)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Polynomial degrees and regularization strengths to compare
degrees = [1, 4, 15]
alphas = [0, 0.001, 0.1, 10]  # alpha=0 is ordinary least squares

plt.figure(figsize=(16, 12))
for i, degree in enumerate(degrees):
    for j, alpha in enumerate(alphas):
        ax = plt.subplot(len(degrees), len(alphas), i * len(alphas) + j + 1)
        
        # Build the model pipeline
        model = make_pipeline(
            PolynomialFeatures(degree=degree),
            StandardScaler(),
            Ridge(alpha=alpha)
        )
        
        # Fit the model
        model.fit(X_train, y_train)
        
        # Predict and compute the variance of the residuals
        y_pred_train = model.predict(X_train)
        y_pred_test = model.predict(X_test)
        train_var = np.var(y_train - y_pred_train)
        test_var = np.var(y_test - y_pred_test)
        
        # Visualize
        X_plot = np.linspace(0, 1, 100).reshape(-1, 1)
        y_plot = model.predict(X_plot)
        
        plt.scatter(X_train, y_train, c='b', s=30, alpha=0.4, label='Training data')
        plt.scatter(X_test, y_test, c='g', s=30, alpha=0.4, label='Test data')
        plt.plot(X_plot, y_plot, 'r-', lw=2)
        plt.plot(X_plot, np.sin(2 * np.pi * X_plot), 'k--', lw=1, label='True function')
        
        plt.title(f"degree={degree}, alpha={alpha}\n"
                 f"train residual var={train_var:.4f}, test residual var={test_var:.4f}")
        plt.ylim(-1.5, 1.5)
        
        if i == len(degrees) - 1 and j == 0:
            plt.legend()

plt.tight_layout()
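
The variance that Ridge controls is the variability of the fitted coefficients across different training sets. The sketch below measures it directly using the closed-form solution $\mathbf{w} = (\mathbf{X}^T\mathbf{X} + \alpha \mathbf{I})^{-1}\mathbf{X}^T\mathbf{y}$ on repeatedly resampled, strongly collinear data. The helper names, the data-generating process, and the value `alpha=1.0` are assumptions made for this illustration.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge solution w = (X^T X + alpha I)^{-1} X^T y (no intercept)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
true_w = np.array([1.0, -1.0])

def coefficient_variance(alpha, n_repeats=500, n=20):
    """Total variance of the fitted weights over freshly drawn training sets."""
    ws = []
    for _ in range(n_repeats):
        x1 = rng.normal(size=n)
        x2 = x1 + rng.normal(0, 0.1, size=n)   # strongly collinear with x1
        X = np.column_stack([x1, x2])
        y = X @ true_w + rng.normal(0, 0.5, size=n)
        ws.append(ridge_fit(X, y, alpha))
    return np.var(ws, axis=0).sum()

var_ols = coefficient_variance(alpha=0.0)     # ordinary least squares
var_ridge = coefficient_variance(alpha=1.0)   # ridge penalty

print(f"coefficient variance, alpha=0: {var_ols:.4f}")
print(f"coefficient variance, alpha=1: {var_ridge:.4f}")
```

The reduction in variance is paid for with bias: the ridge coefficients are shrunk toward zero and no longer center on `true_w`.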

5.3 The Covariance Matrix in Anomaly Detection 🔍

Multivariate anomaly detection using the Mahalanobis distance and the covariance matrix.

# Anomaly detection with the Mahalanobis distance
from sklearn.covariance import EmpiricalCovariance, MinCovDet
from scipy.stats import chi2
import numpy as np
import matplotlib.pyplot as plt

# Generate data (including outliers)
np.random.seed(42)
n_samples = 200
n_outliers = 10

# Generate the normal data
X_normal = np.random.multivariate_normal(
    mean=[0, 0], 
    cov=[[1, 0.8], [0.8, 1]], 
    size=n_samples-n_outliers
)

# Generate the outliers
X_outliers = np.random.uniform(-5, 5, (n_outliers, 2))
X_outliers = X_outliers * 3 + np.random.normal(0, 0.1, X_outliers.shape)

# Combine the data
X = np.vstack([X_normal, X_outliers])

# Robust covariance estimate (mahalanobis() returns squared distances)
robust_cov = MinCovDet().fit(X)
mahal_robust = robust_cov.mahalanobis(X)

# Empirical covariance estimate
emp_cov = EmpiricalCovariance().fit(X)
mahal_emp = emp_cov.mahalanobis(X)

# Threshold from the chi-square distribution, which squared Mahalanobis
# distances follow for Gaussian data
threshold = chi2.ppf(0.975, df=2)  # 97.5% quantile, 2-dimensional data

# Visualize the results
plt.figure(figsize=(14, 6))

# Result with the robust estimate
plt.subplot(121)
plt.scatter(X[:, 0], X[:, 1], c=(mahal_robust <= threshold).astype(int), 
           cmap='coolwarm', s=40, edgecolors='k')
plt.colorbar(label='Inlier (1) vs outlier (0)')
plt.title('Robust covariance estimate (MinCovDet)\nAnomaly detection')
plt.grid(True)

# Result with the empirical estimate
plt.subplot(122)
plt.scatter(X[:, 0], X[:, 1], c=(mahal_emp <= threshold).astype(int), 
           cmap='coolwarm', s=40, edgecolors='k')
plt.colorbar(label='Inlier (1) vs outlier (0)')
plt.title('Empirical covariance estimate\nAnomaly detection')
plt.grid(True)

plt.tight_layout()

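6. Practical Case: Stock Portfolio Risk Analysis 💼📊

Portfolio optimization is the classic use of variance and covariance in finance. A simple worked case follows: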

# Stock portfolio optimization example
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import minimize

# Simulate historical return data for a few stocks
np.random.seed(42)
n_days = 252  # trading days in one year
n_stocks = 4

# Parameters for generating correlated returns
means = np.array([0.0001, 0.0002, 0.00015, 0.00025])  # mean daily returns
cov = np.array([
    [0.0004, 0.0002, 0.0001, 0.0001],
    [0.0002, 0.0005, 0.0001, 0.0002],
    [0.0001, 0.0001, 0.0006, 0.0002],
    [0.0001, 0.0002, 0.0002, 0.0004]
])

# Generate the stock returns
returns = np.random.multivariate_normal(means, cov, n_days)
returns_df = pd.DataFrame(
    returns, 
    columns=['Stock A', 'Stock B', 'Stock C', 'Stock D']
)

# Annualize the mean returns and the covariance matrix
annual_returns = returns_df.mean() * 252
annual_cov = returns_df.cov() * 252

print("Annualized returns:")
print(annual_returns)
print("\nAnnualized covariance matrix:")
print(annual_cov)

# Portfolio return and risk
def portfolio_stats(weights, returns, cov):
    portfolio_return = np.sum(returns * weights)
    portfolio_risk = np.sqrt(np.dot(weights.T, np.dot(cov, weights)))  # sqrt(w^T Sigma w)
    sharpe_ratio = portfolio_return / portfolio_risk  # Sharpe ratio, risk-free rate taken as 0
    return portfolio_return, portfolio_risk, sharpe_ratio

# Generate random portfolios
n_portfolios = 10000
results = np.zeros((3, n_portfolios))
weights_record = []

for i in range(n_portfolios):
    weights = np.random.random(n_stocks)
    weights /= np.sum(weights)
    weights_record.append(weights)
    
    portfolio_return, portfolio_risk, sharpe_ratio = portfolio_stats(
        weights, annual_returns, annual_cov
    )
    
    results[0, i] = portfolio_return
    results[1, i] = portfolio_risk
    results[2, i] = sharpe_ratio

# Visualize the random portfolios; their upper-left edge approximates the efficient frontier
plt.figure(figsize=(12, 8))
plt.scatter(results[1, :], results[0, :], 
            c=results[2, :], cmap='viridis', 
            s=10, alpha=0.3)
plt.colorbar(label='Sharpe ratio')
plt.xlabel('Portfolio risk (standard deviation)')
plt.ylabel('Expected portfolio return')
plt.title('Portfolio efficient frontier')

# Find the portfolio with the highest Sharpe ratio
def neg_sharpe(weights, returns, cov):
    return -portfolio_stats(weights, returns, cov)[2]

# Constraint: the weights sum to 1
constraints = {'type': 'eq', 'fun': lambda x: np.sum(x) - 1}
# Bounds: every weight lies between 0 and 1 (no short selling)
bounds = tuple((0, 1) for _ in range(n_stocks))

# Run the optimization
optimal_weights = minimize(
    neg_sharpe, 
    np.array([1/n_stocks] * n_stocks),  # start from equal weights
    args=(annual_returns, annual_cov),
    method='SLSQP',
    bounds=bounds,
    constraints=constraints
).x

# Return and risk of the optimal portfolio
optimal_return, optimal_risk, optimal_sharpe = portfolio_stats(
    optimal_weights, annual_returns, annual_cov
)

# Mark the optimal portfolio on the plot
plt.scatter(optimal_risk, optimal_return, 
            c='red', s=50, edgecolors='black',
            label=f'Optimal portfolio (Sharpe ratio: {optimal_sharpe:.2f})')
plt.legend()
plt.grid(True)

print("\nOptimal portfolio weights:")
for i, stock in enumerate(['Stock A', 'Stock B', 'Stock C', 'Stock D']):
    print(f"{stock}: {optimal_weights[i]:.4f} ({optimal_weights[i]*100:.1f}%)")
print(f"Expected annualized return: {optimal_return:.4f} ({optimal_return*100:.2f}%)")
print(f"Portfolio risk (standard deviation): {optimal_risk:.4f} ({optimal_risk*100:.2f}%)")
print(f"Sharpe ratio: {optimal_sharpe:.4f}")
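
The role of covariance in diversification is easiest to see with two assets, where the portfolio variance $\mathbf{w}^T \mathbf{\Sigma} \mathbf{w}$ expands to $w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2 w_1 w_2 \text{Cov}(R_1, R_2)$. The sketch below uses made-up volatilities and a made-up correlation, and also computes the closed-form minimum-variance weights $\mathbf{w} \propto \mathbf{\Sigma}^{-1}\mathbf{1}$, which, unlike the bounded optimization above, allow short positions.

```python
import numpy as np

# Hypothetical annualized volatilities and correlation for two assets
sigma_a, sigma_b, rho = 0.20, 0.30, 0.2
cov = np.array([
    [sigma_a**2, rho * sigma_a * sigma_b],
    [rho * sigma_a * sigma_b, sigma_b**2],
])
w = np.array([0.5, 0.5])

# Portfolio variance as a quadratic form and in expanded form
portfolio_var = w @ cov @ w
expanded = (w[0] * sigma_a) ** 2 + (w[1] * sigma_b) ** 2 + 2 * w[0] * w[1] * cov[0, 1]
portfolio_vol = np.sqrt(portfolio_var)
weighted_avg_vol = w[0] * sigma_a + w[1] * sigma_b

# Minimum-variance weights: solve Sigma w = 1, then normalize to sum to 1
w_min = np.linalg.solve(cov, np.ones(2))
w_min /= w_min.sum()
min_vol = np.sqrt(w_min @ cov @ w_min)

print(f"50/50 portfolio volatility:  {portfolio_vol:.4f}")
print(f"weighted average volatility: {weighted_avg_vol:.4f}")
print(f"minimum-variance weights:    {np.round(w_min, 4)}")
print(f"minimum volatility:          {min_vol:.4f}")
```

Whenever the correlation is below 1, the portfolio volatility is lower than the weighted average of the individual volatilities; that gap is the diversification benefit.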

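7. Bias-Variance Decomposition ⚖️

7.1 Theoretical Foundation 📝

The bias-variance decomposition is a key tool for understanding where a model's error comes from. For data generated as $y = f(x) + \varepsilon$ with noise variance $\sigma^2$, the expected generalization error at a point $x$ decomposes as:

$E[(y - \hat{f}(x))^2] = (Bias[\hat{f}(x)])^2 + Var[\hat{f}(x)] + \sigma^2$

where:

$Bias[\hat{f}(x)]$: the gap between the expected model prediction and the true value
$Var[\hat{f}(x)]$: the variance of the model's prediction
$\sigma^2$: the irreducible error (noise inherent in the data)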
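
A short derivation sketch shows where the three terms come from. It assumes the usual setting: $y = f(x) + \varepsilon$ with $E[\varepsilon] = 0$ and $Var[\varepsilon] = \sigma^2$, the noise is independent of the fitted model $\hat{f}$, and the expectation is taken over training sets and noise. Writing $\bar{f}(x) = E[\hat{f}(x)]$ (notation introduced here for the sketch):

$E[(y - \hat{f}(x))^2] = E[(f(x) - \bar{f}(x) + \bar{f}(x) - \hat{f}(x) + \varepsilon)^2]$

$= (f(x) - \bar{f}(x))^2 + E[(\hat{f}(x) - \bar{f}(x))^2] + E[\varepsilon^2]$

$= (Bias[\hat{f}(x)])^2 + Var[\hat{f}(x)] + \sigma^2$

The cross terms vanish in the second line: $E[\hat{f}(x) - \bar{f}(x)] = 0$ by the definition of $\bar{f}$, and $E[\varepsilon] = 0$ with $\varepsilon$ independent of $\hat{f}$, so every product of two different terms has zero expectation.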

7.2 Practical Demonstration 🔍

# Visual demonstration of the bias-variance decomposition
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Define the true function and generate noisy data
np.random.seed(42)

def true_function(x):
    return np.sin(x * 2 * np.pi)

# Draw a fresh noisy training set
def generate_data(n_samples=30):
    X = np.random.uniform(0, 1, n_samples)
    y = true_function(X) + np.random.normal(0, 0.1, n_samples)
    return X.reshape(-1, 1), y

# Train a polynomial model and predict
def train_and_predict(X_train, y_train, X_test, degree):
    model = make_pipeline(
        PolynomialFeatures(degree=degree),
        LinearRegression()
    )
    model.fit(X_train, y_train)
    return model.predict(X_test)

# Estimate bias and variance over repeated training sets
def bias_variance_decomposition(degree, n_iterations=100):
    X_test = np.linspace(0, 1, 100).reshape(-1, 1)
    predictions = np.zeros((n_iterations, len(X_test)))
    
    for i in range(n_iterations):
        X_train, y_train = generate_data()
        y_pred = train_and_predict(X_train, y_train, X_test, degree)
        predictions[i, :] = y_pred
    
    # Mean prediction, bias, and variance at each test point
    # (ravel() keeps y_true 1-D so the subtraction below does not broadcast to a matrix)
    y_true = true_function(X_test).ravel()
    mean_prediction = predictions.mean(axis=0)
    bias = mean_prediction - y_true
    variance = predictions.var(axis=0)
    
    # Average squared bias and average variance over the test points
    mean_bias_sq = np.mean(bias ** 2)
    mean_var = np.mean(variance)
    
    return X_test, y_true, predictions, mean_prediction, mean_bias_sq, mean_var

# Compare models of different complexity
degrees = [1, 3, 15]  # linear, medium complexity, high complexity
plt.figure(figsize=(18, 12))

for i, degree in enumerate(degrees):
    X_test, y_true, predictions, mean_prediction, bias_sq, var = bias_variance_decomposition(degree)
    
    # Plot the fits
    plt.subplot(2, 3, i+1)
    # Plot the 100 individual predictions
    for j in range(min(100, predictions.shape[0])):
        plt.plot(X_test, predictions[j], 'r-', alpha=0.05)
    # Plot the mean prediction
    plt.plot(X_test, mean_prediction, 'b-', lw=2, label='Mean prediction')
    # Plot the true function
    plt.plot(X_test, y_true, 'g--', lw=2, label='True function')
    plt.title(f"Model complexity: degree-{degree} polynomial")
    plt.xlabel('X')
    plt.ylabel('y')
    plt.legend()
    
    # Plot squared bias and variance (the noise term sigma^2 = 0.01 is not included)
    plt.subplot(2, 3, i+4)
    plt.bar([0, 1], [bias_sq, var], width=0.4)
    plt.xticks([0, 1], ['Bias²', 'Variance'])
    plt.title(f"Bias-variance decomposition (bias² + variance: {bias_sq + var:.4f})")
    plt.ylim(0, max(0.2, 1.2 * max(bias_sq, var)))
    
plt.tight_layout()