(六)机器学习之图卷积网络
图卷积网络(GCN)技术详解:从图信号处理到现代图神经网络
摘要
图卷积网络(Graph Convolutional Networks, GCN)是图神经网络的重要分支,专门用于处理图结构数据。本文深入解析GCN的核心原理、数学基础、网络架构以及从图信号处理到现代图神经网络的发展历程,帮助读者全面理解这一重要技术。
关键词: 图卷积网络、GCN、图神经网络、图信号处理、节点分类
文章目录
- 图卷积网络(GCN)技术详解:从图信号处理到现代图神经网络
- 摘要
- 1. 引言
- 1.1 GCN的发展历程
- 2. 图的基础概念
- 2.1 图的数学表示
- 2.2 图信号处理基础
- 3. GCN的数学原理
- 3.1 图卷积的数学定义
- 3.2 矩阵形式的GCN
- 4. 经典GCN架构
- 4.1 两层GCN
- 4.2 多层GCN
- 5. GCN的训练与优化
- 5.1 节点分类任务
- 5.2 图分类任务
- 6. GCN的变体与改进
- 6.1 GraphSAGE
- 6.2 GAT(图注意力网络)
- 7. 实际应用案例
- 7.1 社交网络分析
- 7.2 推荐系统
- 8. 相关论文与研究方向
- 8.1 经典论文
- 8.2 现代发展
- 参考文献
1. 引言
图卷积网络(GCN)是图神经网络(GNN)的重要变体,专门设计用于处理具有图结构的数据。与传统的卷积神经网络不同,GCN能够处理非欧几里得空间中的图数据,在社交网络分析、推荐系统、分子性质预测等领域取得了显著成果。
1.1 GCN的发展历程
- 2005年: 图信号处理的理论基础建立
- 2013年: 图上的卷积操作首次提出
- 2016年: GCN的经典论文发表
- 2017年至今: 各种GCN变体和改进
2. 图的基础概念
2.1 图的数学表示
图 G=(V,E)G = (V, E)G=(V,E) 由节点集合 VVV 和边集合 EEE 组成:
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import networkx as nx
import matplotlib.pyplot as pltclass Graph:def __init__(self, num_nodes, edges=None):self.num_nodes = num_nodesself.adjacency_matrix = torch.zeros(num_nodes, num_nodes)if edges is not None:for edge in edges:self.adjacency_matrix[edge[0], edge[1]] = 1self.adjacency_matrix[edge[1], edge[0]] = 1 # 无向图def add_edge(self, i, j):self.adjacency_matrix[i, j] = 1self.adjacency_matrix[j, i] = 1def get_degree_matrix(self):degrees = torch.sum(self.adjacency_matrix, dim=1)return torch.diag(degrees)def get_laplacian_matrix(self):D = self.get_degree_matrix()return D - self.adjacency_matrixdef get_normalized_laplacian(self):D = self.get_degree_matrix()D_inv_sqrt = torch.diag(1.0 / torch.sqrt(torch.diag(D) + 1e-8))return torch.eye(self.num_nodes) - D_inv_sqrt @ self.adjacency_matrix @ D_inv_sqrt# 创建示例图
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
graph = Graph(4, edges)print("邻接矩阵:")
print(graph.adjacency_matrix)
print("\n度矩阵:")
print(graph.get_degree_matrix())
print("\n拉普拉斯矩阵:")
print(graph.get_laplacian_matrix())
2.2 图信号处理基础
图信号是定义在图节点上的函数,图上的卷积操作基于图傅里叶变换:
class GraphSignalProcessor:def __init__(self, adjacency_matrix):self.adjacency_matrix = adjacency_matrixself.num_nodes = adjacency_matrix.shape[0]# 计算归一化拉普拉斯矩阵D = torch.diag(torch.sum(adjacency_matrix, dim=1))D_inv_sqrt = torch.diag(1.0 / torch.sqrt(torch.diag(D) + 1e-8))self.normalized_laplacian = torch.eye(self.num_nodes) - D_inv_sqrt @ adjacency_matrix @ D_inv_sqrt# 计算特征值和特征向量self.eigenvalues, self.eigenvectors = torch.linalg.eigh(self.normalized_laplacian)def graph_fourier_transform(self, signal):"""图傅里叶变换"""return self.eigenvectors.T @ signaldef inverse_graph_fourier_transform(self, transformed_signal):"""逆图傅里叶变换"""return self.eigenvectors @ transformed_signaldef graph_convolution(self, signal, filter_coeffs):"""图卷积操作"""# 在频域进行卷积transformed_signal = self.graph_fourier_transform(signal)filtered_signal = filter_coeffs * transformed_signalreturn self.inverse_graph_fourier_transform(filtered_signal)# 示例:图信号处理
adj_matrix = torch.tensor([[0, 1, 1, 0],[1, 0, 1, 1],[1, 1, 0, 1],[0, 1, 1, 0]
], dtype=torch.float32)processor = GraphSignalProcessor(adj_matrix)
signal = torch.tensor([1.0, 2.0, 3.0, 4.0])print("原始信号:", signal)
print("图傅里叶变换:", processor.graph_fourier_transform(signal))
3. GCN的数学原理
3.1 图卷积的数学定义
GCN的核心思想是在图上定义卷积操作。对于节点 iii,其卷积操作定义为:
hi(l+1)=σ(∑j∈N(i)1didjhj(l)W(l))h_i^{(l+1)} = \sigma\left(\sum_{j \in \mathcal{N}(i)} \frac{1}{\sqrt{d_i d_j}} h_j^{(l)} W^{(l)}\right)hi(l+1)=σ⎝⎛j∈N(i)∑didj1hj(l)W(l)⎠⎞
其中:
- hi(l)h_i^{(l)}hi(l) 是节点 iii 在第 lll 层的特征表示
- N(i)\mathcal{N}(i)N(i) 是节点 iii 的邻居集合
- did_idi 是节点 iii 的度
- W(l)W^{(l)}W(l) 是可学习的权重矩阵
- σ\sigmaσ 是激活函数
3.2 矩阵形式的GCN
GCN的矩阵形式可以表示为:
H(l+1)=σ(D~−12A~D~−12H(l)W(l))H^{(l+1)} = \sigma(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)})H(l+1)=σ(D~−21A~D~−21H(l)W(l))
其中:
- A~=A+I\tilde{A} = A + IA~=A+I 是添加自环的邻接矩阵
- D~\tilde{D}D~ 是 A~\tilde{A}A~ 的度矩阵
- H(l)H^{(l)}H(l) 是第 lll 层的节点特征矩阵
class GraphConvolution(nn.Module):def __init__(self, in_features, out_features, bias=True):super().__init__()self.in_features = in_featuresself.out_features = out_featuresself.weight = nn.Parameter(torch.FloatTensor(in_features, out_features))if bias:self.bias = nn.Parameter(torch.FloatTensor(out_features))else:self.register_parameter('bias', None)self.reset_parameters()def reset_parameters(self):nn.init.xavier_uniform_(self.weight)if self.bias is not None:nn.init.zeros_(self.bias)def forward(self, x, adj):"""x: 节点特征矩阵 (N, in_features)adj: 归一化邻接矩阵 (N, N)"""# 线性变换support = torch.mm(x, self.weight)# 图卷积output = torch.spmm(adj, support)if self.bias is not None:output += self.biasreturn output# 归一化邻接矩阵的函数
def normalize_adjacency(adjacency_matrix):"""归一化邻接矩阵"""# 添加自环adj = adjacency_matrix + torch.eye(adjacency_matrix.shape[0])# 计算度矩阵degree = torch.sum(adj, dim=1)degree_inv_sqrt = torch.pow(degree, -0.5)degree_inv_sqrt[torch.isinf(degree_inv_sqrt)] = 0.0# 归一化degree_matrix_inv_sqrt = torch.diag(degree_inv_sqrt)normalized_adj = degree_matrix_inv_sqrt @ adj @ degree_matrix_inv_sqrtreturn normalized_adj# 示例:图卷积层
adj_matrix = torch.tensor([[0, 1, 1, 0],[1, 0, 1, 1],[1, 1, 0, 1],[0, 1, 1, 0]
], dtype=torch.float32)normalized_adj = normalize_adjacency(adj_matrix)
print("归一化邻接矩阵:")
print(normalized_adj)# 创建图卷积层
gc_layer = GraphConvolution(in_features=4, out_features=2)
x = torch.randn(4, 4) # 4个节点,每个节点4维特征output = gc_layer(x, normalized_adj)
print(f"\n输入特征形状: {x.shape}")
print(f"输出特征形状: {output.shape}")
4. 经典GCN架构
4.1 两层GCN
class GCN(nn.Module):def __init__(self, input_dim, hidden_dim, output_dim, dropout=0.5):super().__init__()self.gc1 = GraphConvolution(input_dim, hidden_dim)self.gc2 = GraphConvolution(hidden_dim, output_dim)self.dropout = nn.Dropout(dropout)self.relu = nn.ReLU()def forward(self, x, adj):# 第一层x = self.gc1(x, adj)x = self.relu(x)x = self.dropout(x)# 第二层x = self.gc2(x, adj)return x# 使用示例
gcn = GCN(input_dim=4, hidden_dim=16, output_dim=2)
x = torch.randn(4, 4) # 节点特征
adj = normalize_adjacency(adj_matrix) # 归一化邻接矩阵output = gcn(x, adj)
print(f"GCN输出形状: {output.shape}")
4.2 多层GCN
class MultiLayerGCN(nn.Module):def __init__(self, input_dim, hidden_dims, output_dim, dropout=0.5):super().__init__()self.layers = nn.ModuleList()# 输入层到第一个隐藏层self.layers.append(GraphConvolution(input_dim, hidden_dims[0]))# 隐藏层之间for i in range(len(hidden_dims) - 1):self.layers.append(GraphConvolution(hidden_dims[i], hidden_dims[i+1]))# 最后一个隐藏层到输出层self.layers.append(GraphConvolution(hidden_dims[-1], output_dim))self.dropout = nn.Dropout(dropout)self.relu = nn.ReLU()def forward(self, x, adj):for i, layer in enumerate(self.layers[:-1]):x = layer(x, adj)x = self.relu(x)x = self.dropout(x)# 输出层不使用激活函数x = self.layers[-1](x, adj)return x# 使用示例
multi_gcn = MultiLayerGCN(input_dim=4, hidden_dims=[32, 16, 8], output_dim=2)
output = multi_gcn(x, adj)
print(f"多层GCN输出形状: {output.shape}")
5. GCN的训练与优化
5.1 节点分类任务
class GCNClassifier(nn.Module):def __init__(self, input_dim, hidden_dim, num_classes, dropout=0.5):super().__init__()self.gcn = GCN(input_dim, hidden_dim, num_classes, dropout)def forward(self, x, adj):return self.gcn(x, adj)def train_model(self, x, adj, labels, train_mask, val_mask, epochs=200):optimizer = torch.optim.Adam(self.parameters(), lr=0.01, weight_decay=5e-4)criterion = nn.CrossEntropyLoss()best_val_acc = 0for epoch in range(epochs):self.train()optimizer.zero_grad()# 前向传播output = self.forward(x, adj)loss = criterion(output[train_mask], labels[train_mask])# 反向传播loss.backward()optimizer.step()# 验证if epoch % 10 == 0:self.eval()with torch.no_grad():val_output = self.forward(x, adj)val_pred = val_output[val_mask].argmax(dim=1)val_acc = (val_pred == labels[val_mask]).float().mean().item()if val_acc > best_val_acc:best_val_acc = val_accprint(f'Epoch {epoch}, Loss: {loss.item():.4f}, Val Acc: {val_acc:.4f}')return best_val_acc# 创建模拟数据
def create_synthetic_data(num_nodes=100, num_features=10, num_classes=3):# 生成随机图adj_matrix = torch.rand(num_nodes, num_nodes)adj_matrix = (adj_matrix + adj_matrix.T) / 2 # 对称化adj_matrix = (adj_matrix > 0.1).float() # 二值化adj_matrix.fill_diagonal_(0) # 移除自环# 生成节点特征x = torch.randn(num_nodes, num_features)# 生成标签(基于图结构)labels = torch.randint(0, num_classes, (num_nodes,))# 创建训练/验证/测试掩码train_mask = torch.zeros(num_nodes, dtype=torch.bool)val_mask = torch.zeros(num_nodes, dtype=torch.bool)test_mask = torch.zeros(num_nodes, dtype=torch.bool)# 随机分配indices = torch.randperm(num_nodes)train_mask[indices[:60]] = Trueval_mask[indices[60:80]] = Truetest_mask[indices[80:]] = Truereturn x, adj_matrix, labels, train_mask, val_mask, test_mask# 训练GCN分类器
x, adj, labels, train_mask, val_mask, test_mask = create_synthetic_data()
normalized_adj = normalize_adjacency(adj)classifier = GCNClassifier(input_dim=10, hidden_dim=16, num_classes=3)
best_val_acc = classifier.train_model(x, normalized_adj, labels, train_mask, val_mask)
print(f"最佳验证准确率: {best_val_acc:.4f}")
5.2 图分类任务
class GraphGCN(nn.Module):def __init__(self, input_dim, hidden_dim, num_classes, dropout=0.5):super().__init__()self.gcn1 = GraphConvolution(input_dim, hidden_dim)self.gcn2 = GraphConvolution(hidden_dim, hidden_dim)self.classifier = nn.Linear(hidden_dim, num_classes)self.dropout = nn.Dropout(dropout)self.relu = nn.ReLU()def forward(self, x, adj, batch_size):# 图卷积层x = self.gcn1(x, adj)x = self.relu(x)x = self.dropout(x)x = self.gcn2(x, adj)x = self.relu(x)x = self.dropout(x)# 全局平均池化x = torch.mean(x, dim=0, keepdim=True)x = x.repeat(batch_size, 1)# 分类x = self.classifier(x)return x# 图分类示例
def graph_classification_example():# 创建多个图的数据num_graphs = 50graphs_data = []for i in range(num_graphs):num_nodes = np.random.randint(10, 20)adj = torch.rand(num_nodes, num_nodes)adj = (adj + adj.T) / 2adj = (adj > 0.3).float()adj.fill_diagonal_(0)x = torch.randn(num_nodes, 5)label = torch.randint(0, 2, (1,))graphs_data.append((x, adj, label))# 训练图分类器model = GraphGCN(input_dim=5, hidden_dim=16, num_classes=2)optimizer = torch.optim.Adam(model.parameters(), lr=0.01)criterion = nn.CrossEntropyLoss()for epoch in range(100):total_loss = 0for x, adj, label in graphs_data:normalized_adj = normalize_adjacency(adj)output = model(x, normalized_adj, 1)loss = criterion(output, label)optimizer.zero_grad()loss.backward()optimizer.step()total_loss += loss.item()if epoch % 20 == 0:print(f'Epoch {epoch}, Loss: {total_loss/len(graphs_data):.4f}')graph_classification_example()
6. GCN的变体与改进
6.1 GraphSAGE
GraphSAGE使用采样和聚合策略:
class GraphSAGE(nn.Module):def __init__(self, input_dim, hidden_dim, output_dim, num_layers=2):super().__init__()self.num_layers = num_layersself.layers = nn.ModuleList()# 第一层self.layers.append(nn.Linear(input_dim, hidden_dim))# 中间层for _ in range(num_layers - 2):self.layers.append(nn.Linear(hidden_dim, hidden_dim))# 输出层self.layers.append(nn.Linear(hidden_dim, output_dim))self.relu = nn.ReLU()self.dropout = nn.Dropout(0.5)def forward(self, x, adj):for i, layer in enumerate(self.layers[:-1]):# 聚合邻居信息neighbor_info = torch.spmm(adj, x)# 结合自身信息x = layer(x + neighbor_info)x = self.relu(x)x = self.dropout(x)# 输出层x = self.layers[-1](x)return x# 使用GraphSAGE
sage = GraphSAGE(input_dim=4, hidden_dim=16, output_dim=2)
output = sage(x, adj)
print(f"GraphSAGE输出形状: {output.shape}")
6.2 GAT(图注意力网络)
class GraphAttentionLayer(nn.Module):def __init__(self, in_features, out_features, dropout=0.6, alpha=0.2):super().__init__()self.in_features = in_featuresself.out_features = out_featuresself.dropout = dropoutself.alpha = alphaself.W = nn.Linear(in_features, out_features, bias=False)self.a = nn.Linear(2 * out_features, 1, bias=False)self.leakyrelu = nn.LeakyReLU(self.alpha)self.dropout_layer = nn.Dropout(dropout)def forward(self, x, adj):h = self.W(x) # (N, out_features)N = h.size(0)# 计算注意力系数a_input = self._prepare_attentional_mechanism_input(h)e = self.leakyrelu(self.a(a_input).squeeze(2)) # (N, N)# 应用邻接矩阵掩码zero_vec = -9e15 * torch.ones_like(e)attention = torch.where(adj > 0, e, zero_vec)attention = F.softmax(attention, dim=1)attention = self.dropout_layer(attention)# 应用注意力权重h_prime = torch.matmul(attention, h)return h_primedef _prepare_attentional_mechanism_input(self, h):N = h.size(0)h_repeated_in_chunks = h.repeat_interleave(N, dim=0)h_repeated_alternating = h.repeat(N, 1)all_combinations_matrix = torch.cat([h_repeated_in_chunks, h_repeated_alternating], dim=1)return all_combinations_matrix.view(N, N, 2 * self.out_features)class GAT(nn.Module):def __init__(self, input_dim, hidden_dim, output_dim, num_heads=8, dropout=0.6):super().__init__()self.dropout = dropoutself.attention_layers = nn.ModuleList([GraphAttentionLayer(input_dim, hidden_dim, dropout=dropout)for _ in range(num_heads)])self.out_attention = GraphAttentionLayer(hidden_dim * num_heads, output_dim, dropout=dropout)def forward(self, x, adj):# 多头注意力x = torch.cat([att(x, adj) for att in self.attention_layers], dim=1)x = F.dropout(x, self.dropout, training=self.training)# 输出层x = self.out_attention(x, adj)return x# 使用GAT
gat = GAT(input_dim=4, hidden_dim=8, output_dim=2, num_heads=4)
output = gat(x, adj)
print(f"GAT输出形状: {output.shape}")
7. 实际应用案例
7.1 社交网络分析
class SocialNetworkGCN(nn.Module):def __init__(self, input_dim, hidden_dim, output_dim):super().__init__()self.gcn1 = GraphConvolution(input_dim, hidden_dim)self.gcn2 = GraphConvolution(hidden_dim, hidden_dim)self.gcn3 = GraphConvolution(hidden_dim, output_dim)self.relu = nn.ReLU()self.dropout = nn.Dropout(0.5)def forward(self, x, adj):x = self.gcn1(x, adj)x = self.relu(x)x = self.dropout(x)x = self.gcn2(x, adj)x = self.relu(x)x = self.dropout(x)x = self.gcn3(x, adj)return x# 社交网络节点分类示例
def social_network_example():# 模拟社交网络数据num_users = 100num_features = 20 # 用户特征(年龄、兴趣等)# 生成用户特征user_features = torch.randn(num_users, num_features)# 生成社交关系(基于特征相似性)similarity_matrix = torch.mm(user_features, user_features.T)adj_matrix = (similarity_matrix > 0.5).float()adj_matrix.fill_diagonal_(0)# 生成用户标签(基于特征聚类)from sklearn.cluster import KMeanskmeans = KMeans(n_clusters=3, random_state=42)labels = torch.tensor(kmeans.fit_predict(user_features.numpy()))# 创建训练/测试掩码train_mask = torch.zeros(num_users, dtype=torch.bool)test_mask = torch.zeros(num_users, dtype=torch.bool)indices = torch.randperm(num_users)train_mask[indices[:70]] = Truetest_mask[indices[70:]] = True# 训练模型model = SocialNetworkGCN(input_dim=num_features, hidden_dim=32, output_dim=3)optimizer = torch.optim.Adam(model.parameters(), lr=0.01)criterion = nn.CrossEntropyLoss()normalized_adj = normalize_adjacency(adj_matrix)for epoch in range(200):model.train()optimizer.zero_grad()output = model(user_features, normalized_adj)loss = criterion(output[train_mask], labels[train_mask])loss.backward()optimizer.step()if epoch % 50 == 0:model.eval()with torch.no_grad():test_output = model(user_features, normalized_adj)test_pred = test_output[test_mask].argmax(dim=1)test_acc = (test_pred == labels[test_mask]).float().mean().item()print(f'Epoch {epoch}, Loss: {loss.item():.4f}, Test Acc: {test_acc:.4f}')social_network_example()
7.2 推荐系统
class RecommendationGCN(nn.Module):def __init__(self, num_users, num_items, embedding_dim, hidden_dim):super().__init__()self.user_embedding = nn.Embedding(num_users, embedding_dim)self.item_embedding = nn.Embedding(num_items, embedding_dim)self.gcn1 = GraphConvolution(embedding_dim, hidden_dim)self.gcn2 = GraphConvolution(hidden_dim, embedding_dim)self.relu = nn.ReLU()self.dropout = nn.Dropout(0.3)def forward(self, user_item_adj):# 获取用户和物品嵌入user_emb = self.user_embedding.weightitem_emb = self.item_embedding.weightx = torch.cat([user_emb, item_emb], dim=0)# 图卷积x = self.gcn1(x, user_item_adj)x = self.relu(x)x = self.dropout(x)x = self.gcn2(x, user_item_adj)# 分离用户和物品嵌入user_emb_final = x[:user_emb.size(0)]item_emb_final = x[user_emb.size(0):]return user_emb_final, item_emb_final# 推荐系统示例
def recommendation_example():num_users = 50num_items = 100embedding_dim = 32hidden_dim = 64# 生成用户-物品交互矩阵interaction_matrix = torch.rand(num_users, num_items)interaction_matrix = (interaction_matrix > 0.7).float() # 二值化# 构建用户-物品二分图user_item_adj = torch.zeros(num_users + num_items, num_users + num_items)user_item_adj[:num_users, num_users:] = interaction_matrixuser_item_adj[num_users:, :num_users] = interaction_matrix.T# 归一化normalized_adj = normalize_adjacency(user_item_adj)# 创建模型model = RecommendationGCN(num_users, num_items, embedding_dim, hidden_dim)optimizer = torch.optim.Adam(model.parameters(), lr=0.01)# 训练for epoch in range(100):model.train()optimizer.zero_grad()user_emb, item_emb = model(normalized_adj)# 计算预测评分predicted_ratings = torch.mm(user_emb, item_emb.T)# 计算损失(只考虑有交互的项)loss = F.mse_loss(predicted_ratings * interaction_matrix, interaction_matrix)loss.backward()optimizer.step()if epoch % 20 == 0:print(f'Epoch {epoch}, Loss: {loss.item():.4f}')recommendation_example()
8. 相关论文与研究方向
8.1 经典论文
-
"Semi-Supervised Classification with Graph Convolutional Networks" (2016) - Kipf & Welling
- GCN的经典论文
- 提出了图卷积的简化形式
-
"Inductive Representation Learning on Large Graphs" (2017) - Hamilton et al.
- GraphSAGE的论文
- 提出了归纳式图学习
-
"Graph Attention Networks" (2018) - Veličković et al.
- GAT的论文
- 引入了注意力机制到图神经网络
8.2 现代发展
-
"How Powerful are Graph Neural Networks?" (2019) - Xu et al.
- 分析了GNN的表达能力
- 提出了GIN(Graph Isomorphism Network)
-
"Graph Neural Networks: A Review of Methods and Applications" (2020) - Wu et al.
- GNN的全面综述
- 总结了各种GNN变体
-
"Graph Transformer Networks" (2019) - Dwivedi & Bresson
- 将Transformer应用到图数据
- 开启了图Transformer的研究
参考文献
-
Kipf, T. N., & Welling, M. (2016). Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.
-
Hamilton, W., Ying, Z., & Leskovec, J. (2017). Inductive representation learning on large graphs. Advances in neural information processing systems, 30, 1024-1034.
-
Veličković, P., et al. (2018). Graph attention networks. International conference on learning representations.
-
Xu, K., et al. (2019). How powerful are graph neural networks?. International conference on learning representations.
-
Wu, Z., et al. (2020). Graph neural networks: A review of methods and applications. AI open, 1, 57-81.