
# Hands-On Machine Learning, Part II: Neural Networks and Deep Learning, Chapter 12: Custom Models and Training Loops

Source: original post, 2025/5/6 7:15:54


Summary

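Chapter 12 takes a close look at using TensorFlow's low-level API to customize models and training algorithms. It first introduces TensorFlow's core features and architecture, then explains in detail how to create custom loss functions, activation functions, initializers, regularizers, weight constraints, metrics, and layers. It also covers computing gradients with TensorFlow's automatic differentiation and building custom training loops. With this material, readers gain a deeper understanding of how TensorFlow works and learn how to customize and optimize models and training when the need arises.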

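Main Topics

TensorFlow overview
- Core features: TensorFlow is a powerful numerical computation library that is particularly well suited to large-scale machine learning. It supports GPU acceleration, distributed computing, a just-in-time (JIT) compiler, exportable computation graphs, automatic differentiation, and optimizers.
- Architecture: TensorFlow is layered, from the low-level API up to high-level APIs such as Keras, and it supports multiple programming languages and devices.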
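Using TensorFlow for numerical computation
- Tensors and operations: TensorFlow's API is built around tensors and offers a rich set of tensor operations, much like NumPy.
- Variables: tf.Variable represents a mutable tensor and is the right choice for model parameters that need to be updated.
- Data structures: TensorFlow also supports sparse tensors, tensor arrays, ragged tensors, string tensors, sets, and queues.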
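
The tensor and variable API described above can be sketched as follows. This is a minimal illustration; the values are arbitrary and were chosen only to make the results easy to check by hand.

```python
import numpy as np
import tensorflow as tf

# Tensors are immutable: every operation returns a new tensor.
t = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(t.shape)               # (2, 3)
print(t[:, 1:])              # NumPy-style slicing
print(t + 10)                # element-wise addition with broadcasting
gram = t @ tf.transpose(t)   # matrix product, shape (2, 2)

# Tensors and NumPy arrays convert in both directions.
a = np.array([2.0, 4.0, 5.0])
from_numpy = tf.constant(a)  # NumPy defaults to float64, and the tensor keeps that dtype
back_to_numpy = t.numpy()

# Variables are mutable: update them in place with assign().
v = tf.Variable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
v.assign(2 * v)              # double every element
v[0, 1].assign(42.0)         # overwrite a single cell
print(v.numpy())             # rows are now [2, 42, 6] and [8, 10, 12]

# One of the other data structures: a sparse tensor holding two non-zero values.
s = tf.SparseTensor(indices=[[0, 1], [1, 0]], values=[1.0, 2.0], dense_shape=[2, 3])
dense = tf.sparse.to_dense(s)
```

Note the float64 tensor created from the NumPy array: mixing it with float32 tensors raises an error, because TensorFlow does not convert types automatically.
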
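Custom models and training algorithms
- Custom loss functions: write a custom loss function such as the Huber loss, and handle saving and loading models that use it.
- Custom activation functions, initializers, regularizers, and constraints: write a plain function, or subclass the corresponding Keras class when the component needs hyperparameters.
- Custom metrics: define custom metrics, including streaming (stateful) metrics that are updated batch by batch.
- Custom layers: create stateless and stateful custom layers, including layers with multiple inputs and outputs.
- Custom models: subclass tf.keras.Model to build complex model architectures.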
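
As a rough sketch of the function-based components and a streaming metric from the list above: the function names and the `StreamingMAE` class are illustrative choices made for this example, not part of the Keras API. The four functions could be passed to a layer such as `tf.keras.layers.Dense` through its `activation`, `kernel_initializer`, `kernel_regularizer`, and `kernel_constraint` arguments.

```python
import tensorflow as tf

def my_softplus(z):
    # Activation: log(1 + exp(z))
    return tf.math.log(1.0 + tf.exp(z))

def my_glorot_initializer(shape, dtype=tf.float32):
    # Initializer: normal distribution with a Glorot-style standard deviation
    stddev = tf.sqrt(2.0 / (shape[0] + shape[1]))
    return tf.random.normal(shape, stddev=stddev, dtype=dtype)

def my_l1_regularizer(weights):
    # Regularizer: L1 penalty with a factor of 0.01
    return tf.reduce_sum(tf.abs(0.01 * weights))

def my_positive_weights(weights):
    # Constraint: replace negative weights with zero
    return tf.where(weights < 0.0, tf.zeros_like(weights), weights)

class StreamingMAE(tf.keras.metrics.Metric):
    """Hypothetical streaming metric: mean absolute error over all batches seen."""

    def __init__(self, name="streaming_mae", **kwargs):
        super().__init__(name=name, **kwargs)
        # State that persists across batches
        self.total = self.add_weight(name="total", shape=(), initializer="zeros")
        self.count = self.add_weight(name="count", shape=(), initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        # sample_weight is accepted for API compatibility but ignored in this sketch
        errors = tf.abs(tf.cast(y_true, tf.float32) - tf.cast(y_pred, tf.float32))
        self.total.assign_add(tf.reduce_sum(errors))
        self.count.assign_add(tf.cast(tf.size(errors), tf.float32))

    def result(self):
        return self.total / self.count

metric = StreamingMAE()
metric.update_state([1.0, 2.0], [2.0, 4.0])   # absolute errors: 1 and 2
metric.update_state([3.0], [0.0])             # absolute error: 3
print(float(metric.result()))                 # 2.0, the mean over all three samples
```

Because the metric keeps a running total and count, its result is the mean over every sample seen so far, not the mean of the per-batch means.
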
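Automatic differentiation and gradient computation
- Computing gradients with tf.GradientTape: the tape records the operations run inside its context and then computes gradients with reverse-mode automatic differentiation, which is what backpropagation relies on.
- Controlling gradient flow: use tf.stop_gradient() to keep gradients from propagating through part of a computation, and define custom gradients (for example with tf.custom_gradient) to work around numerical problems.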
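
The two points above can be illustrated with a small function of two variables. This is a minimal sketch; the function `f` and the starting values 5 and 3 are arbitrary choices for the example.

```python
import tensorflow as tf

def f(w1, w2):
    return 3 * w1 ** 2 + 2 * w1 * w2

w1, w2 = tf.Variable(5.0), tf.Variable(3.0)

# Record the forward pass, then ask the tape for the gradients.
with tf.GradientTape() as tape:
    z = f(w1, w2)
grads = tape.gradient(z, [w1, w2])
# df/dw1 = 6*w1 + 2*w2 = 36 and df/dw2 = 2*w1 = 10

def g(w1, w2):
    # The second term still contributes to the value, but not to the gradients.
    return 3 * w1 ** 2 + tf.stop_gradient(2 * w1 * w2)

with tf.GradientTape() as tape:
    z = g(w1, w2)
grads_stopped = tape.gradient(z, [w1, w2])
# dg/dw1 = 6*w1 = 30; w2 is only reached through the stopped term, so its gradient is None
```

A tape is released after the first call to `gradient()`; creating it with `persistent=True` allows several calls at the cost of holding on to the recorded operations.
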
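Custom training loops
- Building a training loop: when the fit() method is not flexible enough, write a custom training loop to implement a special optimization strategy.
- Performance: use tf.function and XLA (Accelerated Linear Algebra) to speed up the code.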
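TensorFlow functions and computation graphs
- tf.function: converts a Python function into a TensorFlow function so that it runs as a graph, which usually executes faster.
- Automatic graph generation: TensorFlow builds the computation graph through AutoGraph and tracing, then optimizes it before execution.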
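
To make tracing and AutoGraph more concrete, here is a minimal sketch. The `trace_log` list is only a device for observing when tracing happens; the function names are arbitrary.

```python
import tensorflow as tf

trace_log = []  # Python side effects run only while the function is being traced

@tf.function
def cube(x):
    trace_log.append(x.dtype.name)   # executed at trace time, not on every call
    return x ** 3

r1 = cube(tf.constant(2.0))   # first float32 call: the function is traced into a graph
r2 = cube(tf.constant(3.0))   # same dtype and shape: the existing graph is reused
r3 = cube(tf.constant(2))     # int32 input: a new graph is traced

@tf.function
def sum_of_squares(n):
    # AutoGraph turns this loop over tf.range() into a loop inside the graph
    total = tf.constant(0)
    for i in tf.range(n + 1):
        total += i ** 2
    return total

result = sum_of_squares(tf.constant(10))   # 0^2 + 1^2 + ... + 10^2 = 385
print(trace_log)                           # ['float32', 'int32']
```

Passing Python numbers instead of tensors would trigger a new trace for every distinct value, so tensors are the safer input type. XLA compilation can be requested with `tf.function(jit_compile=True)`; it is left out of the sketch because XLA support depends on the platform.
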

Key Code and Algorithms

12.1 Custom loss function

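    import tensorflow as tf

    def huber_fn(y_true, y_pred):
        # Huber loss with threshold 1: quadratic for small errors, linear for large ones
        error = y_true - y_pred
        is_small_error = tf.abs(error) < 1
        squared_loss = tf.square(error) / 2
        linear_loss = tf.abs(error) - 0.5
        return tf.where(is_small_error, squared_loss, linear_loss)

    # `model` is assumed to be an existing Keras model
    model.compile(loss=huber_fn, optimizer="nadam")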
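
The function above hard-codes a threshold of 1. One way to make the threshold configurable is a small factory function; the sketch below assumes that approach, and the name `create_huber` is chosen for this example.

```python
import tensorflow as tf

def create_huber(threshold=1.0):
    """Return a Huber loss function with the given threshold."""
    def huber_fn(y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) < threshold
        squared_loss = tf.square(error) / 2
        # The linear branch meets the quadratic branch at |error| == threshold
        linear_loss = threshold * tf.abs(error) - threshold ** 2 / 2
        return tf.where(is_small_error, squared_loss, linear_loss)
    return huber_fn

huber_2 = create_huber(2.0)
y_true = tf.constant([0.0, 0.0, 0.0])
y_pred = tf.constant([1.0, 2.0, 4.0])
losses = huber_2(y_true, y_pred)
print(losses.numpy())   # per-sample losses: 0.5 (quadratic), 2.0 and 6.0 (linear)
```

A plain function like this does not store its threshold with a saved model. When such a model is reloaded, Keras generally needs the function supplied again, for example through the `custom_objects` argument of `tf.keras.models.load_model()`; subclassing `tf.keras.losses.Loss` and implementing `get_config()` is the usual way to have the threshold saved too.
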

12.2 Custom layer

    class MyDense(tf.keras.layers.Layer):
        def __init__(self, units, activation=None, **kwargs):
            super().__init__(**kwargs)
            self.units = units
            self.activation = tf.keras.activations.get(activation)

        def build(self, batch_input_shape):
            # Weights are created lazily, once the input shape is known
            self.kernel = self.add_weight(
                name="kernel", shape=[batch_input_shape[-1], self.units],
                initializer="glorot_normal")
            self.bias = self.add_weight(
                name="bias", shape=[self.units], initializer="zeros")

        def call(self, X):
            # Affine transformation followed by the activation function
            return self.activation(X @ self.kernel + self.bias)
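
The layer above has one input and one output. For the multi-input, multi-output case mentioned in the outline, a stateless layer can simply unpack a tuple in `call()` and return a tuple. The following is a toy sketch; the class name `MyMultiLayer` and the input values are chosen for illustration.

```python
import tensorflow as tf

class MyMultiLayer(tf.keras.layers.Layer):
    """Toy stateless layer: two inputs in, three outputs out."""

    def call(self, X):
        X1, X2 = X   # the layer is called with a tuple of two tensors
        return X1 + X2, X1 * X2, X1 / X2

layer = MyMultiLayer()
X1 = tf.constant([[2.0, 4.0]])
X2 = tf.constant([[1.0, 2.0]])
total, product, quotient = layer((X1, X2))
print(total.numpy(), product.numpy(), quotient.numpy())
```

A layer like this fits the Functional and Subclassing APIs; the Sequential API only accepts layers with a single input and a single output.
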

12.3 Custom training loop

    # Assumes model, optimizer, loss_fn, mean_loss, metrics, n_epochs, n_steps,
    # random_batch() and print_status_bar() are defined elsewhere.
    for epoch in range(1, n_epochs + 1):
        print("Epoch {}/{}".format(epoch, n_epochs))
        for step in range(1, n_steps + 1):
            X_batch, y_batch = random_batch(X_train_scaled, y_train)
            with tf.GradientTape() as tape:
                y_pred = model(X_batch, training=True)
                main_loss = tf.reduce_mean(loss_fn(y_batch, y_pred))
                # Add the regularization losses collected by the model
                loss = tf.add_n([main_loss] + model.losses)
            gradients = tape.gradient(loss, model.trainable_variables)
            optimizer.apply_gradients(zip(gradients, model.trainable_variables))
            mean_loss(loss)
            for metric in metrics:
                metric(y_batch, y_pred)
            print_status_bar(step, n_steps, mean_loss, metrics)
        # Reset the running metrics at the end of each epoch
        # (recent Keras versions spell this reset_state())
        for metric in [mean_loss] + metrics:
            metric.reset_states()
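
The loop in 12.3 relies on several names that the excerpt does not define. The sketch below fills them in so that the whole thing runs end to end. Everything it adds is an assumption made for illustration: the synthetic data, the tiny model, the `random_batch()` helper, and a plain `print()` in place of `print_status_bar()`.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(42)
rng = np.random.default_rng(42)

# Synthetic regression data (an assumption made for this sketch).
X_train = rng.normal(size=(256, 3)).astype(np.float32)
y_train = X_train @ np.array([[1.0], [-2.0], [0.5]], dtype=np.float32)

def random_batch(X, y, batch_size=32):
    """Hypothetical helper: sample a random mini-batch with replacement."""
    idx = rng.integers(0, len(X), size=batch_size)
    return X[idx], y[idx]

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.L2(0.01)),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
mean_loss = tf.keras.metrics.Mean(name="mean_loss")
metrics = [tf.keras.metrics.MeanAbsoluteError()]

def loss_fn(y_true, y_pred):
    # Per-sample squared error
    return tf.square(y_true - y_pred)

n_epochs, n_steps = 3, 20
history = []
for epoch in range(1, n_epochs + 1):
    for step in range(1, n_steps + 1):
        X_batch, y_batch = random_batch(X_train, y_train)
        with tf.GradientTape() as tape:
            y_pred = model(X_batch, training=True)
            main_loss = tf.reduce_mean(loss_fn(y_batch, y_pred))
            loss = tf.add_n([main_loss] + model.losses)   # include the L2 penalty
        gradients = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))
        mean_loss(loss)
        for metric in metrics:
            metric(y_batch, y_pred)
    history.append(float(mean_loss.result()))
    print("Epoch {}/{} - mean_loss: {:.4f}".format(epoch, n_epochs, history[-1]))
    for metric in [mean_loss] + metrics:
        metric.reset_state()
```

The model's regularization losses are read inside the tape context so that the L2 penalty contributes to the gradients. The sketch calls `reset_state()`, the spelling used by recent Keras releases.
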

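Memorable Quotes

Paraphrase: TensorFlow is a powerful numerical computation library that is particularly well suited to large-scale machine learning.
Original (English): TensorFlow is a powerful library for numerical computation, particularly well suited and fine-tuned for large-scale machine learning.
Comment: Highlights TensorFlow's computational power and the workloads it is aimed at.

Paraphrase: TensorFlow's low-level API gives fine-grained control over the model and the training process.
Original (English): When you need more flexibility you will use the lower-level Python API, handling tensors directly.
Comment: Describes what the low-level API is for and when to use it.

Paraphrase: tf.GradientTape is the core tool for computing gradients in TensorFlow.
Original (English): To compute gradients automatically in TensorFlow, use tf.GradientTape.
Comment: Introduces the basic purpose of tf.GradientTape.

Paraphrase: tf.function converts a Python function into a TensorFlow function and speeds up execution.
Original (English): tf.function converts a Python function to a TensorFlow function, optimizing execution.
Comment: Explains what tf.function does and why it helps.

Paraphrase: Automatic graph generation and XLA compilation can significantly improve the performance of TensorFlow code.
Original (English): TensorFlow’s automatic graph generation and XLA compilation can significantly boost performance.
Comment: Points to the key techniques behind TensorFlow performance optimization.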