
Rust Workbook: Building a Natural-Language Math Calculator

news 2025/11/10 8:59:05

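With artificial intelligence and natural language processing advancing quickly, getting computers to understand human language and act on it has become an important problem. This post looks at Wordy, a fun and challenging exercise that asks us to build a program that parses math questions written in natural language and computes the answer. It tests programming skill and gives hands-on practice with language parsing.

Problem Background

Wordy starts from a simple idea: if we ask a computer a math question in natural language, can it understand the question and give the right answer? The idea has many real-world applications:

Intelligent assistants: voice assistants such as Siri and Alexa need to understand users' natural-language commands
Educational software: math-learning apps can accept questions phrased in natural language
Chatbots: customer-service systems need to understand and handle user queries
Programming language design: some languages aim for syntax closer to natural language

Through this exercise we will learn how to build a simple natural-language parser that recognizes arithmetic operations and computes the result.

Problem Description

Our task is to implement the following function: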

pub struct WordProblem;

pub fn answer(command: &str) -> Option<i32> {
    unimplemented!(
        "Return the result of the command '{}' or None, if the command is invalid.",
        command
    );
}

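The function takes a math question written in natural language as a string. It returns the computed result if the question is valid, and None otherwise.

Based on the test cases, we need to support the following:

Basic arithmetic: addition, subtraction, multiplication, division
Negative numbers
Chained operations (for example, 1 plus 2 plus 3)
Left-to-right evaluation: the tests ignore conventional operator precedence, so "-3 plus 7 multiplied by -2" is (-3 + 7) × -2 = -8
Error handling: invalid questions, syntax errors, and so on
Optional support for exponentiation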

Solution

Let's implement a complete solution to the Wordy problem:

pub fn answer(command: &str) -> Option<i32> {
    // The question must start with "What is " and end with "?"
    if !command.starts_with("What is ") || !command.ends_with('?') {
        return None;
    }

    // Extract the body of the question ("What is " is 8 bytes long)
    let question = &command[8..command.len() - 1];

    // Tokenize on whitespace
    let tokens: Vec<&str> = question.split_whitespace().collect();
    if tokens.is_empty() {
        return None;
    }

    // Parse and evaluate the expression
    parse_expression(&tokens)
}

// Evaluates strictly left to right, with no operator precedence
fn parse_expression(tokens: &[&str]) -> Option<i32> {
    let mut index = 0;
    let mut result = parse_number(tokens, &mut index)?;

    while index < tokens.len() {
        let operator = parse_operator(tokens, &mut index)?;
        let operand = parse_number(tokens, &mut index)?;
        // Checked arithmetic turns overflow and division by zero into None
        result = match operator {
            "+" => result.checked_add(operand)?,
            "-" => result.checked_sub(operand)?,
            "*" => result.checked_mul(operand)?,
            "/" => result.checked_div(operand)?,
            _ => return None,
        };
    }

    Some(result)
}

fn parse_number(tokens: &[&str], index: &mut usize) -> Option<i32> {
    if *index >= tokens.len() {
        return None;
    }
    let token = tokens[*index];
    *index += 1;

    // i32's parser already accepts a leading '-', so negatives need no special case
    token.parse::<i32>().ok()
}

// The return type is &'static str: with several reference parameters,
// an elided return lifetime would not compile
fn parse_operator(tokens: &[&str], index: &mut usize) -> Option<&'static str> {
    if *index >= tokens.len() {
        return None;
    }
    let token = tokens[*index];
    *index += 1;

    match token {
        "plus" => Some("+"),
        "minus" => Some("-"),
        "multiplied" => {
            // "multiplied" must be followed by "by"
            if *index < tokens.len() && tokens[*index] == "by" {
                *index += 1;
                Some("*")
            } else {
                None
            }
        }
        "divided" => {
            // "divided" must be followed by "by"
            if *index < tokens.len() && tokens[*index] == "by" {
                *index += 1;
                Some("/")
            } else {
                None
            }
        }
        _ => None,
    }
}

An extended version that supports exponentiation:

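#[cfg(feature = "exponentials")]
fn parse_operator_extended(tokens: &[&str], index: &mut usize) -> Option<&'static str> {
    if *index >= tokens.len() {
        return None;
    }
    let token = tokens[*index];
    *index += 1;

    match token {
        "plus" => Some("+"),
        "minus" => Some("-"),
        "multiplied" => {
            if *index < tokens.len() && tokens[*index] == "by" {
                *index += 1;
                Some("*")
            } else {
                None
            }
        }
        "divided" => {
            if *index < tokens.len() && tokens[*index] == "by" {
                *index += 1;
                Some("/")
            } else {
                None
            }
        }
        "raised" => {
            // "raised to the" introduces the exponent; the ordinal and the
            // word "power" are consumed by parse_exponent
            if *index + 1 < tokens.len() && tokens[*index] == "to" && tokens[*index + 1] == "the" {
                *index += 2;
                Some("^")
            } else {
                None
            }
        }
        _ => None,
    }
}

// Parses an ordinal exponent such as "5th" followed by the word "power"
#[cfg(feature = "exponentials")]
fn parse_exponent(tokens: &[&str], index: &mut usize) -> Option<i32> {
    if *index + 1 >= tokens.len() || tokens[*index + 1] != "power" {
        return None;
    }
    let ordinal = tokens[*index];
    *index += 2;
    let digits = ["st", "nd", "rd", "th"]
        .iter()
        .find_map(|&suffix| ordinal.strip_suffix(suffix))?;
    digits.parse::<i32>().ok()
}

#[cfg(feature = "exponentials")]
fn calculate_with_exponent(base: i32, exponent: i32) -> Option<i32> {
    if exponent < 0 {
        return None; // Negative exponents are not supported
    }
    // checked_pow returns None on overflow, so no separate size limit is needed
    base.checked_pow(exponent as u32)
}

// Unlike parse_expression, this variant applies conventional precedence, so it
// evaluates "-3 plus 7 multiplied by -2" as -17 rather than the -8 that the
// left-to-right test expects. answer() must call it for the feature to take effect.
#[cfg(feature = "exponentials")]
fn parse_expression_extended(tokens: &[&str]) -> Option<i32> {
    let mut index = 0;
    let mut values: Vec<i32> = Vec::new();
    let mut operators: Vec<&str> = Vec::new();

    // Parse the first number
    values.push(parse_number(tokens, &mut index)?);

    // Parse the remaining operators and operands
    while index < tokens.len() {
        let operator = parse_operator_extended(tokens, &mut index)?;

        // Exponentiation has the highest precedence, so apply it immediately
        if operator == "^" {
            let exponent = parse_exponent(tokens, &mut index)?;
            let base = values.pop()?;
            values.push(calculate_with_exponent(base, exponent)?);
        } else {
            values.push(parse_number(tokens, &mut index)?);
            operators.push(operator);
        }
    }

    // First pass: multiplication and division
    let mut i = 0;
    while i < operators.len() {
        match operators[i] {
            "*" | "/" => {
                let left = values.remove(i);
                let right = values.remove(i);
                let result = match operators[i] {
                    "*" => left.checked_mul(right)?,
                    "/" => left.checked_div(right)?,
                    _ => unreachable!(),
                };
                values.insert(i, result);
                operators.remove(i);
            }
            _ => i += 1,
        }
    }

    // Second pass: addition and subtraction
    let mut result = values[0];
    for (i, &operator) in operators.iter().enumerate() {
        let operand = values[i + 1];
        result = match operator {
            "+" => result.checked_add(operand)?,
            "-" => result.checked_sub(operand)?,
            _ => unreachable!(),
        };
    }

    Some(result)
}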

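Test Cases in Detail

Looking at the test cases gives a better picture of how the function should behave:

#[test]
fn just_a_number() {
    let command = "What is 5?";
    assert_eq!(Some(5), answer(command));
}

The simplest case: the question is just a number.

#[test]
fn addition() {
    let command = "What is 1 plus 1?";
    assert_eq!(Some(2), answer(command));
}

Basic addition.

#[test]
fn addition_with_negative_numbers() {
    let command = "What is -1 plus -10?";
    assert_eq!(Some(-11), answer(command));
}

Handling negative numbers.

#[test]
fn multiple_additions() {
    let command = "What is 1 plus 1 plus 1?";
    assert_eq!(Some(3), answer(command));
}

Chained operations.

#[test]
fn addition_and_multiplication() {
    let command = "What is -3 plus 7 multiplied by -2?";
    assert_eq!(Some(-8), answer(command));
}

Evaluation order: operations are applied left to right, so this is (-3 + 7) × -2 = -8. Multiplication does not take precedence over addition here; conventional precedence would give -17.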
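
To make the evaluation order in this test concrete, here is a minimal sketch of folding operator/operand pairs from left to right. The function name eval_left_to_right and the use of char operators are illustrative assumptions, not part of the solution above.

```rust
/// Folds (operator, operand) pairs onto `first` strictly left to right.
/// The char operators are an illustrative encoding (an assumption of this sketch).
fn eval_left_to_right(first: i32, steps: &[(char, i32)]) -> Option<i32> {
    steps.iter().try_fold(first, |acc, &(op, rhs)| match op {
        '+' => acc.checked_add(rhs),
        '-' => acc.checked_sub(rhs),
        '*' => acc.checked_mul(rhs),
        '/' => acc.checked_div(rhs),
        _ => None, // unknown operator
    })
}
```

With this sketch, eval_left_to_right(-3, &[('+', 7), ('*', -2)]) first computes -3 + 7 = 4 and then 4 * -2, giving Some(-8), which matches the test.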

#[test]
fn unknown_operation() {
    let command = "What is 52 cubed?";
    assert_eq!(None, answer(command));
}

An unrecognized operation should return None.

#[test]
fn reject_problem_missing_an_operand() {
    let command = "What is 1 plus?";
    assert_eq!(None, answer(command));
}

A syntactically incomplete expression should return None.

#[test]
#[cfg(feature = "exponentials")]
fn exponential() {
    let command = "What is 2 raised to the 5th power?";
    assert_eq!(Some(32), answer(command));
}

Exponentiation support (requires enabling the feature).

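Rust Language Features Used

This implementation draws on several Rust language features:

Option type: Option represents operations that may fail
String handling: methods such as starts_with, ends_with, and split_whitespace process the text
Pattern matching: match expressions handle the different operators
Error propagation: the ? operator simplifies error handling
Overflow checking: checked_add, checked_sub, and related methods guard against integer overflow
Conditional compilation: #[cfg(feature = "exponentials")] enables optional functionality
Vector operations: Vec stores the values and operators
References and slices: string slices and array slices are handled correctly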
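
As a small illustration of how Option, the ? operator, and the checked arithmetic methods fit together, consider the sketch below. The helper names apply and combine are hypothetical and exist only for this example.

```rust
/// Applies one symbolic operator to two operands.
/// Returns None on overflow, division by zero, or an unknown operator.
fn apply(op: &str, left: i32, right: i32) -> Option<i32> {
    match op {
        "+" => left.checked_add(right),
        "-" => left.checked_sub(right),
        "*" => left.checked_mul(right),
        "/" => left.checked_div(right),
        _ => None,
    }
}

/// Parses two number strings and combines them.
/// Any parse failure short-circuits to None through `?`.
fn combine(a: &str, op: &str, b: &str) -> Option<i32> {
    let left = a.parse::<i32>().ok()?;
    let right = b.parse::<i32>().ok()?;
    apply(op, left, right)
}
```

Because every step returns an Option, a bad number, an unknown operator, an overflow, and a division by zero all collapse into the same None result without any explicit error branches.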

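A Closer Look at the Algorithm

Lexical Analysis

Our parser starts with lexical analysis, breaking the input string into tokens:

Validate the input format
Extract the body of the question
Split it into words on whitespace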
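
The three steps above can be sketched as one small function. This is a minimal sketch rather than the article's exact code: the name tokenize is an assumption, and it uses strip_prefix and strip_suffix in place of manual slicing.

```rust
/// Returns the words between "What is " and the trailing "?".
/// Yields None if the input lacks that shape or contains no words.
fn tokenize(command: &str) -> Option<Vec<&str>> {
    // Steps 1 and 2: validate the frame and extract the body in one go
    let body = command.strip_prefix("What is ")?.strip_suffix('?')?;

    // Step 3: split on whitespace
    let tokens: Vec<&str> = body.split_whitespace().collect();
    if tokens.is_empty() {
        None
    } else {
        Some(tokens)
    }
}
```

Using strip_prefix and strip_suffix avoids the hard-coded byte offset 8 and cannot slice out of bounds, since each call returns None when the expected text is missing.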

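Syntax Analysis

Next comes syntax analysis, which identifies numbers and operators:

Parse numbers (including negatives)
Recognize operators and their extra words (such as "multiplied by")
Build the expression structure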
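
One way to picture the "expression structure" is as a first number followed by operator/operand pairs. The sketch below is an assumption-laden illustration: the Op enum, the Expr struct, and the parse function are hypothetical names, whereas the solution above uses string operators and evaluates as it goes.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Add,
    Sub,
    Mul,
    Div,
}

/// A first operand followed by (operator, operand) pairs, in source order.
#[derive(Debug, PartialEq)]
struct Expr {
    first: i32,
    rest: Vec<(Op, i32)>,
}

/// Turns words such as ["6", "divided", "by", "-3"] into an Expr.
fn parse(words: &[&str]) -> Option<Expr> {
    let mut iter = words.iter().copied();
    let first = iter.next()?.parse::<i32>().ok()?;
    let mut rest = Vec::new();

    while let Some(word) = iter.next() {
        let op = match word {
            "plus" => Op::Add,
            "minus" => Op::Sub,
            "multiplied" | "divided" => {
                // Both operators require the extra word "by"
                if iter.next()? != "by" {
                    return None;
                }
                if word == "multiplied" { Op::Mul } else { Op::Div }
            }
            _ => return None,
        };
        // Every operator must be followed by a number
        let operand = iter.next()?.parse::<i32>().ok()?;
        rest.push((op, operand));
    }

    Some(Expr { first, rest })
}
```

Separating parsing from evaluation like this makes it easy to swap the evaluation strategy later, for example left to right versus precedence-aware.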

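Operator Precedence

There are two common ways to handle operator precedence:

Recursive descent parsing: precedence is expressed through the hierarchy of function calls
Shunting-yard algorithm: stacks are used to resolve precedence

Our implementation uses a simplified approach: the basic version evaluates strictly left to right, which is what the tests expect, while the extended version applies multiplication and division in a first pass before addition and subtraction.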
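
For comparison, here is a minimal recursive descent sketch in which the call hierarchy (expr calls term, term calls number) encodes conventional precedence. The Tok enum and all function names are assumptions made for this example, and the input is taken to be already tokenized.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Num(i32),
    Add,
    Sub,
    Mul,
    Div,
}

/// expr := term (('+' | '-') term)*
fn expr(toks: &[Tok], pos: &mut usize) -> Option<i32> {
    let mut acc = term(toks, pos)?;
    while let Some(&op) = toks.get(*pos) {
        match op {
            Tok::Add => {
                *pos += 1;
                acc = acc.checked_add(term(toks, pos)?)?;
            }
            Tok::Sub => {
                *pos += 1;
                acc = acc.checked_sub(term(toks, pos)?)?;
            }
            _ => break,
        }
    }
    Some(acc)
}

/// term := number (('*' | '/') number)*
fn term(toks: &[Tok], pos: &mut usize) -> Option<i32> {
    let mut acc = number(toks, pos)?;
    while let Some(&op) = toks.get(*pos) {
        match op {
            Tok::Mul => {
                *pos += 1;
                acc = acc.checked_mul(number(toks, pos)?)?;
            }
            Tok::Div => {
                *pos += 1;
                acc = acc.checked_div(number(toks, pos)?)?;
            }
            _ => break,
        }
    }
    Some(acc)
}

/// number := an integer literal token
fn number(toks: &[Tok], pos: &mut usize) -> Option<i32> {
    match toks.get(*pos) {
        Some(&Tok::Num(n)) => {
            *pos += 1;
            Some(n)
        }
        _ => None,
    }
}

/// Evaluates the whole token list, rejecting any trailing tokens.
fn eval_with_precedence(toks: &[Tok]) -> Option<i32> {
    let mut pos = 0;
    let value = expr(toks, &mut pos)?;
    if pos == toks.len() { Some(value) } else { None }
}
```

Under this grammar, the tokens for -3 + 7 * -2 evaluate to Some(-17), because term consumes 7 * -2 before expr performs the addition. That differs from the -8 produced by left-to-right evaluation, so this style only fits a problem variant that actually wants conventional precedence.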

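Practical Applications

The Wordy problem has applications in many practical settings:

Voice assistants: understanding users' math questions
Educational software: math practice and testing systems
Programming language interpreters: building simple expression parsers
Chatbots: handling math-related user queries
Calculator apps: calculators that accept natural-language input
Data analysis tools: evaluating expressions quickly

Extensions

We can add more features to this system: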

// Support for parentheses
fn parse_expression_with_parentheses(tokens: &[&str]) -> Option<i32> {
    // Implement expression parsing with parentheses here
    // ...
    todo!()
}

// Support for more math functions
fn parse_function(tokens: &[&str], index: &mut usize) -> Option<&'static str> {
    if *index >= tokens.len() {
        return None;
    }
    let token = tokens[*index];
    *index += 1;

    match token {
        "plus" | "minus" | "multiplied" | "divided" => {
            // The existing operators
            // ...
            todo!()
        }
        "sqrt" => Some("sqrt"),
        "abs" => Some("abs"),
        _ => None,
    }
}

// Support for variables
use std::collections::HashMap;

struct Calculator {
    variables: HashMap<String, i32>,
}

impl Calculator {
    fn new() -> Self {
        Self {
            variables: HashMap::new(),
        }
    }

    fn set_variable(&mut self, name: &str, value: i32) {
        self.variables.insert(name.to_string(), value);
    }

    fn evaluate(&self, expression: &str) -> Option<i32> {
        // Implement expression evaluation with variables here
        // ...
        todo!()
    }
}
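
One building block that variable support would likely need is resolving a single token to a value. The sketch below shows only that step, under the assumption that a token is tried as an integer literal first and as a variable name second. The resolve method is hypothetical and is not the elided evaluate body.

```rust
use std::collections::HashMap;

/// Hypothetical calculator that can look up operands in a variable table.
struct Calculator {
    variables: HashMap<String, i32>,
}

impl Calculator {
    fn new() -> Self {
        Self {
            variables: HashMap::new(),
        }
    }

    fn set_variable(&mut self, name: &str, value: i32) {
        self.variables.insert(name.to_string(), value);
    }

    /// Treats the token as an integer literal first, then as a variable name.
    /// Returns None if it is neither.
    fn resolve(&self, token: &str) -> Option<i32> {
        token
            .parse::<i32>()
            .ok()
            .or_else(|| self.variables.get(token).copied())
    }
}
```

A number parser like parse_number could call such a method in place of a bare token.parse, leaving the rest of the evaluation loop unchanged.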

Performance Optimization

For heavy use, we can consider the following optimization:

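// Cache results to avoid parsing the same question twice
// (lazy_static is an external crate and must be listed in Cargo.toml)
use std::collections::HashMap;
use std::sync::Mutex;

lazy_static::lazy_static! {
    static ref CACHE: Mutex<HashMap<String, Option<i32>>> = Mutex::new(HashMap::new());
}

pub fn answer_cached(command: &str) -> Option<i32> {
    // Fast path: return a cached result if one exists
    {
        let cache = CACHE.lock().unwrap();
        if let Some(result) = cache.get(command) {
            return *result;
        }
    }

    // Compute outside the lock, then store the result
    let result = answer(command);

    {
        let mut cache = CACHE.lock().unwrap();
        cache.insert(command.to_string(), result);
    }

    result
}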
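
If adding a dependency is undesirable, the same caching idea can be sketched with the standard library alone using std::sync::OnceLock (stable since Rust 1.70). In this sketch, answer_stub is a hypothetical stand-in for the real answer function so that the block is self-contained, and the cache helper is likewise an assumption.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for the real parser: it only understands "What is <n>?".
fn answer_stub(command: &str) -> Option<i32> {
    command
        .strip_prefix("What is ")?
        .strip_suffix('?')?
        .trim()
        .parse::<i32>()
        .ok()
}

/// Lazily initialized global cache, built on first use.
fn cache() -> &'static Mutex<HashMap<String, Option<i32>>> {
    static CACHE: OnceLock<Mutex<HashMap<String, Option<i32>>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

pub fn answer_cached(command: &str) -> Option<i32> {
    // For simplicity this sketch holds the lock while computing
    let mut guard = cache().lock().unwrap();
    if let Some(&hit) = guard.get(command) {
        return hit;
    }
    let result = answer_stub(command);
    guard.insert(command.to_string(), result);
    result
}
```

Holding the lock during the computation keeps the sketch short but serializes callers. Releasing it before computing trades that for the possibility of two threads computing the same entry. Invalid questions are cached as None too, so repeated bad input is also answered from the cache.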

Comparison with Other Implementations

Python Implementation

def answer(command):
    if not command.startswith("What is ") or not command.endswith('?'):
        return None

    question = command[8:-1]
    tokens = question.split()
    if not tokens:
        return None

    # Simplified expression evaluation
    # ...

JavaScript Implementation

function answer(command) {
    if (!command.startsWith("What is ") || !command.endsWith('?')) {
        return null;
    }

    const question = command.substring(8, command.length - 1);
    const tokens = question.split(/\s+/);

    // Expression parsing and evaluation
    // ...
}