
Rust Workbook: Pig Latin and Language Games

news 2025/11/9 6:41:50

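Pig Latin is an English language game that alters the spelling of words to create a kind of "secret language". It is popular in English-speaking countries, especially among children. In Exercism's "pig-latin" exercise, we implement a function that converts English text into Pig Latin. The exercise is good practice for string processing and pattern matching, and it touches on character handling, simple lexical analysis, and text transformation in Rust.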

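What is Pig Latin?

The game follows these rules:

Words that begin with a vowel: append "ay" to the end of the word
- apple → appleay
- egg → eggay
Words that begin with a consonant: move the leading consonant (or consonant cluster) to the end of the word, then append "ay"
- pig → igpay
- school → oolschay
Words that begin with "qu", or with consonants followed by "qu": move the "qu" to the end together with any consonants before it
- queen → eenquay
- square → aresquay
Special cases:
- Words that begin with "xr" or "yt" are treated as beginning with a vowel
- A "y" that follows a consonant cluster is treated as a vowel (rhythm → ythmrhay)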
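
To make the rule order concrete, here is a minimal sketch that only reports which rule applies to a word, without translating it. The `Rule` enum, `is_vowel`, and `classify` are hypothetical names introduced for illustration, and the sketch assumes lowercase ASCII words.

```rust
/// Which Pig Latin rule applies to a word (illustrative names only).
#[derive(Debug, PartialEq)]
enum Rule {
    VowelStart,       // apple, egg
    XrOrYtStart,      // xray, yttria
    ConsonantsThenQu, // queen, square
    ConsonantsThenY,  // rhythm, my
    ConsonantStart,   // pig, school
}

fn is_vowel(c: char) -> bool {
    matches!(c, 'a' | 'e' | 'i' | 'o' | 'u')
}

/// Assumes a lowercase ASCII word.
fn classify(word: &str) -> Rule {
    if word.starts_with(is_vowel) {
        return Rule::VowelStart;
    }
    if word.starts_with("xr") || word.starts_with("yt") {
        return Rule::XrOrYtStart;
    }
    // Byte offset of the first vowel; 'y' counts unless it is the first letter.
    let stop = word
        .char_indices()
        .find(|&(i, c)| is_vowel(c) || (c == 'y' && i > 0))
        .map(|(i, _)| i)
        .unwrap_or(word.len());
    let (cluster, rest) = word.split_at(stop);
    if cluster.ends_with('q') && rest.starts_with('u') {
        Rule::ConsonantsThenQu
    } else if rest.starts_with('y') {
        Rule::ConsonantsThenY
    } else {
        Rule::ConsonantStart
    }
}
```

Checking the vowel rule first is what keeps a word like "equal" from being affected by the "qu" rule.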

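Let's first look at the function signature the exercise provides:

pub fn translate(input: &str) -> String {
    unimplemented!(
        "Using the Pig Latin text transformation rules, convert the given input '{}'",
        input
    );
}

We need to implement the translate function so that it converts English input text into Pig Latin.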

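Design analysis

1. Core requirements

Vowel recognition: correctly identify the English vowel letters (a, e, i, o, u)
Consonant clusters: handle runs of consecutive consonants
Special rules: handle "qu", "xr", "yt" and similar cases
Phrases: handle input that contains several words

2. Technical points

Character handling: work with characters and substrings without slicing a string in the middle of a character
Pattern matching: use match expressions to select the right transformation rule
Text splitting: split the input into words and deal with whitespace
Edge cases: cover empty input and the special rules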
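
The character-handling point matters because Rust string slices are indexed by byte, and slicing in the middle of a multi-byte character panics. A minimal sketch of splitting a word after a given number of characters, assuming a hypothetical helper named `split_at_char`:

```rust
/// Splits `word` after its first `n` characters (not bytes).
/// Hypothetical helper for illustration.
fn split_at_char(word: &str, n: usize) -> (&str, &str) {
    // `char_indices` yields the byte offset at which each character starts,
    // so the offset of the n-th character is always a valid split point.
    let byte_offset = word
        .char_indices()
        .nth(n)
        .map(|(pos, _)| pos)
        .unwrap_or(word.len()); // n past the end: everything goes left
    word.split_at(byte_offset)
}
```

For pure ASCII input the character index and the byte offset coincide, which is why the byte-based variants later in this article can skip the conversion.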

Complete implementations

1. Basic implementation

pub fn translate(input: &str) -> String {
    // Split the input into words and translate each one.
    input
        .split_whitespace()
        .map(translate_word)
        .collect::<Vec<String>>()
        .join(" ")
}

fn translate_word(word: &str) -> String {
    // Guard against an empty word.
    if word.is_empty() {
        return String::new();
    }

    // Words that start with a vowel sound (including the special rules)
    // only gain the "ay" suffix.
    if starts_with_vowel_sound(word) {
        format!("{}ay", word)
    } else {
        // Find where the leading consonant cluster ends.
        let consonant_cluster_end = find_consonant_cluster_end(word);
        let consonant_cluster = &word[..consonant_cluster_end];
        let rest = &word[consonant_cluster_end..];
        format!("{}{}ay", rest, consonant_cluster)
    }
}

fn starts_with_vowel_sound(word: &str) -> bool {
    let chars: Vec<char> = word.chars().collect();
    if chars.is_empty() {
        return false;
    }

    match chars[0] {
        'a' | 'e' | 'i' | 'o' | 'u' => true,
        // Special cases: "xr" and "yt" are treated as vowel sounds.
        'x' => chars.get(1) == Some(&'r'),
        'y' => chars.get(1) == Some(&'t'),
        _ => false,
    }
}

fn find_consonant_cluster_end(word: &str) -> usize {
    let chars: Vec<char> = word.chars().collect();
    let mut i = 0;

    while i < chars.len() {
        match chars[i] {
            'a' | 'e' | 'i' | 'o' | 'u' => {
                // Special case: a 'u' directly after 'q' belongs to the
                // cluster, because "qu" moves as a unit.
                if chars[i] == 'u' && i > 0 && chars[i - 1] == 'q' {
                    i += 1;
                    continue;
                }
                break;
            }
            'y' => {
                // 'y' after a consonant counts as a vowel.
                if i > 0 {
                    break;
                }
                i += 1;
            }
            _ => i += 1,
        }
    }

    // Slicing needs a byte offset, not a character index.
    char_to_byte_index(word, i)
}

fn char_to_byte_index(word: &str, char_index: usize) -> usize {
    word.char_indices()
        .nth(char_index)
        .map(|(pos, _)| pos)
        .unwrap_or(word.len())
}

2. Optimized implementation

pub fn translate(input: &str) -> String {
    input
        .split_whitespace()
        .map(translate_word)
        .collect::<Vec<String>>()
        .join(" ")
}

fn translate_word(word: &str) -> String {
    if word.is_empty() {
        return String::new();
    }

    let consonant_cluster_end = find_consonant_cluster_end(word);
    if consonant_cluster_end == 0 {
        // Starts with a vowel sound.
        format!("{}ay", word)
    } else {
        // Starts with a consonant.
        let (consonant_cluster, rest) = word.split_at(consonant_cluster_end);
        format!("{}{}ay", rest, consonant_cluster)
    }
}

fn find_consonant_cluster_end(word: &str) -> usize {
    let chars: Vec<char> = word.chars().collect();
    let mut i = 0;

    // Special rule: words starting with "xr" or "yt" count as vowel-initial.
    if chars.len() >= 2
        && ((chars[0] == 'x' && chars[1] == 'r') || (chars[0] == 'y' && chars[1] == 't'))
    {
        return 0;
    }

    while i < chars.len() {
        match chars[i] {
            'a' | 'e' | 'i' | 'o' | 'u' => {
                break;
            }
            'y' => {
                // 'y' is a vowel unless it is the first letter.
                if i > 0 {
                    break;
                }
                i += 1;
            }
            'q' => {
                // Treat "qu" as a unit.
                if i + 1 < chars.len() && chars[i + 1] == 'u' {
                    i += 2; // skip past "qu"
                    break;
                }
                i += 1;
            }
            _ => i += 1,
        }
    }

    // Convert the character index into a byte offset.
    word.char_indices()
        .nth(i)
        .map(|(pos, _)| pos)
        .unwrap_or(word.len())
}

3. Functional implementation

pub fn translate(input: &str) -> String {
    input
        .split_whitespace()
        .map(|word| {
            let chars: Vec<char> = word.chars().collect();
            if chars.is_empty() {
                return String::new();
            }

            // Special rule: words starting with "xr" or "yt".
            if chars.len() >= 2
                && ((chars[0] == 'x' && chars[1] == 'r') || (chars[0] == 'y' && chars[1] == 't'))
            {
                return format!("{}ay", word);
            }

            // Find the character index at which the consonant cluster ends.
            let vowel_index = chars
                .iter()
                .enumerate()
                .find(|&(i, &c)| match c {
                    'a' | 'e' | 'i' | 'o' | 'u' => true,
                    'y' => i > 0, // 'y' is a vowel unless it is the first letter
                    // "qu" also ends the search; the index is adjusted below.
                    'q' => i + 1 < chars.len() && chars[i + 1] == 'u',
                    _ => false,
                })
                .map(|(index, &c)| {
                    // A 'q' is only found when followed by 'u', so move both.
                    if c == 'q' {
                        index + 2
                    } else {
                        index
                    }
                })
                .unwrap_or(chars.len());

            if vowel_index == 0 {
                format!("{}ay", word)
            } else {
                // Convert the character index into a byte offset before splitting.
                let split = word
                    .char_indices()
                    .nth(vowel_index)
                    .map(|(pos, _)| pos)
                    .unwrap_or(word.len());
                let (consonant_cluster, rest) = word.split_at(split);
                format!("{}{}ay", rest, consonant_cluster)
            }
        })
        .collect::<Vec<String>>()
        .join(" ")
}

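Test case analysis

Looking at the test cases gives a clearer picture of the requirements. In the snippets below, pl is the name under which the tests refer to the solution crate.

#[test]
fn test_word_beginning_with_a() {
    assert_eq!(pl::translate("apple"), "appleay");
}

A word beginning with the vowel "a" gets "ay" appended.

#[test]
fn test_word_beginning_with_e() {
    assert_eq!(pl::translate("ear"), "earay");
}

A word beginning with the vowel "e" gets "ay" appended.

#[test]
fn test_word_beginning_with_i() {
    assert_eq!(pl::translate("igloo"), "iglooay");
}

A word beginning with the vowel "i" gets "ay" appended.

#[test]
fn test_word_beginning_with_o() {
    assert_eq!(pl::translate("object"), "objectay");
}

A word beginning with the vowel "o" gets "ay" appended.

#[test]
fn test_word_beginning_with_u() {
    assert_eq!(pl::translate("under"), "underay");
}

A word beginning with the vowel "u" gets "ay" appended.

#[test]
fn test_word_beginning_with_a_vowel_and_followed_by_a_qu() {
    assert_eq!(pl::translate("equal"), "equalay");
}

A word that begins with a vowel follows the vowel rule even if it contains "qu".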

#[test]
fn test_word_beginning_with_p() {
    assert_eq!(pl::translate("pig"), "igpay");
}

A word beginning with a consonant moves that consonant to the end and appends "ay".

#[test]
fn test_word_beginning_with_k() {
    assert_eq!(pl::translate("koala"), "oalakay");
}

The same rule applies to a word beginning with the consonant "k".

#[test]
fn test_word_beginning_with_y() {
    assert_eq!(pl::translate("yellow"), "ellowyay");
}

A leading "y" is treated as a consonant.

#[test]
fn test_word_beginning_with_x() {
    assert_eq!(pl::translate("xenon"), "enonxay");
}

A leading "x" is treated as a consonant.

#[test]
fn test_word_beginning_with_q_without_a_following_u() {
    assert_eq!(pl::translate("qat"), "atqay");
}

A "q" that is not followed by "u" is handled as an ordinary consonant.

#[test]
fn test_word_beginning_with_ch() {
    assert_eq!(pl::translate("chair"), "airchay");
}

A word beginning with the cluster "ch" moves the whole cluster to the end.

#[test]
fn test_word_beginning_with_qu() {
    assert_eq!(pl::translate("queen"), "eenquay");
}

A word beginning with "qu" moves "qu" to the end as a unit.

#[test]
fn test_word_beginning_with_qu_and_a_preceding_consonant() {
    assert_eq!(pl::translate("square"), "aresquay");
}

When a consonant precedes "qu", the consonant and "qu" move to the end together.

#[test]
fn test_word_beginning_with_th() {
    assert_eq!(pl::translate("therapy"), "erapythay");
}

A word beginning with the cluster "th" moves the whole cluster to the end.

#[test]
fn test_word_beginning_with_thr() {
    assert_eq!(pl::translate("thrush"), "ushthray");
}

A word beginning with the cluster "thr" moves the whole cluster to the end.

#[test]
fn test_word_beginning_with_sch() {
    assert_eq!(pl::translate("school"), "oolschay");
}

A word beginning with the cluster "sch" moves the whole cluster to the end.

#[test]
fn test_word_beginning_with_yt() {
    assert_eq!(pl::translate("yttria"), "yttriaay");
}

A word beginning with "yt" is treated as beginning with a vowel.

#[test]
fn test_word_beginning_with_xr() {
    assert_eq!(pl::translate("xray"), "xrayay");
}

A word beginning with "xr" is treated as beginning with a vowel.

#[test]
fn test_y_is_treated_like_a_vowel_at_the_end_of_a_consonant_cluster() {
    assert_eq!(pl::translate("rhythm"), "ythmrhay");
}

A "y" that follows a consonant cluster is treated as a vowel.

#[test]
fn test_a_whole_phrase() {
    assert_eq!(pl::translate("quick fast run"), "ickquay astfay unray");
}

Each word in a phrase is translated independently.

Performance-oriented version

An implementation that pays more attention to allocation and byte-level access (it assumes ASCII input, because it indexes bytes directly):

pub fn translate(input: &str) -> String {
    // Pre-allocate the result to reduce reallocations.
    let mut result = String::with_capacity(input.len() + input.len() / 4);
    let mut first_word = true;

    for word in input.split_whitespace() {
        if !first_word {
            result.push(' ');
        } else {
            first_word = false;
        }
        translate_word_in_place(word, &mut result);
    }

    result
}

fn translate_word_in_place(word: &str, result: &mut String) {
    if word.is_empty() {
        return;
    }

    let consonant_cluster_end = find_consonant_cluster_end_optimized(word);
    if consonant_cluster_end == 0 {
        // Starts with a vowel sound.
        result.push_str(word);
        result.push_str("ay");
    } else {
        // Starts with a consonant.
        let consonant_cluster = &word[..consonant_cluster_end];
        let rest = &word[consonant_cluster_end..];
        result.push_str(rest);
        result.push_str(consonant_cluster);
        result.push_str("ay");
    }
}

fn find_consonant_cluster_end_optimized(word: &str) -> usize {
    let bytes = word.as_bytes();
    let mut i = 0;

    // Special rule: words starting with "xr" or "yt".
    if bytes.len() >= 2
        && ((bytes[0] == b'x' && bytes[1] == b'r') || (bytes[0] == b'y' && bytes[1] == b't'))
    {
        return 0;
    }

    while i < bytes.len() {
        match bytes[i] {
            b'a' | b'e' | b'i' | b'o' | b'u' => {
                break;
            }
            b'y' => {
                // 'y' is a vowel unless it is the first letter.
                if i > 0 {
                    break;
                }
                i += 1;
            }
            b'q' => {
                // Treat "qu" as a unit.
                if i + 1 < bytes.len() && bytes[i + 1] == b'u' {
                    i += 2; // skip past "qu"
                    break;
                }
                i += 1;
            }
            _ => i += 1,
        }
    }

    i
}

// Variant that uses a lookup table
static VOWELS: [bool; 256] = {
    let mut array = [false; 256];
    array[b'a' as usize] = true;
    array[b'e' as usize] = true;
    array[b'i' as usize] = true;
    array[b'o' as usize] = true;
    array[b'u' as usize] = true;
    array
};

fn find_consonant_cluster_end_lookup(word: &str) -> usize {
    let bytes = word.as_bytes();
    let mut i = 0;

    // Special rule: words starting with "xr" or "yt".
    if bytes.len() >= 2
        && ((bytes[0] == b'x' && bytes[1] == b'r') || (bytes[0] == b'y' && bytes[1] == b't'))
    {
        return 0;
    }

    while i < bytes.len() {
        let byte = bytes[i];
        if VOWELS[byte as usize] {
            break;
        }
        match byte {
            b'y' => {
                // 'y' is a vowel unless it is the first letter.
                if i > 0 {
                    break;
                }
                i += 1;
            }
            b'q' => {
                // Treat "qu" as a unit.
                if i + 1 < bytes.len() && bytes[i + 1] == b'u' {
                    i += 2; // skip past "qu"
                    break;
                }
                i += 1;
            }
            _ => i += 1,
        }
    }

    i
}
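
The capacity estimate `input.len() + input.len() / 4` used above is a heuristic and can under-reserve when the input consists of many short words ("a b c" becomes "aay bay cay", more than twice as long). Since each word keeps its letters and gains exactly two bytes, the output size can be computed exactly. A minimal sketch, where `translated_len` is a hypothetical helper and words are assumed to be joined by single spaces as in the implementation above:

```rust
/// Exact byte length of the Pig Latin output for `input`.
/// Hypothetical helper; assumes words are re-joined with single spaces.
fn translated_len(input: &str) -> usize {
    let mut words = 0usize;
    let mut letters = 0usize;
    for word in input.split_whitespace() {
        words += 1;
        letters += word.len();
    }
    // Original bytes + "ay" per word + one space between adjacent words.
    letters + 2 * words + words.saturating_sub(1)
}
```

The result could be passed to `String::with_capacity` at the cost of one extra pass over the input; whether that is a net win would need measuring.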

Error handling and edge cases

An implementation that considers more edge cases:

pub fn translate(input: &str) -> String {
    // Handle empty input.
    if input.is_empty() {
        return String::new();
    }

    input
        .split_whitespace()
        .map(translate_word_safe)
        .collect::<Vec<String>>()
        .join(" ")
}

fn translate_word_safe(word: &str) -> String {
    // Handle an empty word.
    if word.is_empty() {
        return String::new();
    }

    // Leave tokens that contain no letters (pure punctuation) unchanged.
    if !word.chars().any(|c| c.is_alphabetic()) {
        return word.to_string();
    }

    // Separate the word from its trailing punctuation.
    let (clean_word, punctuation) = split_word_and_punctuation(word);
    if clean_word.is_empty() {
        return word.to_string();
    }

    let translated = translate_clean_word(&clean_word);
    format!("{}{}", translated, punctuation)
}

fn split_word_and_punctuation(word: &str) -> (String, String) {
    let chars: Vec<char> = word.chars().collect();
    let mut split_index = chars.len();

    // Scan from the end for the last alphabetic character.
    for (i, &c) in chars.iter().enumerate().rev() {
        if c.is_alphabetic() {
            split_index = i + 1;
            break;
        }
    }

    let clean_word: String = chars[..split_index].iter().collect();
    let punctuation: String = chars[split_index..].iter().collect();
    (clean_word, punctuation)
}

fn translate_clean_word(word: &str) -> String {
    let consonant_cluster_end = find_consonant_cluster_end(word);
    if consonant_cluster_end == 0 {
        format!("{}ay", word)
    } else {
        let (consonant_cluster, rest) = word.split_at(consonant_cluster_end);
        format!("{}{}ay", rest, consonant_cluster)
    }
}

fn find_consonant_cluster_end(word: &str) -> usize {
    let chars: Vec<char> = word.chars().collect();

    // Special rule: words starting with "xr" or "yt".
    if chars.len() >= 2
        && ((chars[0] == 'x' && chars[1] == 'r') || (chars[0] == 'y' && chars[1] == 't'))
    {
        return 0;
    }

    let mut i = 0;
    while i < chars.len() {
        match chars[i] {
            'a' | 'e' | 'i' | 'o' | 'u' => {
                break;
            }
            'y' => {
                // 'y' is a vowel unless it is the first letter.
                if i > 0 {
                    break;
                }
                i += 1;
            }
            'q' => {
                // Treat "qu" as a unit.
                if i + 1 < chars.len() && chars[i + 1] == 'u' {
                    i += 2; // skip past "qu"
                    break;
                }
                i += 1;
            }
            _ => i += 1,
        }
    }

    // Convert the character index into a byte offset.
    word.char_indices()
        .nth(i)
        .map(|(pos, _)| pos)
        .unwrap_or(word.len())
}

// Variant that returns a Result
#[derive(Debug, PartialEq)]
pub enum TranslationError {
    EmptyInput,
    InvalidCharacter,
}

impl std::fmt::Display for TranslationError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        match self {
            TranslationError::EmptyInput => write!(f, "input is empty"),
            TranslationError::InvalidCharacter => write!(f, "input contains invalid characters"),
        }
    }
}

impl std::error::Error for TranslationError {}

pub fn translate_safe(input: &str) -> Result<String, TranslationError> {
    // Handle empty input.
    if input.is_empty() {
        return Err(TranslationError::EmptyInput);
    }

    // Simplified check for non-ASCII input. Nothing is rejected yet, so
    // InvalidCharacter is currently never returned.
    if !input.is_ascii() {
        // A stricter check could be added here.
    }

    Ok(translate(input))
}
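
Note that `split_word_and_punctuation` above only separates trailing punctuation, so a token such as "(hello)" would have its opening parenthesis treated as part of the consonant cluster. A minimal sketch of splitting off punctuation on both sides without allocating; `split_affixes` is a hypothetical name, not part of the exercise:

```rust
/// Splits `word` into (leading punctuation, core, trailing punctuation).
/// The core runs from the first to the last alphabetic character, so
/// inner apostrophes (as in "don't") stay inside it.
fn split_affixes(word: &str) -> (&str, &str, &str) {
    // Byte offset of the first alphabetic character, if any.
    let start = match word.find(char::is_alphabetic) {
        Some(idx) => idx,
        None => return (word, "", ""), // no letters at all
    };
    // Byte offset just past the last alphabetic character.
    let end = word
        .char_indices()
        .rev()
        .find(|(_, c)| c.is_alphabetic())
        .map(|(idx, c)| idx + c.len_utf8())
        .unwrap_or(word.len());
    (&word[..start], &word[start..end], &word[end..])
}
```

The core could then be translated on its own and the two affixes re-attached around the result.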

Extended features

Building on the basic implementation, we can add more features:

pub struct PigLatinTranslator {
    // Note: stored, but not yet consulted by translate.
    preserve_case: bool,
    preserve_punctuation: bool,
}

impl PigLatinTranslator {
    pub fn new() -> Self {
        PigLatinTranslator {
            preserve_case: true,
            preserve_punctuation: true,
        }
    }

    pub fn with_case_preservation(mut self, preserve: bool) -> Self {
        self.preserve_case = preserve;
        self
    }

    pub fn with_punctuation_preservation(mut self, preserve: bool) -> Self {
        self.preserve_punctuation = preserve;
        self
    }

    pub fn translate(&self, input: &str) -> String {
        if input.is_empty() {
            return String::new();
        }

        input
            .split_whitespace()
            .map(|word| self.translate_word(word))
            .collect::<Vec<String>>()
            .join(" ")
    }

    fn translate_word(&self, word: &str) -> String {
        if word.is_empty() {
            return String::new();
        }

        if self.preserve_punctuation {
            self.translate_word_with_punctuation(word)
        } else {
            self.translate_clean_word(word)
        }
    }

    fn translate_word_with_punctuation(&self, word: &str) -> String {
        // Separate the word from its trailing punctuation.
        let (clean_word, punctuation) = self.split_word_and_punctuation(word);
        if clean_word.is_empty() {
            return word.to_string();
        }

        let translated = self.translate_clean_word(&clean_word);
        format!("{}{}", translated, punctuation)
    }

    fn split_word_and_punctuation(&self, word: &str) -> (String, String) {
        let chars: Vec<char> = word.chars().collect();
        let mut split_index = chars.len();

        // Scan from the end for the last alphabetic character.
        for (i, &c) in chars.iter().enumerate().rev() {
            if c.is_alphabetic() {
                split_index = i + 1;
                break;
            }
        }

        let clean_word: String = chars[..split_index].iter().collect();
        let punctuation: String = chars[split_index..].iter().collect();
        (clean_word, punctuation)
    }

    fn translate_clean_word(&self, word: &str) -> String {
        let consonant_cluster_end = self.find_consonant_cluster_end(word);
        if consonant_cluster_end == 0 {
            format!("{}ay", word)
        } else {
            let (consonant_cluster, rest) = word.split_at(consonant_cluster_end);
            format!("{}{}ay", rest, consonant_cluster)
        }
    }

    fn find_consonant_cluster_end(&self, word: &str) -> usize {
        let chars: Vec<char> = word.chars().collect();

        // Special rule: words starting with "xr" or "yt".
        if chars.len() >= 2
            && ((chars[0] == 'x' && chars[1] == 'r') || (chars[0] == 'y' && chars[1] == 't'))
        {
            return 0;
        }

        let mut i = 0;
        while i < chars.len() {
            match chars[i] {
                'a' | 'e' | 'i' | 'o' | 'u' => {
                    break;
                }
                'y' => {
                    // 'y' is a vowel unless it is the first letter.
                    if i > 0 {
                        break;
                    }
                    i += 1;
                }
                'q' => {
                    // Treat "qu" as a unit.
                    if i + 1 < chars.len() && chars[i + 1] == 'u' {
                        i += 2; // skip past "qu"
                        break;
                    }
                    i += 1;
                }
                _ => i += 1,
            }
        }

        // Convert the character index into a byte offset.
        word.char_indices()
            .nth(i)
            .map(|(pos, _)| pos)
            .unwrap_or(word.len())
    }

    // Reverse translation: from Pig Latin back to English (simplified)
    pub fn reverse_translate(&self, input: &str) -> String {
        input
            .split_whitespace()
            .map(|word| self.reverse_translate_word(word))
            .collect::<Vec<String>>()
            .join(" ")
    }

    fn reverse_translate_word(&self, word: &str) -> String {
        if !word.ends_with("ay") {
            return word.to_string();
        }

        let base_word = &word[..word.len() - 2]; // strip the "ay" suffix

        // Simplified placeholder: both branches only strip "ay". Recovering
        // the original word would need more work, because the length of the
        // moved consonant cluster is not recorded in the output.
        if base_word.ends_with("ay")
            || base_word.ends_with("ey")
            || base_word.ends_with("iy")
            || base_word.ends_with("oy")
            || base_word.ends_with("uy")
        {
            base_word.to_string()
        } else {
            // Simplified consonant handling.
            base_word.to_string()
        }
    }
}

// Word analyzer
pub struct WordAnalysis {
    pub original: String,
    pub translated: String,
    pub starts_with_vowel: bool,
    pub consonant_cluster: String,
    pub cluster_length: usize,
}

impl PigLatinTranslator {
    pub fn analyze_word(&self, word: &str) -> WordAnalysis {
        let translated = self.translate_word(word);
        let consonant_cluster_end = self.find_consonant_cluster_end(word);
        let starts_with_vowel = consonant_cluster_end == 0;
        let consonant_cluster = if starts_with_vowel {
            String::new()
        } else {
            word[..consonant_cluster_end].to_string()
        };
        // Compute the length before consonant_cluster is moved into the struct.
        let cluster_length = consonant_cluster.chars().count();

        WordAnalysis {
            original: word.to_string(),
            translated,
            starts_with_vowel,
            consonant_cluster,
            cluster_length,
        }
    }
}

// Convenience functions
pub fn translate(input: &str) -> String {
    let translator = PigLatinTranslator::new();
    translator.translate(input)
}

pub fn translate_with_options(
    input: &str,
    preserve_case: bool,
    preserve_punctuation: bool,
) -> String {
    let translator = PigLatinTranslator::new()
        .with_case_preservation(preserve_case)
        .with_punctuation_preservation(preserve_punctuation);
    translator.translate(input)
}
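
The `preserve_case` option is stored by the translator above, but the translation path never consults it. One way it might be honored is to translate a lowercased copy of the word and then re-apply the original capitalization pattern. A minimal sketch of that second step; `match_case` is a hypothetical helper, and the three patterns it distinguishes (all caps, capitalized, anything else) are an assumption about what "preserving case" should mean:

```rust
/// Re-applies the capitalization pattern of `original` to `translated`.
/// Hypothetical helper; `translated` is assumed to be lowercase already.
fn match_case(original: &str, translated: &str) -> String {
    let letters: Vec<char> = original.chars().filter(|c| c.is_alphabetic()).collect();
    // Treat a word as "all caps" only if it has more than one letter,
    // so that a lone capital such as "I" counts as capitalized instead.
    let all_upper = letters.len() > 1 && letters.iter().all(|c| c.is_uppercase());
    let first_upper = letters.first().map_or(false, |c| c.is_uppercase());

    if all_upper {
        translated.to_uppercase()
    } else if first_upper {
        // Capitalize the first character and lowercase the remainder.
        let lower = translated.to_lowercase();
        let mut chars = lower.chars();
        match chars.next() {
            Some(first) => first.to_uppercase().chain(chars).collect(),
            None => String::new(),
        }
    } else {
        translated.to_string()
    }
}
```

With this helper, "Pig" would be lowercased to "pig", translated to "igpay", and then restored to "Igpay".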

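Practical applications

Pig Latin can be useful in the following areas:

Educational software: language learning and teaching tools
Children's games: programming and language games for children
Text processing: a simple example of text transformation in natural language processing
Text obfuscation: light, easily reversed obfuscation (not real privacy protection)
Entertainment: playful text converters and social media applications
Linguistics: studies of language games and word transformations
Cryptography teaching: simple encoding and decoding exercises
Programming education: a teaching example for string processing and algorithms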

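Complexity analysis

Time complexity: O(n)
- where n is the number of characters in the input; each character is visited a constant number of times
Space complexity: O(n)
- the translated output string has to be stored (the Vec<char>-based versions also allocate a temporary buffer per word)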

Comparison with other approaches

// Regular-expression approach.
// Note: this pattern covers only the basic vowel and consonant rules;
// it does not implement the "qu", "xr"/"yt", or "y" special cases.
use regex::Regex;

pub fn translate_regex(input: &str) -> String {
    // The non-capturing group makes both \b anchors apply to either alternative.
    let re = Regex::new(
        r"\b(?:([aeiouAEIOU][a-zA-Z]*)|([b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z]+)([a-zA-Z]*))\b",
    )
    .unwrap();

    re.replace_all(input, |caps: &regex::Captures| {
        if let Some(vowel_word) = caps.get(1) {
            // Word that starts with a vowel.
            format!("{}ay", vowel_word.as_str())
        } else {
            // Word that starts with a consonant.
            let consonant_cluster = caps.get(2).unwrap().as_str();
            let rest = caps.get(3).unwrap().as_str();
            format!("{}{}ay", rest, consonant_cluster)
        }
    })
    .to_string()
}

// Approach based on the nom parser library
use nom::{
    character::complete::{alpha1},
    combinator::{recognize, opt},
    sequence::{tuple},
    bytes::complete::tag,
    multi::many0,
    branch::alt,
    IResult,
};

pub fn translate_nom(input: &str) -> String {
    // Placeholder for a nom-based Pig Latin translator.
    // This is only a stub; a real implementation would be more involved.
    unimplemented!()
}

// State-machine approach
#[derive(Debug, Clone, Copy)]
enum TranslationState {
    Start,
    VowelStart,
    ConsonantStart,
    ConsonantCluster,
    ReadingRest,
}

pub fn translate_state_machine(input: &str) -> String {
    let mut result = String::new();
    let mut current_word = String::new();
    let mut in_word = false;

    for c in input.chars() {
        if c.is_whitespace() {
            if in_word {
                // Flush the word collected so far.
                let translated = translate_single_word(&current_word);
                result.push_str(&translated);
                current_word.clear();
                in_word = false;
            }
            // Unlike the split_whitespace versions, whitespace is kept as-is.
            result.push(c);
        } else {
            if !in_word {
                in_word = true;
            }
            current_word.push(c);
        }
    }

    // Flush the last word.
    if in_word {
        let translated = translate_single_word(&current_word);
        result.push_str(&translated);
    }

    result
}

fn translate_single_word(word: &str) -> String {
    // Reuses find_consonant_cluster_end from the earlier implementations.
    let consonant_cluster_end = find_consonant_cluster_end(word);
    if consonant_cluster_end == 0 {
        format!("{}ay", word)
    } else {
        let (consonant_cluster, rest) = word.split_at(consonant_cluster_end);
        format!("{}{}ay", rest, consonant_cluster)
    }
}

// Approach based on a third-party library
// [dependencies]
// inflector = "0.11"
pub fn translate_with_library(input: &str) -> String {
    // Placeholder for text processing with a third-party library.
    // This is only a stub.
    unimplemented!()
}
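
In the state-machine variant above, the `TranslationState` enum is declared but the loop only tracks whether it is inside a word. As a minimal sketch of what a translator driven by explicit per-character states might look like, the following uses its own hypothetical `State` enum and `step` function (neither comes from the exercise) and assumes lowercase ASCII words:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Start,   // nothing consumed yet
    Cluster, // inside the leading consonant cluster
    AfterQ,  // the previous cluster letter was 'q'
    Rest,    // cluster finished; remaining letters are copied through
}

/// One transition: returns the next state and whether `c` joins the cluster.
fn step(state: State, c: char) -> (State, bool) {
    match (state, c) {
        (State::Rest, _) => (State::Rest, false),
        // "qu" moves as a unit, and the cluster ends right after it.
        (State::AfterQ, 'u') => (State::Rest, true),
        (_, 'a' | 'e' | 'i' | 'o' | 'u') => (State::Rest, false),
        // 'y' after at least one consonant acts as a vowel.
        (State::Cluster | State::AfterQ, 'y') => (State::Rest, false),
        (_, 'q') => (State::AfterQ, true),
        _ => (State::Cluster, true),
    }
}

fn translate_word_fsm(word: &str) -> String {
    // "xr" and "yt" count as vowel sounds, so such words start in `Rest`.
    let mut state = if word.starts_with("xr") || word.starts_with("yt") {
        State::Rest
    } else {
        State::Start
    };
    let mut cluster = String::new();
    let mut rest = String::with_capacity(word.len() + 2);

    for c in word.chars() {
        let (next, to_cluster) = step(state, c);
        if to_cluster {
            cluster.push(c);
        } else {
            rest.push(c);
        }
        state = next;
    }

    rest.push_str(&cluster);
    rest.push_str("ay");
    rest
}

pub fn translate_fsm(input: &str) -> String {
    input
        .split_whitespace()
        .map(translate_word_fsm)
        .collect::<Vec<_>>()
        .join(" ")
}
```

Keeping all rules in one transition function makes their order explicit: the "qu" arm must come before the generic vowel arm, otherwise the "u" in "queen" would end the cluster too early.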