
Microsoft Azure Speech Recognition (ASR) Example Demo

Object storage service (OSS) corresponds to Azure Blob Storage

Speech recognition (ASR) corresponds to Azure Speech-to-Text

Speech synthesis (TTS) corresponds to Azure Text-to-Speech

This demo accepts either an uploaded .mp3 file or an OSS address (URL) and returns the text recognized from the audio.

Dependencies

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-webflux</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Microsoft ASR -->
    <dependency>
        <groupId>com.microsoft.cognitiveservices.speech</groupId>
        <artifactId>client-sdk</artifactId>
        <version>1.43.0</version>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>io.projectreactor</groupId>
        <artifactId>reactor-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

Code (configure the key and endpoint in application.properties or in the YAML configuration file)
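The controller reads two properties, azure.speech.key and azure.speech.endpoint. As a rough sketch of what the configuration could look like and how it might be checked at startup, the snippet below parses an example application.properties from a string and fails fast when a value is missing. The class name SpeechSettingsCheck, its methods, and the placeholder values are assumptions made for illustration; only the two property names come from the demo.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the original demo.
final class SpeechSettingsCheck {

    // Example contents of application.properties. The values are placeholders:
    // substitute the key and endpoint of your own Azure Speech resource.
    static final String EXAMPLE = String.join("\n",
            "# application.properties",
            "azure.speech.key=<your-speech-key>",
            "azure.speech.endpoint=https://<your-resource-endpoint>");

    private SpeechSettingsCheck() {
    }

    // Parses properties-format text into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the trimmed value for the key, or throws if it is missing or blank.
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing property: " + key);
        }
        return value.trim();
    }
}
```

Calling SpeechSettingsCheck.require(props, "azure.speech.key") before building the recognizer turns a missing setting into a clear error message instead of a failure inside the SDK call.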

package com.example.microsoftasr.controller;

import com.microsoft.cognitiveservices.speech.*;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

import java.io.File;
import java.net.URI;
import java.nio.file.Files;

@RestController
@RequestMapping("/asr")
public class TestController {

    @Value("${azure.speech.key}")
    private String speechKey;

    @Value("${azure.speech.endpoint}")
    private String speechEndpoint;

    @GetMapping("/hello")
    public String test() {
        return "Hello World";
    }

    @PostMapping("/recognize")
    public String recognize(@RequestParam(value = "file", required = false) MultipartFile file,
                            @RequestParam(value = "url", required = false) String ossUrl) {
        if ((file == null || file.isEmpty()) && (ossUrl == null || ossUrl.isBlank())) {
            return "No audio file or audio URL was provided";
        }

        File tempInput = null;
        File tempWav = null;
        try {
            // 1. Save the original audio to a temporary file
            if (file != null && !file.isEmpty()) {
                String suffix = getSuffix(file.getOriginalFilename());
                tempInput = File.createTempFile("audio-input-", "." + suffix);
                file.transferTo(tempInput);
            } else {
                String suffix = getSuffix(ossUrl);
                tempInput = File.createTempFile("audio-input-", "." + suffix);
                try (var in = new java.net.URL(ossUrl).openStream()) {
                    Files.copy(in, tempInput.toPath(), java.nio.file.StandardCopyOption.REPLACE_EXISTING);
                }
            }

            // 2. Convert to WAV (16 kHz, mono); input that is already WAV is copied unchanged
            tempWav = File.createTempFile("audio-output-", ".wav");
            if (!getSuffix(tempInput.getName()).equalsIgnoreCase("wav")) {
                // Location of ffmpeg on the author's machine; adjust it for your environment
                ProcessBuilder pb = new ProcessBuilder(
                        "F:\\ffmpeg-7.1.1-full_build\\ffmpeg-7.1.1-full_build\\bin\\ffmpeg.exe", "-y",
                        "-i", tempInput.getAbsolutePath(),
                        "-ar", "16000",
                        "-ac", "1",
                        tempWav.getAbsolutePath());
                Process process = pb.inheritIO().start();
                int exitCode = process.waitFor();
                if (exitCode != 0) return "ffmpeg conversion failed, exitCode=" + exitCode;
            } else {
                Files.copy(tempInput.toPath(), tempWav.toPath(), java.nio.file.StandardCopyOption.REPLACE_EXISTING);
            }

            // 3. Call Microsoft ASR for recognition
            SpeechConfig speechConfig = SpeechConfig.fromEndpoint(new URI(speechEndpoint), speechKey);
            speechConfig.setSpeechRecognitionLanguage("zh-CN");
            try (AudioConfig audioConfig = AudioConfig.fromWavFileInput(tempWav.getAbsolutePath());
                 SpeechRecognizer recognizer = new SpeechRecognizer(speechConfig, audioConfig)) {
                // recognizeOnceAsync returns after a single utterance, so long audio is only partly transcribed
                SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
                if (result.getReason() == ResultReason.RecognizedSpeech) {
                    return result.getText();
                } else {
                    return "Recognition failed: " + result.getReason();
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            return "Recognition error: " + e.getMessage();
        } finally {
            // Always remove the temporary files
            try {
                if (tempInput != null) Files.deleteIfExists(tempInput.toPath());
                if (tempWav != null) Files.deleteIfExists(tempWav.toPath());
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

    // Returns the extension of a file name or URL, or "tmp" when there is none
    private String getSuffix(String filenameOrUrl) {
        if (filenameOrUrl == null) return "tmp";
        // Drop any query string or fragment, then keep only the last path segment
        String name = filenameOrUrl.split("[?#]", 2)[0];
        name = name.substring(name.lastIndexOf('/') + 1);
        if (!name.contains(".")) return "tmp";
        return name.substring(name.lastIndexOf('.') + 1);
    }
}
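The conversion step in the controller uses an ffmpeg path that is fixed to one machine and waits for the process without a time limit. One possible way to make that step portable is sketched below: the executable path is passed in (falling back to "ffmpeg" on the PATH) and the wait is bounded. The class FfmpegWavConverter, its methods, and the timeout parameter are hypothetical names introduced for this sketch; the ffmpeg arguments are the same ones the demo uses.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the original demo.
final class FfmpegWavConverter {

    private final String ffmpegPath;

    // A null or blank path falls back to "ffmpeg", which must then be on the PATH.
    FfmpegWavConverter(String ffmpegPath) {
        this.ffmpegPath = (ffmpegPath == null || ffmpegPath.isBlank()) ? "ffmpeg" : ffmpegPath;
    }

    // Builds the command line: overwrite output, resample to 16 kHz, downmix to mono.
    List<String> command(Path input, Path output) {
        return List.of(ffmpegPath, "-y",
                "-i", input.toString(),
                "-ar", "16000",
                "-ac", "1",
                output.toString());
    }

    // Runs ffmpeg and returns true only if it exits with code 0 within the timeout.
    boolean convert(Path input, Path output, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(input, output)).inheritIO().start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // give up on a conversion that hangs
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

In the controller, the path could come from a property of your own choosing injected with @Value, so that the same build runs on Windows and Linux without code changes.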
