ML Kit - ML Kit Text Recognition (ML Kit Overview, ML Kit Text Recognition, Text Extraction, Additional Cases)
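I. ML Kit Overview
- ML Kit is a machine learning toolkit that Google built for mobile app developers
- ML Kit lets developers integrate machine learning features into Android and iOS apps with relatively little effort
- ML Kit includes text recognition, face detection, barcode scanning, image labeling, and other features
- ML Kit official site: https://developers.google.cn/ml-kit?hl=zh-cn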
II. ML Kit Text Recognition
1. Demo
(1) Dependencies
- Module-level build.gradle
// Text recognition for Latin-script languages, e.g. English
implementation 'com.google.mlkit:text-recognition:16.0.0'
// Additional model for Chinese
implementation 'com.google.mlkit:text-recognition-chinese:16.0.0'
(2) Util
- OcrCallback.java
public interface OcrCallback {
    void onSuccess(String result);

    void onError(Exception e);
}
- OcrManager.java
public class OcrManager {

    public static void recognizeChineseText(Bitmap bitmap, OcrCallback ocrCallback) {
        InputImage inputImage = InputImage.fromBitmap(bitmap, 0);
        TextRecognizer textRecognizer = TextRecognition.getClient(
                new ChineseTextRecognizerOptions.Builder().build());
        textRecognizer.process(inputImage)
                .addOnSuccessListener(new OnSuccessListener<Text>() {
                    @Override
                    public void onSuccess(Text text) {
                        ocrCallback.onSuccess(text.getText());
                    }
                })
                .addOnFailureListener(new OnFailureListener() {
                    @Override
                    public void onFailure(@NonNull Exception e) {
                        ocrCallback.onError(e);
                    }
                });
    }

    public static void recognizeLatinText(Bitmap bitmap, OcrCallback ocrCallback) {
        InputImage inputImage = InputImage.fromBitmap(bitmap, 0);
        TextRecognizer textRecognizer = TextRecognition.getClient(
                TextRecognizerOptions.DEFAULT_OPTIONS);
        textRecognizer.process(inputImage)
                .addOnSuccessListener(new OnSuccessListener<Text>() {
                    @Override
                    public void onSuccess(Text text) {
                        ocrCallback.onSuccess(text.getText());
                    }
                })
                .addOnFailureListener(new OnFailureListener() {
                    @Override
                    public void onFailure(@NonNull Exception e) {
                        ocrCallback.onError(e);
                    }
                });
    }
}
(3) Test
Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.test_img);

OcrManager.recognizeChineseText(bitmap, new OcrCallback() {
    @Override
    public void onSuccess(String result) {
        Log.i(TAG, "Recognition succeeded, result: " + result);
    }

    @Override
    public void onError(Exception e) {
        Log.i(TAG, "Recognition failed, reason: " + e.getMessage());
    }
});

# Output
Recognition succeeded, result: 超车道
OVERTAKING LANE
行车道
CARRIAGEWAY

# Output
Recognition succeeded, result: 张三
2. Walkthrough
- Create an InputImage from the Bitmap (the second argument is the rotation in degrees)
InputImage inputImage = InputImage.fromBitmap(bitmap, 0);
- Get a TextRecognizer instance configured for Chinese
TextRecognizer textRecognizer = TextRecognition.getClient(
        new ChineseTextRecognizerOptions.Builder().build()
);
- Run recognition and register the callbacks with addOnSuccessListener and addOnFailureListener
textRecognizer.process(inputImage)
        .addOnSuccessListener(new OnSuccessListener<Text>() {
            @Override
            public void onSuccess(Text text) {
                ...
            }
        })
        .addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(@NonNull Exception e) {
                ...
            }
        });
- Extract the full text
ocrCallback.onSuccess(text.getText());
- Load a Bitmap from a resource file
Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.test_img);
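III. Text Extraction
1. Basics
- If recognition succeeds, a Text object is passed to the OnSuccessListener. The Text object contains the full text recognized in the image and zero or more TextBlock objects
- Each TextBlock represents a rectangular block of text and contains zero or more Line objects
- Each Line object represents a line of text and contains zero or more Element objects
- Each Element object represents a word or a word-like entity and contains zero or more Symbol objects
- Each Symbol object represents a character, a digit, or a word-like entity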
2. Demo
(1) Util
- OcrManager.java
// Logs every level of the result: TextBlock > Line > Element > Symbol
private static void processText(Text text) {
    for (Text.TextBlock textBlock : text.getTextBlocks()) {
        Log.i(TAG, "-------------------- textBlock: " + textBlock.getText());
        for (Text.Line line : textBlock.getLines()) {
            Log.i(TAG, "--------------- line: " + line.getText());
            for (Text.Element element : line.getElements()) {
                Log.i(TAG, "---------- element: " + element.getText());
                for (Text.Symbol symbol : element.getSymbols()) {
                    Log.i(TAG, "----- symbol: " + symbol.getText());
                }
            }
        }
    }
}

public static void recognizeChineseText(Bitmap bitmap, OcrCallback ocrCallback) {
    InputImage inputImage = InputImage.fromBitmap(bitmap, 0);
    TextRecognizer textRecognizer = TextRecognition.getClient(
            new ChineseTextRecognizerOptions.Builder().build());
    textRecognizer.process(inputImage)
            .addOnSuccessListener(new OnSuccessListener<Text>() {
                @Override
                public void onSuccess(Text text) {
                    processText(text);
                    ocrCallback.onSuccess(text.getText());
                }
            })
            .addOnFailureListener(new OnFailureListener() {
                @Override
                public void onFailure(@NonNull Exception e) {
                    ocrCallback.onError(e);
                }
            });
}
(2) Test
Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.test_img);

OcrManager.recognizeChineseText(bitmap, new OcrCallback() {
    @Override
    public void onSuccess(String result) {
        Log.i(TAG, "Recognition succeeded, result: " + result);
    }

    @Override
    public void onError(Exception e) {
        Log.i(TAG, "Recognition failed, reason: " + e.getMessage());
    }
});

# Output
-------------------- textBlock: 超车道
--------------- line: 超车道
---------- element: 超车道
----- symbol: 超
----- symbol: 车
----- symbol: 道
-------------------- textBlock: OVERTAKING LANE
--------------- line: OVERTAKING LANE
---------- element: OVERTAKING
----- symbol: O
----- symbol: V
----- symbol: E
----- symbol: R
----- symbol: T
----- symbol: A
----- symbol: K
----- symbol: I
----- symbol: N
----- symbol: G
---------- element: LANE
----- symbol: L
----- symbol: A
----- symbol: N
----- symbol: E
-------------------- textBlock: 行车道
--------------- line: 行车道
---------- element: 行车道
----- symbol: 行
----- symbol: 车
----- symbol: 道
-------------------- textBlock: CARRIAGEWAY
--------------- line: CARRIAGEWAY
---------- element: CARRIAGEWAY
----- symbol: C
----- symbol: A
----- symbol: R
----- symbol: R
----- symbol: I
----- symbol: A
----- symbol: G
----- symbol: E
----- symbol: W
----- symbol: A
----- symbol: Y
Recognition succeeded, result: 超车道
OVERTAKING LANE
行车道
CARRIAGEWAY

# Output
-------------------- textBlock: 张三
--------------- line: 张三
---------- element: 张三
----- symbol: 张
----- symbol: 三
Recognition succeeded, result: 张三
3. Extracting Continuous Text
(1) Util
- OcrManager.java
public static void recognizeChineseText(Bitmap bitmap, OcrCallback ocrCallback) {
    InputImage inputImage = InputImage.fromBitmap(bitmap, 0);
    TextRecognizer textRecognizer = TextRecognition.getClient(
            new ChineseTextRecognizerOptions.Builder().build());
    textRecognizer.process(inputImage)
            .addOnSuccessListener(new OnSuccessListener<Text>() {
                @Override
                public void onSuccess(Text text) {
                    String result = text.getText();
                    // Collapse every run of whitespace (line breaks included) into a
                    // single space, then strip leading and trailing spaces. \s already
                    // matches \n, so no separate pass to remove line breaks is needed.
                    String newResult = result.replaceAll("\\s+", " ").trim();
                    ocrCallback.onSuccess(newResult);
                }
            })
            .addOnFailureListener(new OnFailureListener() {
                @Override
                public void onFailure(@NonNull Exception e) {
                    ocrCallback.onError(e);
                }
            });
}
(2) Test
Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.test_img);

OcrManager.recognizeChineseText(bitmap, new OcrCallback() {
    @Override
    public void onSuccess(String result) {
        Log.i(TAG, "Recognition succeeded, result: " + result);
    }

    @Override
    public void onError(Exception e) {
        Log.i(TAG, "Recognition failed, reason: " + e.getMessage());
    }
});

# Output
Recognition succeeded, result: 超车道 OVERTAKING LANE 行车道 CARRIAGEWAY
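
The whitespace cleanup in this step does not depend on ML Kit, so it can be checked off-device. The following is a minimal sketch in plain Java. The class name OcrTextNormalizer and the method toSingleLine are hypothetical names introduced for illustration; they are not part of ML Kit or of the OcrManager shown in this article.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that flattens multi-line OCR text into one line. */
final class OcrTextNormalizer {
    // Compiled once; String.replaceAll would recompile the regex on every call
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private OcrTextNormalizer() {
    }

    /** Replaces each run of whitespace (including line breaks) with one space and trims the ends. */
    static String toSingleLine(String raw) {
        if (raw == null) {
            return "";
        }
        return WHITESPACE.matcher(raw).replaceAll(" ").trim();
    }
}
```

Calling OcrTextNormalizer.toSingleLine(text.getText()) inside onSuccess would be one way to get the same single-line result as the inline replaceAll call.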
IV. Additional Cases
1. When the Bitmap Cannot Be Loaded
- Here the Bitmap is loaded from a resource ID that does not exist
Bitmap bitmap = BitmapFactory.decodeResource(getResources(), 1001);
Log.i(TAG, "bitmap: " + bitmap);

OcrManager.recognizeChineseText(bitmap, new OcrCallback() {
    @Override
    public void onSuccess(String result) {
        Log.i(TAG, "Recognition succeeded, result: " + result);
    }

    @Override
    public void onError(Exception e) {
        Log.i(TAG, "Recognition failed, reason: " + e.getMessage());
    }
});
# Output
bitmap: null
...
FATAL EXCEPTION: main
Process: com.my.ocr_ml_kit, PID: 29277
java.lang.RuntimeException: Unable to start activity ComponentInfo{com.my.ocr_ml_kit/com.my.ocr_ml_kit.MainActivity}: java.lang.NullPointerException: null reference
- OcrManager.java: harden the recognizeChineseText method
public static void recognizeChineseText(Bitmap bitmap, OcrCallback ocrCallback) {
    // Report a missing Bitmap through the callback instead of crashing
    if (bitmap == null) {
        if (ocrCallback != null) {
            ocrCallback.onError(new IllegalArgumentException("Bitmap is null"));
        }
        return;
    }

    InputImage inputImage = InputImage.fromBitmap(bitmap, 0);
    TextRecognizer textRecognizer = TextRecognition.getClient(
            new ChineseTextRecognizerOptions.Builder().build());
    textRecognizer.process(inputImage)
            .addOnSuccessListener(new OnSuccessListener<Text>() {
                @Override
                public void onSuccess(Text text) {
                    if (ocrCallback != null) ocrCallback.onSuccess(text.getText());
                }
            })
            .addOnFailureListener(new OnFailureListener() {
                @Override
                public void onFailure(@NonNull Exception e) {
                    if (ocrCallback != null) ocrCallback.onError(e);
                }
            });
}
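
The guard logic can be exercised without an Android device. The sketch below is a stand-in built on stated assumptions: it repeats the OcrCallback interface from this article, uses a plain Object in place of android.graphics.Bitmap, and introduces the hypothetical names OcrGuards, checkInput, and RecordingCallback. It only illustrates the idea of reporting a missing input through the callback rather than letting a NullPointerException escape.

```java
/** Same shape as the OcrCallback interface defined earlier in this article. */
interface OcrCallback {
    void onSuccess(String result);

    void onError(Exception e);
}

/** Hypothetical helper: validates the input before any recognizer is created. */
final class OcrGuards {
    private OcrGuards() {
    }

    /**
     * Returns true when recognition may proceed. When the image is null, the
     * problem is reported through the callback (if one was supplied) and false
     * is returned. Object stands in for android.graphics.Bitmap so that the
     * sketch runs off-device.
     */
    static boolean checkInput(Object bitmap, OcrCallback callback) {
        if (bitmap != null) {
            return true;
        }
        if (callback != null) {
            callback.onError(new IllegalArgumentException("Bitmap is null"));
        }
        return false;
    }
}

/** Hypothetical callback that simply remembers what it was told. */
final class RecordingCallback implements OcrCallback {
    String result;
    Exception error;

    @Override
    public void onSuccess(String result) {
        this.result = result;
    }

    @Override
    public void onError(Exception e) {
        this.error = e;
    }
}
```

With a helper of this kind, the first statement of recognizeChineseText could be `if (!OcrGuards.checkInput(bitmap, ocrCallback)) return;`.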
2. Recognizing Cursive Handwriting
- ML Kit text recognition has limited ability to recognize cursive handwriting; ML Kit digital ink recognition should be used instead
Bitmap bitmap4_1 = BitmapFactory.decodeResource(getResources(), R.drawable.test4_1);
Bitmap bitmap4_2 = BitmapFactory.decodeResource(getResources(), R.drawable.test4_2);

OcrManager.recognizeChineseText(bitmap4_1, new OcrCallback() {
    @Override
    public void onSuccess(String result) {
        Log.i(TAG, "test4_1 recognition succeeded, result: " + result);
    }

    @Override
    public void onError(Exception e) {
        Log.i(TAG, "test4_1 recognition failed, reason: " + e.getMessage());
    }
});

OcrManager.recognizeChineseText(bitmap4_2, new OcrCallback() {
    @Override
    public void onSuccess(String result) {
        Log.i(TAG, "test4_2 recognition succeeded, result: " + result);
    }

    @Override
    public void onError(Exception e) {
        Log.i(TAG, "test4_2 recognition failed, reason: " + e.getMessage());
    }
});
- test4_1.png

- test4_2.png

# Output
test4_1 recognition succeeded, result:
test4_2 recognition succeeded, result:
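
As the output shows, onSuccess can be called with an empty string, so a successful call does not mean that any text was found. A caller may want to treat that case separately. A minimal sketch in plain Java, where OcrResults and hasText are hypothetical names introduced here and not part of ML Kit:

```java
/** Hypothetical helper for telling an empty recognition result from a real one. */
final class OcrResults {
    private OcrResults() {
    }

    /** Returns true when the result is non-null and still non-empty after trim(). */
    static boolean hasText(String result) {
        return result != null && !result.trim().isEmpty();
    }
}
```

Inside onSuccess, a check such as `if (!OcrResults.hasText(result))` could log "no text recognized" instead of printing an empty result.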