
Linux Shell: grep Searching and Regular Expression Matching

news 2025/10/31 13:50:36

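In the Linux shell, grep combined with regular expressions is a powerful and widely used way to search text.
Part 1: The grep Command in Detail
grep (Global Regular Expression Print) searches files for lines that match a given text pattern.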

1. Basic syntax
grep [options] "pattern" [file...]

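2. Common options
Option   Long form   Description
-i   --ignore-case   Ignore case
-v   --invert-match   Invert the match: print lines that do not contain the pattern
-n   --line-number   Show the line number of each matching line
-c   --count   Print only the count of matching lines, not the lines themselves
-l   --files-with-matches   Print only the names of files that contain the pattern
-L   --files-without-match   Print only the names of files that do not contain the pattern
-r   --recursive   Search subdirectories recursively
-R   --dereference-recursive   Like -r, but also follows symbolic links
-w   --word-regexp   Whole-word match: the pattern must match a complete word
-x   --line-regexp   Whole-line match: the entire line must match the pattern
-A n   --after-context=n   Print each matching line and the n lines after it
-B n   --before-context=n   Print each matching line and the n lines before it
-C n   --context=n   Print each matching line and n lines before and after it
-E   --extended-regexp   Use extended regular expressions
-F   --fixed-strings   Treat the pattern as a fixed string, not a regular expression (faster)
-G   --basic-regexp   Use basic regular expressions (the default)
-P   --perl-regexp   Use Perl-compatible regular expressions (the most capable)
-e   --regexp=PATTERN   Specify a pattern; repeat it for several patterns, or use it for a pattern that starts with -
-f file   --file=file   Read patterns from a file
--color=auto       Highlight the matched text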
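
The table lists -r, -l and -L, but the examples in this article only search files named on the command line. The following is a minimal sketch of a recursive search, assuming GNU grep; the directory layout and file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Build a throwaway directory tree (hypothetical names) to search in.
workdir=$(mktemp -d)
mkdir -p "$workdir/logs/old"
printf 'disk error\nall good\n' > "$workdir/logs/app.log"
printf 'Error: timeout\n'       > "$workdir/logs/old/app.log.1"
printf 'nothing to see\n'       > "$workdir/notes.txt"

# -r descends into subdirectories, -i ignores case,
# -l prints only the names of files that contain a match.
matched=$(cd "$workdir" && grep -ril 'error' . | sort)
echo "$matched"

# -L is the inverse: names of files with no matching line.
unmatched=$(cd "$workdir" && grep -riL 'error' . | sort)
echo "$unmatched"

rm -rf "$workdir"
```

Combining -r with -l is a quick way to find which files are worth opening before looking at individual lines.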

Usage examples:
Create a text file with the following content:
[root@hadoop ~]# touch grep_test.txt
[root@hadoop ~]# vim grep_test.txt
[root@hadoop ~]# cat grep_test.txt
liuyifei
liyifei
aiyifei
yangmi
liushishi
connie kart
canglaoshi
xiao ze ma li ya
longze luola
jin ye ya li sha
li haha
ai ni yiwan nian
ai ni yi wan nian
LiuYIFei
LiuYIFei1230
Liuyifei00#*1!
Hajimi
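
If you prefer not to open an editor, the same file can be written with a here-document. This is only a sketch: it writes into a temporary directory (an assumption made so nothing in the current directory is overwritten), and the two checks at the end simply confirm the file looks as expected.

```shell
#!/usr/bin/env bash
# Write the 17 sample lines without opening an editor.
# The quoted 'EOF' delimiter keeps characters such as # * ! literal.
workdir=$(mktemp -d)
sample="$workdir/grep_test.txt"
cat > "$sample" <<'EOF'
liuyifei
liyifei
aiyifei
yangmi
liushishi
connie kart
canglaoshi
xiao ze ma li ya
longze luola
jin ye ya li sha
li haha
ai ni yiwan nian
ai ni yi wan nian
LiuYIFei
LiuYIFei1230
Liuyifei00#*1!
Hajimi
EOF

# An empty pattern matches every line, so -c '' counts the lines in the file.
line_count=$(grep -c '' "$sample")
# Case-insensitive count of lines containing "liuyifei".
ci_matches=$(grep -c -i 'liuyifei' "$sample")
echo "lines: $line_count, case-insensitive matches: $ci_matches"
```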

[root@hadoop ~]# grep -i liuyifei grep_test.txt   # ignore case: lines containing liuyifei in any mix of upper and lower case
liuyifei
LiuYIFei
LiuYIFei1230
Liuyifei00#*1!

[root@hadoop ~]# grep -v liuyifei grep_test.txt   # lines that do not contain liuyifei; the match is case-sensitive, so the capitalized variants are still printed
liyifei
aiyifei
yangmi
liushishi
connie kart
canglaoshi
xiao ze ma li ya
longze luola
jin ye ya li sha
li haha
ai ni yiwan nian
ai ni yi wan nian
LiuYIFei
LiuYIFei1230
Liuyifei00#*1!
Hajimi
[root@hadoop ~]# grep -v -i liuyifei grep_test.txt   # lines that do not contain liuyifei in any case; the capitalized variants are filtered out as well
liyifei
aiyifei
yangmi
liushishi
connie kart
canglaoshi
xiao ze ma li ya
longze luola
jin ye ya li sha
li haha
ai ni yiwan nian
ai ni yi wan nian
Hajimi

[root@hadoop ~]# grep -v -i -n liuyifei grep_test.txt   # -n prefixes each line with its line number
2:liyifei
3:aiyifei
4:yangmi
5:liushishi
6:connie kart
7:canglaoshi
8:xiao ze ma li ya
9:longze luola
10:jin ye ya li sha
11:li haha
12:ai ni yiwan nian
13:ai ni yi wan nian
17:Hajimi

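[root@hadoop ~]# grep -c liuyifei grep_test.txt   # print only the number of matching lines, not the lines themselves
1
[root@hadoop ~]# grep -c -i liuyifei grep_test.txt   # with -i the count includes the capitalized variants
4
[root@hadoop ~]# grep -l liuyifei grep_test.txt 123.txt   # print only the names of files that contain a match
grep_test.txt
grep: 123.txt: No such file or directory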

[root@hadoop ~]# grep -L liuyifei grep_test.txt user1.txt user2.txt   # print the names of files that contain no match
user1.txt
user2.txt

[root@hadoop ~]# grep -w liuyifei grep_test.txt user1.txt user2.txt   # whole-word match; with several files, each hit is prefixed with its file name
grep_test.txt:liuyifei

[root@hadoop ~]# grep -x liuyifei grep_test.txt user1.txt user2.txt   # whole-line match: the entire line must equal the pattern
grep_test.txt:liuyifei
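
In the two runs above, -w and -x happen to print the same line, which hides the difference between them. The sketch below uses a small made-up input to show that -w only requires the pattern to stand alone as a word somewhere in the line, while -x requires the whole line to be the pattern.

```shell
#!/usr/bin/env bash
sample=$(mktemp)
printf '%s\n' 'li haha' 'xiao ze ma li ya' 'liyifei' 'li' > "$sample"

# -w: "li" must be delimited by non-word characters or the line edges,
# so "liyifei" is rejected but "li haha" is accepted.
word_hits=$(grep -w 'li' "$sample")
echo "$word_hits"

# -x: the pattern must match the entire line, so only the bare "li" line qualifies.
line_hits=$(grep -x 'li' "$sample")
echo "$line_hits"

rm -f "$sample"
```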

[root@hadoop ~]# grep liyifei -A 10 grep_test.txt
liyifei
aiyifei
yangmi
liushishi
connie kart
canglaoshi
xiao ze ma li ya
longze luola
jin ye ya li sha
li haha
ai ni yiwan nian
[root@hadoop ~]# grep liyifei -n -A 10 grep_test.txt   # the matching line plus the 10 lines after it
2:liyifei
3-aiyifei
4-yangmi
5-liushishi
6-connie kart
7-canglaoshi
8-xiao ze ma li ya
9-longze luola
10-jin ye ya li sha
11-li haha
12-ai ni yiwan nian

[root@hadoop ~]# grep haha -n -B 10 grep_test.txt   # the matching line plus the 10 lines before it
1-liuyifei
2-liyifei
3-aiyifei
4-yangmi
5-liushishi
6-connie kart
7-canglaoshi
8-xiao ze ma li ya
9-longze luola
10-jin ye ya li sha
11:li haha

[root@hadoop ~]# grep haha -n -C 10 grep_test.txt   # the matching line plus 10 lines before and after it
1-liuyifei
2-liyifei
3-aiyifei
4-yangmi
5-liushishi
6-connie kart
7-canglaoshi
8-xiao ze ma li ya
9-longze luola
10-jin ye ya li sha
11:li haha
12-ai ni yiwan nian
13-ai ni yi wan nian
14-LiuYIFei
15-LiuYIFei1230
16-Liuyifei00#*1!
17-Hajimi

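The following options work together with regular expressions, so regular expressions are introduced first:
-E   --extended-regexp   Use extended regular expressions
-F   --fixed-strings   Treat the pattern as a fixed string, not a regular expression (faster)
-G   --basic-regexp   Use basic regular expressions (the default)
-P   --perl-regexp   Use Perl-compatible regular expressions (the most capable)
-e   --regexp=PATTERN   Specify a pattern; repeat it for several patterns, or use it for a pattern that starts with -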
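
Of these, -e, -F and -f can be tried before learning any regular expression syntax. The following is a rough sketch on made-up data (the file names data.txt and patterns.txt are assumptions for illustration).

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf '%s\n' 'liyifei' 'yangmi' 'Liuyifei00#*1!' 'a.c' 'abc' '-v flag' > "$workdir/data.txt"

# -e may be repeated; a line is printed if any of the patterns matches.
multi=$(grep -e 'liyifei' -e 'yangmi' "$workdir/data.txt")

# -e also keeps a pattern that starts with a dash from being read as an option.
dash=$(grep -e '-v' "$workdir/data.txt")

# -F treats the pattern as a literal string, so "." only matches a real dot.
literal=$(grep -F 'a.c' "$workdir/data.txt")
# Without -F, "." matches any character, so "abc" is printed too.
regex=$(grep 'a.c' "$workdir/data.txt")

# -f reads one pattern per line from a file.
printf '%s\n' 'yangmi' '^abc$' > "$workdir/patterns.txt"
from_file=$(grep -f "$workdir/patterns.txt" "$workdir/data.txt")

printf '%s\n' "$multi" "$dash" "$literal" "$regex" "$from_file"
rm -rf "$workdir"
```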

=========================================================
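Part 2: Regular Expressions in Detail
A regular expression is a compact language for describing text patterns. grep uses basic regular expressions (BRE) by default;
the -E option enables extended regular expressions (ERE), which have a richer syntax,
and the -P option enables Perl-compatible regular expressions (PCRE), the most capable of the three.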

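1. Basic regular expressions
Metacharacter   Description   Example
.   Matches any single character (except newline)   gr.y matches "gray", "grey", "gr8y"
*   Matches the preceding character zero or more times   go*gle matches "ggle", "gogle", "google"
^   Anchors the match at the start of the line   ^Hello matches lines that start with "Hello"
$   Anchors the match at the end of the line   world$ matches lines that end with "world"
[]   Matches any one character listed in the brackets   [Gg]oogle matches "Google" or "google"
[^]   Matches any one character not listed in the brackets   [^0-9] matches any non-digit character
\   Escape character; removes the special meaning of the next character   google\.com matches the literal "google.com"
\<   Anchors the match at the start of a word   \<the matches words that start with "the"
\>   Anchors the match at the end of a word   the\> matches words that end with "the"
\{n\}   Matches the preceding character exactly n times   o\{2\} matches the two 'o's in "google"
\{n,\}   Matches the preceding character at least n times   o\{2,\} matches all the 'o's in "gooogle"
\{n,m\}   Matches the preceding character n to m times   o\{2,4\} matches the 'o's in "gooogle", "goooogle"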

Match any single character
[root@hadoop ~]# grep .fei grep_test.txt
liuyifei
liyifei
aiyifei
Liuyifei00#*1!
[root@hadoop ~]# grep y.fei grep_test.txt
liuyifei
liyifei
aiyifei
Liuyifei00#*1!
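Note: before the next commands, line 9 of grep_test.txt ("longze luola") was replaced with "liuyiiiiiiiiiifei".
[root@hadoop ~]# grep i.fei grep_test.txt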
liuyiiiiiiiiiifei

[root@hadoop ~]# grep 'i*fei' grep_test.txt   # zero or more "i" followed by "fei"; because zero is allowed, every line containing "fei" matches
liuyifei
liyifei
aiyifei
liuyiiiiiiiiiifei
Liuyifei00#*1!
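
A common misreading is that * stands for "any characters", as it does in shell file name patterns. In a regular expression it only repeats the character before it. The sketch below, on made-up input, uses -o (--only-matching) to print just the matched text so the effect is visible.

```shell
#!/usr/bin/env bash
sample=$(mktemp)
printf '%s\n' 'liyifei' 'liuyiiifei' 'fei' 'LiuYIFei' > "$sample"

# "*" applies to the preceding "i" only, so the pattern means
# "zero or more i, then fei". Any line containing "fei" matches.
lines=$(grep 'i*fei' "$sample")
echo "$lines"

# -o prints only the matched part, which shows how many i's were consumed.
parts=$(grep -o 'i*fei' "$sample")
echo "$parts"

rm -f "$sample"
```

Quoting the pattern also matters: an unquoted i*fei is first expanded by the shell if a file with a matching name exists in the current directory.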

[root@hadoop ~]# grep 'i*!$' grep_test.txt   # lines ending with an exclamation mark
Liuyifei00#*1!

[root@hadoop ~]# grep '[#!i]' grep_test.txt   # lines containing any one of the characters inside []
liuyifei
liyifei
aiyifei
yangmi
liushishi
connie kart
canglaoshi
xiao ze ma li ya
liuyiiiiiiiiiifei
jin ye ya li sha
li haha
ai ni yiwan nian
ai ni yi wan nian
LiuYIFei
LiuYIFei1230
Liuyifei00#*1!

[root@hadoop ~]# grep '^[y]' grep_test.txt   # lines starting with y
yangmi

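Lines that contain none of the characters b, a, s, h
grep -v "[bash]" grep_test.txt
Exclude lines that contain "bash"
grep -v "bash" grep_test.txt

Lines that do not start with "bash"
grep -v "^bash" grep_test.txt
Lines that are not exactly "bash":
grep -v "^bash$" grep_test.txt

grep "[^bash]" grep_test.txt prints every line of this file:
a line made up only of the characters b, a, s, h (such as "bash") would not match,
but every line here contains at least one character other than b, a, s, h.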
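
The difference between a negated bracket expression and -v is easy to mix up, so here is a small sketch on made-up input that runs three of the variants side by side.

```shell
#!/usr/bin/env bash
sample=$(mktemp)
printf '%s\n' 'bash' 'bashrc' 'zoo' 'zsh' > "$sample"

# [^bash] needs at least one character outside the set {b,a,s,h}.
# "bash" consists only of those four characters, so it is the one line dropped.
has_other=$(grep '[^bash]' "$sample")

# -v '[bash]' keeps only lines with none of b, a, s, h anywhere.
has_none=$(grep -v '[bash]' "$sample")

# -v 'bash' only drops lines containing the literal substring "bash".
no_substr=$(grep -v 'bash' "$sample")

printf '%s\n---\n%s\n---\n%s\n' "$has_other" "$has_none" "$no_substr"
rm -f "$sample"
```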

[root@hadoop ~]# grep -i "\<li" grep_test.txt   # ignoring case, lines containing a word that starts with "li"
liuyifei
liyifei
liushishi
xiao ze ma li ya
liuyiiiiiiiiiifei
jin ye ya li sha
li haha
LiuYIFei
LiuYIFei1230
Liuyifei00#*1!
[root@hadoop ~]# grep "\<li" grep_test.txt   # words that start with "li"
liuyifei
liyifei
liushishi
xiao ze ma li ya
liuyiiiiiiiiiifei
jin ye ya li sha
li haha

[root@hadoop ~]# grep "ha\>" grep_test.txt   # words that end with "ha"
jin ye ya li sha
li haha

Match the complete word "ha"
grep "\<ha\>" grep_test.txt
For comparison, match "ha" anywhere in a line, including inside longer words
grep "ha" grep_test.txt

[root@hadoop ~]# grep 'i\{2\}' grep_test.txt
liuyiiiiiiiiiifei
[root@hadoop ~]# grep "i\{2\}" grep_test.txt
liuyiiiiiiiiiifei
[root@hadoop ~]# grep "i\{2,\}" grep_test.txt
liuyiiiiiiiiiifei
[root@hadoop ~]# grep "i\{2,5\}" grep_test.txt
liuyiiiiiiiiiifei
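
All four interval patterns above print the same line, so the output does not show how they differ. As a sketch, -o (--only-matching) on a made-up string with exactly seven consecutive i's makes each match visible.

```shell
#!/usr/bin/env bash
# Made-up input: "x", then 3 + 4 = 7 consecutive i's, then "x".
input="x""iii""iiii""x"

# Exactly two: the run is consumed two at a time, leaving one i unmatched.
exact=$(printf '%s\n' "$input" | grep -o 'i\{2\}')

# At least two: the whole run is a single match.
at_least=$(printf '%s\n' "$input" | grep -o 'i\{2,\}')

# Two to five: the first match takes as many as allowed (5), the rest (2) form a second match.
range=$(printf '%s\n' "$input" | grep -o 'i\{2,5\}')

printf '%s\n---\n%s\n---\n%s\n' "$exact" "$at_least" "$range"
```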

=======================================================
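Extended regular expressions
Use grep -E (the older egrep command is deprecated in current GNU grep). ERE adds more metacharacters, and {}, (), +, ?, | do not need to be escaped.
Metacharacter   Description   Example
+   Matches the preceding character one or more times   go+gle matches "gogle", "google", but not "ggle"
?   Matches the preceding character zero or one time   colou?r matches "color" and "colour"
|   Alternation: matches one of several patterns   apple|banana matches "apple" or "banana"
()   Grouping: treats a sub-pattern as a single unit   (abc)+ matches "abc", "abcabc", and so on
{}   Interval, same as in BRE but without the backslashes   o{2,4} matches 2 to 4 'o's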

# Search for "error" or "warning"
grep -E "error|warning" logfile.txt

# Search for one or more consecutive "s"
grep -E "s+" file.txt

# Search for "colour" or "color"
grep -E "colou?r" file.txt

# Match "abc" repeated one or more times
grep -E "(abc)+" file.txt
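
The commands above refer to logfile.txt and file.txt, which are not shown. The sketch below runs the same kinds of patterns on a small made-up word list, and ends with the BRE and ERE spellings of one interval for comparison.

```shell
#!/usr/bin/env bash
sample=$(mktemp)
printf '%s\n' 'ggle' 'gogle' 'google' 'color' 'colour' 'abcabc' 'error: x' 'warning: y' > "$sample"

plus=$(grep -E 'go+gle' "$sample")               # one or more o
optional=$(grep -E '^colou?r$' "$sample")        # the u is optional
either=$(grep -E '^(error|warning):' "$sample")  # alternation inside a group
group=$(grep -oE '(abc)+' "$sample")             # the group repeats as a unit

# The same interval in both dialects: BRE needs backslashes before the braces.
bre=$(grep 'go\{2\}gle' "$sample")
ere=$(grep -E 'go{2}gle' "$sample")

printf '%s\n' "$plus" "$optional" "$either" "$group" "$bre" "$ere"
rm -f "$sample"
```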

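Perl-compatible regular expressions
Use grep -P. It provides powerful and complex pattern-matching features such as non-greedy matching and lookaround.
Feature   Syntax   Description
Non-greedy match   .*?   Matches as few characters as possible
Positive lookahead   (?=...)   Matches a position that is followed by ...
Negative lookahead   (?!...)   Matches a position that is not followed by ...
Positive lookbehind   (?<=...)   Matches a position that is preceded by ...
Negative lookbehind   (?<!...)   Matches a position that is not preceded by ...
\d   \d   Matches a digit, equivalent to [0-9]
\D   \D   Matches a non-digit, equivalent to [^0-9]
\s   \s   Matches a whitespace character (space, tab, and so on)
\S   \S   Matches a non-whitespace character
\w   \w   Matches a word character (letter, digit, underscore)
\W   \W   Matches a non-word character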

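# Match the content inside double quotes (non-greedy)
echo '"hello" "world"' | grep -Po '".*?"'
# Output: "hello" and "world", one per line

# Match "123" that is not followed by "abc"
# (single quotes keep an interactive bash from treating ! as history expansion)
echo "123abc 123def" | grep -Po '123(?!abc)'
# Output: 123 (the one in "123def")

# Match the digits that come right after a "$"
# (single quotes stop the shell from expanding $1 and from stripping the backslash)
echo 'price: $100' | grep -Po '(?<=\$)\d+'
# Output: 100

# Use \d to match lines that contain digits
grep -P "\d+" file.txt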
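
The table also lists positive lookahead and negative lookbehind, which the commands above do not exercise. The sketch below covers them on made-up input; it assumes a GNU grep built with PCRE support, since -P is not available in every grep.

```shell
#!/usr/bin/env bash
# Greedy vs non-greedy: .* runs to the last ">", .*? stops at the first one.
greedy=$(echo '<a><b>' | grep -Po '<.*>')
lazy=$(echo '<a><b>' | grep -Po '<.*?>')

# Positive lookahead: a word only if ".log" follows; the suffix is not part of the match.
ahead=$(echo 'a.txt b.log' | grep -Po '\w+(?=\.log)')

# Negative lookbehind: digits not preceded by "$" or by another digit.
not_price=$(echo 'price: $100 qty: 7' | grep -Po '(?<![$\d])\d+')

printf '%s\n' "$greedy" "$lazy" "$ahead" "$not_price"
```

Lookarounds check the surrounding text without consuming it, which is why they pair well with -o when only part of a line should be extracted.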

Putting it together
Suppose we have a file test.txt with the following content:

hello world
error: file not found
Error: permission denied
warning: low disk space
user alice logged in
User bob logged out
the server is running
there is a cat
this is a test.
phone: 123-456-7890
email: user@example.com

Ignoring case, find all lines that start with "error" or "warning", and show line numbers
grep -in -E "^(error|warning)" test.txt
Find all lines that do not contain "user" (ignoring case)
grep -iv "user" test.txt
Whole-word match for "the"
grep -w "the" test.txt
Extract email addresses with PCRE
grep -Po "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b" test.txt
Find lines that contain a US-style phone number
grep -E "[0-9]{3}-[0-9]{3}-[0-9]{4}" test.txt
# or with PCRE
grep -P "\d{3}-\d{3}-\d{4}" test.txt
# Output: phone: 123-456-7890
Show each line matching "error" and the 2 lines after it
grep -A 2 "error" test.txt
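
To try these commands end to end, the sketch below writes test.txt into a temporary directory (an assumption, so no existing file is overwritten) and captures the result of each search. The -P line again assumes a grep with PCRE support.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
f="$workdir/test.txt"
cat > "$f" <<'EOF'
hello world
error: file not found
Error: permission denied
warning: low disk space
user alice logged in
User bob logged out
the server is running
there is a cat
this is a test.
phone: 123-456-7890
email: user@example.com
EOF

errs=$(grep -in -E '^(error|warning)' "$f")    # lines 2, 3 and 4
no_user=$(grep -ivc 'user' "$f")               # count of lines without "user"
the_word=$(grep -w 'the' "$f")                 # "there" and "this" do not count
email=$(grep -Po '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b' "$f")
phone=$(grep -E '[0-9]{3}-[0-9]{3}-[0-9]{4}' "$f")
after=$(grep -A 2 'error' "$f")                # case-sensitive: only line 2 matches

printf '%s\n' "$errs" "$no_user" "$the_word" "$email" "$phone" "$after"
rm -rf "$workdir"
```

Note that the -iv search also drops the email line, because "user@example.com" contains "user".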
