This post was last edited by O'Reilly on 2020-6-3 22:52
(1) Introduction to the grep command
- C:\Users\86137>grep --help
- Usage: grep [OPTION]... PATTERN [FILE]...
- Search for PATTERN in each FILE or standard input.
- PATTERN is, by default, a basic regular expression (BRE).
- Example: grep -i 'hello world' menu.h main.c
- Regexp selection and interpretation:
- -E, --extended-regexp PATTERN is an extended regular expression (ERE)
- -F, --fixed-strings PATTERN is a set of newline-separated strings
- -G, --basic-regexp PATTERN is a basic regular expression (BRE)
- -P, --perl-regexp PATTERN is a Perl regular expression
- -e, --regexp=PATTERN use PATTERN for matching
- -f, --file=FILE obtain PATTERN from FILE
- -i, --ignore-case ignore case distinctions
- -w, --word-regexp force PATTERN to match only whole words
- -x, --line-regexp force PATTERN to match only whole lines
- -z, --null-data a data line ends in 0 byte, not newline
- Miscellaneous:
- -s, --no-messages suppress error messages
- -v, --invert-match select non-matching lines
- -V, --version display version information and exit
- --help display this help text and exit
- Output control:
- -m, --max-count=NUM stop after NUM matches
- -b, --byte-offset print the byte offset with output lines
- -n, --line-number print line number with output lines
- --line-buffered flush output on every line
- -H, --with-filename print the file name for each match
- -h, --no-filename suppress the file name prefix on output
- --label=LABEL use LABEL as the standard input file name prefix
- -o, --only-matching show only the part of a line matching PATTERN
- -q, --quiet, --silent suppress all normal output
- --binary-files=TYPE assume that binary files are TYPE;
- TYPE is 'binary', 'text', or 'without-match'
- -a, --text equivalent to --binary-files=text
- -I equivalent to --binary-files=without-match
- -d, --directories=ACTION how to handle directories;
- ACTION is 'read', 'recurse', or 'skip'
- -D, --devices=ACTION how to handle devices, FIFOs and sockets;
- ACTION is 'read' or 'skip'
- -r, --recursive like --directories=recurse
- -R, --dereference-recursive likewise, but follow all symlinks
- --include=FILE_PATTERN search only files that match FILE_PATTERN
- --exclude=FILE_PATTERN skip files and directories matching FILE_PATTERN
- --exclude-from=FILE skip files matching any file pattern from FILE
- --exclude-dir=PATTERN directories that match PATTERN will be skipped.
- -L, --files-without-match print only names of FILEs containing no match
- -l, --files-with-matches print only names of FILEs containing matches
- -c, --count print only a count of matching lines per FILE
- -T, --initial-tab make tabs line up (if needed)
- -Z, --null print 0 byte after FILE name
- Context control:
- -B, --before-context=NUM print NUM lines of leading context
- -A, --after-context=NUM print NUM lines of trailing context
- -C, --context=NUM print NUM lines of output context
- -NUM same as --context=NUM
- --color[=WHEN],
- --colour[=WHEN] use markers to highlight the matching strings;
- WHEN is 'always', 'never', or 'auto'
- -U, --binary do not strip CR characters at EOL (MSDOS/Windows)
- -u, --unix-byte-offsets report offsets as if CRs were not there
- (MSDOS/Windows)
- 'egrep' means 'grep -E'. 'fgrep' means 'grep -F'.
- Direct invocation as either 'egrep' or 'fgrep' is deprecated.
- When FILE is -, read standard input. With no FILE, read . if a command-line
- -r is given, - otherwise. If fewer than two FILEs are given, assume -h.
- Exit status is 0 if any line is selected, 1 otherwise;
- if any error occurs and -q is not given, the exit status is 2.
- Report bugs to: bug-grep@gnu.org
- GNU grep home page: <http://www.gnu.org/software/grep/>
- General help using GNU software: <http://www.gnu.org/gethelp/>
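The exit status rules at the end of the help text matter when grep is used in scripts. Below is a minimal sketch, assuming GNU grep in a POSIX shell rather than the cmd prompt used in this post; the sample text and the file name no-such-file-for-grep-demo.txt are made up for illustration.

```shell
# Hypothetical sample input, not the author's c.txt.
sample='java
Python
julia'

# Status 0: at least one line was selected.
found=0
printf '%s\n' "$sample" | grep -q 'Python' || found=$?

# Status 1: no line was selected.
missing=0
printf '%s\n' "$sample" | grep -q 'ruby' || missing=$?

# Status 2: an error occurred (the file does not exist); -s hides the message.
failed=0
grep -s 'Python' ./no-such-file-for-grep-demo.txt || failed=$?

echo "found=$found missing=$missing failed=$failed"
```

With these inputs the last line should print found=0 missing=1 failed=2.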
(2) Examples of the grep command
Example 1
- C:\Users\86137\Desktop>type c.txt
- bat
- oracle
- shell
- plsql
- java
- python
- julia
- C:\Users\86137\Desktop>grep python *.txt
- c.txt:python
- C:\Users\86137\Desktop>grep -v python c.txt
- bat
- oracle
- shell
- plsql
- java
- julia
Example 2
- C:\Users\86137\Desktop\c>grep -r python *
- 新建文件夹/c.txt:python
Example 3
- C:\Users\86137\Desktop>grep -i PYTHON c.txt
- Python
- C:\Users\86137\Desktop>grep -n "Python" c.txt
- 6:Python
Example 4
- C:\Users\86137\Desktop>grep -vc "Python" c.txt
- 6
- C:\Users\86137\Desktop>grep -c "Python" c.txt
- 1
Example 5
- C:\Users\86137\Desktop>grep -o "ython" c.txt
- ython
Example 6
- C:\Users\86137\Desktop>grep -A 2 Python c.txt
- Python
- julia
- C:\Users\86137\Desktop>grep -B 2 Python c.txt
- plsql
- java
- Python
- C:\Users\86137\Desktop>grep -C 2 Python c.txt
- plsql
- java
- Python
- julia
Example 7
Fixed-string grep (fgrep, i.e. grep -F) and extended regular expressions (egrep, i.e. grep -E)
- C:\Users\86137\Desktop>grep -F "Python" c.txt
- Python
- C:\Users\86137\Desktop>grep -E "Python" c.txt
- Python
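Both commands in Example 7 print the same line because "Python" contains no regex metacharacters. The difference only shows up when the pattern contains one. A minimal sketch, assuming GNU grep in a POSIX shell; the two version lines are hypothetical sample data.

```shell
# Hypothetical input: one line has a literal dot, the other does not.
sample='version=3.8
version=3x8'

# -F treats "3.8" as a plain string, so only the line with a real dot matches.
fixed=$(printf '%s\n' "$sample" | grep -F '3.8')

# -E (like the default BRE) treats "." as "any single character",
# so both lines match; -c prints the number of matching lines.
extended=$(printf '%s\n' "$sample" | grep -c -E '3.8')

echo "$fixed"
echo "$extended"
```

This should print version=3.8 followed by 2.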
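Example 8
Using basic regular expressions
. : matches any single character.
[abc] : matches one character, which must be a, b, or c.
[a-zA-Z] : matches one character, which must be one of the 52 letters a-z or A-Z.
[^123] : matches one character that is anything other than 1, 2, or 3.
Some commonly used character sets have predefined names (the range forms are equivalent in the C/POSIX locale):
[A-Za-z] is equivalent to [[:alpha:]]
[0-9] is equivalent to [[:digit:]]
[A-Za-z0-9] is equivalent to [[:alnum:]]
Whitespace such as tab and space: [[:space:]]
[A-Z] is equivalent to [[:upper:]]
[a-z] is equivalent to [[:lower:]]
Punctuation: [[:punct:]]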
- C:\Users\86137\Desktop>grep "Py.hon" c.txt
- Python=3
- C:\Users\86137\Desktop>grep "[[:upper:]]ython" c.txt
- Python=3
- C:\Users\86137\Desktop>grep "Python[^[:upper:]][[:digit:]]" c.txt
- Python=3
- C:\Users\86137\Desktop>grep "Pytho[a-z]" c.txt
- Python=3
- C:\Users\86137\Desktop>grep "Pytho[a-z][[:punct:]]" c.txt
- Python=3
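The negated set [^123] and the [[:space:]] class from the list above are not exercised by these commands, so here is a small sketch of both. It assumes GNU grep in a POSIX shell, and the three input lines are made up for illustration.

```shell
# Hypothetical sample input.
sample='a1
a4
b 2'

# [^123] matches one character that is not 1, 2 or 3, so "a1" is rejected.
negated=$(printf '%s\n' "$sample" | grep 'a[^123]')

# letter, then whitespace, then digit: only "b 2" fits.
spaced=$(printf '%s\n' "$sample" | grep '[[:alpha:]][[:space:]][[:digit:]]')

echo "$negated"
echo "$spaced"
```

This should print a4 and then b 2.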
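Repetition counts:
\{m,n\} : matches the preceding item at least m and at most n times.
\? : matches the preceding item 0 or 1 times, equivalent to \{0,1\}.
* : matches the preceding item any number of times, equivalent to \{0,\}, so ".*" means any character any number of times, i.e. it matches anything.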
- C:\Users\86137\Desktop>grep "/.*sh" c.txt
- root:x:0:0:root:/root:/bin/bash
- shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
- sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
- C:\Users\86137\Desktop>grep "/.\{0,2\}sh" c.txt
- root:x:0:0:root:/root:/bin/bash
- shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
- sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
- C:\Users\86137\Desktop>grep -w ".\{0,2\}sh" c.txt
- root:x:0:0:root:/root:/bin/bash
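To see the three repetition operators side by side, the sketch below runs them against the same tiny input. It assumes GNU grep (\? is a GNU extension to BRE) in a POSIX shell; the words are hypothetical sample data, and -x forces the pattern to match the whole line so the counts are easy to check.

```shell
# Hypothetical sample input: the letter "u" appears 0, 1 and 2 times.
sample='color
colour
colouur'

# \? : "u" zero or one time -> color, colour
optional=$(printf '%s\n' "$sample" | grep -c -x 'colou\?r')

# * : "u" any number of times -> all three lines
any=$(printf '%s\n' "$sample" | grep -c -x 'colou*r')

# \{2,3\} : "u" two or three times -> colouur only
ranged=$(printf '%s\n' "$sample" | grep -x 'colou\{2,3\}r')

echo "$optional $any $ranged"
```

This should print 2 3 colouur.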
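Position anchors
^ : anchors the start of a line.
$ : anchors the end of a line. Tip: "^$" matches empty lines.
\b or \< : anchors the start of a word. For example, "\blike" does not match alike but does match liker.
\b or \> : anchors the end of a word. For example, "\blike\b" matches neither alike nor liker, only like.
\B : the opposite of \b; it matches where there is no word boundary.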
- C:\Users\86137\Desktop>grep "h$" c.txt
- root:x:0:0:root:/root:/bin/bash
- C:\Users\86137\Desktop>grep "\<sh" c.txt
- shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
- C:\Users\86137\Desktop>grep "\Bsh\b" c.txt
- root:x:0:0:root:/root:/bin/bash
- C:\Users\86137\Desktop>grep "\bsh\b" c.txt
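The like / alike / liker cases from the anchor descriptions can be checked directly. This is a rough sketch, assuming GNU grep (\b, \< and \B are GNU extensions) in a POSIX shell, where single quotes keep the shell from touching \ and $. The input, including its one empty line, is made up for illustration.

```shell
# Hypothetical sample input; the third line is empty.
sample='like
alike

liker'

# "^$" matches empty lines; -c counts them.
blank=$(printf '%s\n' "$sample" | grep -c '^$')

# \blike\b : only the standalone word "like".
whole=$(printf '%s\n' "$sample" | grep '\blike\b')

# \<like : "like" at the start of a word -> like and liker.
starts=$(printf '%s\n' "$sample" | grep -c '\<like')

# \Blike : "like" NOT at the start of a word -> alike.
inside=$(printf '%s\n' "$sample" | grep '\Blike')

echo "$blank $whole $starts $inside"
```

This should print 1 like 2 alike.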
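Grouping and back-references
\(string\) : treats string as a single unit so that it can be referenced later.
\1 : refers to the text matched by the 1st left parenthesis and its corresponding right parenthesis.
\2 : refers to the text matched by the 2nd left parenthesis and its corresponding right parenthesis.
\n : refers to the text matched by the nth left parenthesis and its corresponding right parenthesis.
Example:
Lines that start and end with the same letter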
- C:\Users\86137\Desktop>grep "^\([[:alpha:]]\).*\1$" c.txt
- nobody:x:99:99:Nobody:/:/sbin/nologin
- ntp:x:38:38::/etc/ntp:/sbin/nologin
- nscd:x:28:28:NSCD Daemon:/:/sbin/nologin
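The same back-reference can be written in both syntaxes. The sketch below assumes GNU grep in a POSIX shell and reuses two passwd-style lines that appear in this post's outputs as its input; the variable names are only for illustration.

```shell
# Two lines taken from the outputs shown in this post.
sample='nobody:x:99:99:Nobody:/:/sbin/nologin
root:x:0:0:root:/root:/bin/bash'

# BRE: \( \) captures the first letter, \1 requires the same letter at the end.
bre=$(printf '%s\n' "$sample" | grep '^\([[:alpha:]]\).*\1$')

# ERE: the same pattern with unescaped parentheses.
ere=$(printf '%s\n' "$sample" | grep -E '^([[:alpha:]]).*\1$')

echo "$bre"
echo "$ere"
```

Both commands should print only the nobody line, which starts and ends with n.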
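Extended regular expressions (note: to use extended regular expressions, add the -E option, or call egrep directly, which the help text above marks as deprecated):
(1) Matching characters: same as in basic regular expressions.
(2) Repetition counts:
* : same as in basic regular expressions.
? : written \? in basic regular expressions; here there is no backslash.
{m,n} : likewise, written without the backslashes used in basic regular expressions.
+ : matches the preceding item at least once, equivalent to {1,}.
(3) Position anchors: same as in basic regular expressions.
(4) Grouping and back-references:
(string) : likewise, written without the backslashes used in basic regular expressions.
\1 : back-references work the same as in basic regular expressions.
\n : back-references work the same as in basic regular expressions.
(5) Alternation:
a|b : matches a or b. Note that a means everything to the left of | and b everything to the right. For example, C|cat means C or cat, not Cat or cat; to express Cat or cat, write (C|c)at. Remember that (string) is used for grouping as well as for back-references.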
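The C|cat versus (C|c)at distinction is easy to get wrong, so here is a minimal sketch of it. It assumes GNU grep in a POSIX shell; the three input lines are hypothetical, and -x makes each pattern match the whole line so that partial matches do not hide the difference.

```shell
# Hypothetical sample input.
sample='Cat
cat
C'

# C|cat : either the whole line is "C" or it is "cat"; "Cat" is rejected.
loose=$(printf '%s\n' "$sample" | grep -x -E 'C|cat')

# (C|c)at : the alternation is limited to the first letter.
grouped=$(printf '%s\n' "$sample" | grep -x -E '(C|c)at')

echo "$loose"
echo "$grouped"
```

The first command should print cat and C; the second should print Cat and cat.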
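Notes
1: By default, regular expression matching is greedy, meaning it matches as much as possible. For example, if a line contains the string abacb and the search pattern is "a.*b", the match is the whole of abacb, not just ab or acb.
2: Regular expression metacharacters can be escaped with a backslash to match them literally. For example, to search for a literal * rather than have it mean "repeat the preceding item any number of times", write \*. The same applies to [ in both syntaxes. Parentheses differ: in basic regular expressions ( is already literal and \( starts a group, while in extended regular expressions ( starts a group and \( is literal.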
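Both notes can be verified with -o, which prints only the matched part of a line. A minimal sketch, assuming GNU grep in a POSIX shell; the 2*3 and 223 lines are made-up sample data.

```shell
# Greedy matching: "a.*b" takes the longest match, the whole of abacb.
greedy=$(printf 'abacb\n' | grep -o 'a.*b')

# Hypothetical sample input for escaping.
sample='2*3
223'

# \* is a literal asterisk, so only the line "2*3" matches.
literal=$(printf '%s\n' "$sample" | grep '2\*3')

# Unescaped, "2*" means "zero or more 2s", so both lines match.
unescaped=$(printf '%s\n' "$sample" | grep -c '2*3')

echo "$greedy $literal $unescaped"
```

This should print abacb 2*3 2.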
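Examples
(1) Search for numbers in 0-255. The pattern is not anchored, so it selects any line that contains a digit; add -w or -x to require a whole-word or whole-line match.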
- C:\Users\86137\Desktop>grep -E "[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]" c.txt
- IPADDR=127.0.0.1
- NETMASK=255.0.0.0
- NETWORK=127.0.0.0
- # If you're having problems with gated making 127.0.0.0/8 a martian,
- # you can change this to something else (255.255.255.255, for example)
- BROADCAST=127.255.255.255
(2) Search for IP addresses made up of numbers in 0-255
- C:\Users\86137\Desktop>grep -E "\b([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b" c.txt
- IPADDR=127.0.0.1
- NETMASK=255.0.0.0
- NETWORK=127.0.0.0
- # If you're having problems with gated making 127.0.0.0/8 a martian,
- # you can change this to something else (255.255.255.255, for example)
- BROADCAST=127.255.255.255
The same search, with the repeated group written as {3}:
- C:\Users\86137\Desktop>grep -E "\b(([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b" c.txt
- IPADDR=127.0.0.1
- NETMASK=255.0.0.0
- NETWORK=127.0.0.0
- # If you're having problems with gated making 127.0.0.0/8 a martian,
- # you can change this to something else (255.255.255.255, for example)
- BROADCAST=127.255.255.255
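The sample file above only contains valid addresses, so it does not show what the pattern rejects. The sketch below builds the same {3} pattern from a reusable octet variable and feeds it an out-of-range value. It assumes GNU grep in a POSIX shell; the variable names octet and ip_re and the BAD and VERSION lines are hypothetical.

```shell
# One octet, 0-255, exactly as in the patterns above.
octet='([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])'

# Three "octet." groups plus a final octet, bounded by \b on both sides.
ip_re="\\b(${octet}\\.){3}${octet}\\b"

# Hypothetical input: one valid address, one out-of-range octet, one non-IP.
sample='IPADDR=127.0.0.1
BAD=256.1.1.1
VERSION=1.2'

matches=$(printf '%s\n' "$sample" | grep -E "$ip_re")

# With -x the octet pattern must cover the whole line, so 256 is rejected.
in_range=$(printf '255\n256\n' | grep -x -E "$octet")

echo "$matches"
echo "$in_range"
```

This should print IPADDR=127.0.0.1 and then 255. Note that a longer dotted string such as 1.2.3.4.5 would still match, because \b only checks word boundaries.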