Regular expressions with grep

regex bash grep

So I have a bunch of data that all looks like this:

janitor#1/2 of dorm#1/1
president#4/1 of class#2/2
hunting#1/1 hat#1/2
side#1/2 of hotel#1/1
side#1/2 of hotel#1/1
king#1/2 of hotel#1/1
address#2/2 of girl#1/1
one#2/1 in family#2/2
dance#3/1 floor#1/2
movie#1/2 stars#5/1
movie#1/2 stars#5/1
insurance#1/1 office#1/2
side#1/1 of floor#1/2
middle#4/1 of December#1/2
movie#1/2 stars#5/1
one#2/1 of tables#2/2
people#1/2 at table#2/1

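Some lines have prepositions and some don't, so I figured I could use a regular expression to clean this up. What I need is each noun, the # sign, and the number that follows it, each on its own line. For example, in the final file, the first lines of output should look like this: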

janitor#1
dorm#1
president#4
etc...

The list is stored in a file named NPs. My code is:

cat NPs | grep -E '\b(\w*[#][1-9]).' >> test

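However, when I open test, it is exactly the same as the input file. Any pointers on what I'm missing? This doesn't seem like it should be a difficult operation, so maybe I'm missing something about the syntax? I'm using this command in a shell script invoked from bash.

Thanks in advance.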

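grep and its variants extract whole lines from the text if they match the pattern. If you need to modify the lines, you should use sed, for example: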

cat NPs | sed 's/^\(\b\w*[#][1-9]\).*$/\1/g'
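To make the difference concrete, here is a minimal sketch on one sample line from the question. The variable names are only illustrative, and the POSIX class [[:alnum:]_] stands in for \w so the sketch does not depend on GNU extensions.

```shell
line='janitor#1/2 of dorm#1/1'

# grep selects lines: a line that matches is printed unchanged.
selected=$(printf '%s\n' "$line" | grep -E '[[:alnum:]_]*#[1-9]')

# sed edits lines: the captured group replaces the whole line.
edited=$(printf '%s\n' "$line" | sed 's/^\([[:alnum:]_]*#[1-9]\).*$/\1/')

printf 'grep: %s\n' "$selected"
printf 'sed:  %s\n' "$edited"
```

Note that a substitution anchored with ^ like this keeps only the first item on each line, so the second noun still has to be handled separately.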

What you need is sed, not grep (or awk, or perl). This looks like it would do what you want:

cat NPs | sed 's?/.*??'

Or simply:

sed 's?/.*??' NPs

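The "s" means "substitute". The next character is the delimiter that separates the regular expression from the replacement. Usually it is "/", but since you need to search for "/", I used "?" instead. "." means any character, and "*" means "zero or more of the preceding item". Whatever sits between the last two delimiters is the replacement string. In this case it is empty, so you replace "/" followed by zero or more characters with the empty string.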
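As a quick check of the delimiter point, this small sketch runs the same substitution twice on one sample line, once with "?" as the delimiter and once with "/" (where the slash in the pattern has to be escaped). The variable names are made up for the example.

```shell
line='janitor#1/2 of dorm#1/1'

# "?" as delimiter: the "/" in the pattern needs no escaping.
with_question=$(printf '%s\n' "$line" | sed 's?/.*??')

# "/" as delimiter: the "/" in the pattern must be written as "\/".
with_slash=$(printf '%s\n' "$line" | sed 's/\/.*//')

printf '%s\n%s\n' "$with_question" "$with_slash"
```

Both commands print janitor#1, because ".*" is greedy and removes everything from the first "/" to the end of the line.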

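Edit: Oh, I see now, you also want to extract the last item on each line. Well, I'm sure the regexps others have suggested will work. If this were my problem, I would probably filter the file in two steps, either piping the result of one step into the next or using multiple substitutions in sed: first remove the "of" and the spaces around it and insert a newline, then run sed as above. It's not as cool as doing everything in a single regexp, but each step is easier to understand. To be even simpler and less cool, use three steps, replacing "of" with a space in the first one. Since others have provided complete solutions, I won't work out the details.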
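For what it's worth, one way the step-by-step idea could look is sketched below. This is not the exact sequence described above: it swaps in tr and grep for the first steps, so that "in" and "at" are handled the same way as "of". The two sample lines come from the question, and the variable name is illustrative.

```shell
# Step 1 (tr):   put every word on its own line.
# Step 2 (grep): keep only words containing "#", which drops "of", "in", "at".
# Step 3 (sed):  delete "/" and everything after it.
result=$(printf 'janitor#1/2 of dorm#1/1\nhunting#1/1 hat#1/2\n' |
  tr ' ' '\n' |
  grep '#' |
  sed 's?/.*??')

printf '%s\n' "$result"
```

Each stage can be run on its own to inspect the intermediate output, which is the main appeal of splitting the work up.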

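By default grep only searches the text, so in your case it prints the matching lines. I think you should look into sed to perform the substitution. (You also don't need to cat the file; grep pattern filename is enough.)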

To get the output on separate lines, this worked for me:

sed 's|/.||g' NPs | sed 's/ .. / /' | tr " " "\n"

This uses two sed commands in one pipeline for the different substitutions, and tr to insert the newlines.
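The two substitutions can also be handed to a single sed with -e. The sketch below does that on two sample lines from the question. It assumes, as in the sample data, that every preposition is exactly two letters long, and it turns the preposition into a single space so that tr splits every line on spaces whether or not it had a preposition. The variable name is illustrative.

```shell
split=$(printf 'janitor#1/2 of dorm#1/1\nhunting#1/1 hat#1/2\n' |
  sed -e 's|/.||g' -e 's/ .. / /' |   # drop "/N", then collapse " of " to " "
  tr ' ' '\n')                        # one item per line

printf '%s\n' "$split"
```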

The -o option in grep, which makes it print only the matched text, as mentioned in another answer, is probably simpler.

This should do what you need. The -o option shows only the part of a matching line that matches the pattern:

grep -Eo '[A-Za-z#]+[1-9]' NPs > test

Or even with the -P option, which interprets the pattern as a Perl regular expression:

grep -Po '[\w#]*(?=/)' NPs > test
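One thing to watch with -o is the character class: the data contains "December", so a class limited to lowercase letters would drop the capital D. A hedged variant using POSIX classes, tried on that one line from the question (the variable name is illustrative):

```shell
# [[:alpha:]]+ matches upper- and lowercase letters; [0-9]+ allows multi-digit numbers.
nouns=$(printf 'middle#4/1 of December#1/2\n' | grep -Eo '[[:alpha:]]+#[0-9]+')

printf '%s\n' "$nouns"
```

The "/1" and "/2" parts are skipped because they have no letters followed by "#".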

Using grep:

$ grep -o "\w*[#]\w*" inputfile
janitor#1
dorm#1
president#4
class#2
hunting#1
hat#1
side#1
hotel#1
side#1
hotel#1
king#1
hotel#1
address#2
girl#1
one#2
family#2
dance#3
floor#1
movie#1
stars#5
movie#1
stars#5
insurance#1
office#1
side#1
floor#1
middle#4
December#1
movie#1
stars#5
one#2
tables#2
people#1
table#2

An awk version:

awk '/#/ {print $NF}' RS="/" NPs
janitor#1
dorm#1
president#4
class#2
hunting#1
hat#1
side#1
hotel#1
side#1
hotel#1
king#1
hotel#1
address#2
girl#1
one#2
family#2
dance#3
floor#1
movie#1
stars#5
movie#1
stars#5
insurance#1
office#1
side#1
floor#1
middle#4
December#1
movie#1
stars#5
one#2
tables#2
people#1
table#2
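In case it is not obvious why this works: setting RS="/" makes awk split the input into records at every slash instead of at every newline, so each record ends with a noun#N item, and with the default field separator a newline inside a record counts as whitespace. A small sketch on two lines from the question (the variable name is illustrative):

```shell
# Records are: "janitor#1", "2 of dorm#1", "1\npresident#4", "1 of class#2", "2\n".
# Only records containing "#" are printed, and $NF is the last field of each.
items=$(printf 'janitor#1/2 of dorm#1/1\npresident#4/1 of class#2/2\n' |
  awk -v RS='/' '/#/ { print $NF }')

printf '%s\n' "$items"
```

The trailing record holds only a digit and a newline, so the /#/ test filters it out.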

Nice, this is a much cleaner solution than I had imagined. Thanks.