Linux: How to extract a field by name rather than by fixed column from JSON-like text?

linux shell command-line sed

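I want to extract a substring from a text file, line by line. The information I need sits under a specific field. For example, I have the following text: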

{name:x, version:1.0, info:"test", ...}
{name:y, version:0.1, info:"test again", ...}
{name:z, version:1.1, info:"test over", ...}

I tried to extract all the versions with the following command:

cut -d',' -f 2 <file name> | cut -d':' -f 2 > <output>

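The command above reports incorrect versions. Is there a way to extract the information by field name rather than by column?

Expected result: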

1.0
0.1
1.1
1.2

With the -P (PCRE regex) and -o (only-matching) options of GNU grep, you can do the following:
$ cat file
{name:x, version:1.0, info:"test", ...}
{name:y, version:0.1, info:"test again", ...}
{name:z, version:1.1, info:"test over", ...}
{name:x, info: "test", ..., version=1.2, ...}
$ grep -oP '(?<=version.)[^,}]+' file
1.0
0.1
1.1
1.2

Using this awk:
awk -v f='version' -F ' *[{}:=,] *| +' '{for (i=2; i<=NF; i++) if ($(i-1)==f) 
   {print $i; break}}' file
1.0
0.1
1.1
1.2
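
The one-liner above can be wrapped so that the field name becomes an argument. This is only a minimal sketch and not part of the original answer: get_field is a hypothetical helper name, and it assumes the same separators (braces, colons, equals signs, commas and spaces) as the awk command above.

```shell
#!/bin/sh
# get_field NAME [FILE...]: print the value that follows field NAME on each line.
# get_field is a hypothetical helper name, not part of any standard tool.
get_field() {
    name=$1
    shift
    # Same field separator as the awk answer: punctuation with optional
    # surrounding spaces, or a run of spaces.
    awk -v f="$name" -F ' *[{}:=,] *| +' '
        {
            for (i = 2; i <= NF; i++)
                if ($(i - 1) == f) { print $i; break }
        }' "$@"
}

# With no file argument, awk reads standard input.
printf '%s\n' \
    '{name:x, version:1.0, info:"test"}' \
    '{name:y, info:"test again", version:0.1}' |
    get_field version
# prints:
# 1.0
# 0.1
```

Because the comparison is an exact string match on the preceding field, a name such as oldversion is not mistaken for version. Values that contain spaces are split by the separator, so this sketch suits simple values like version numbers rather than quoted multi-word strings.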

Using sed:
$ sed 's/.*version:\([^,}]*\).*/\1/' file
1.0
0.1
1.1
1.2
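
One caveat, offered as a sketch rather than a correction: when a line has no version: field, the substitution does not apply and sed prints that line unchanged. Assuming you would rather skip such lines, -n together with the p flag prints only the lines where the substitution succeeded. The function name versions_only and the sample lines are made up for illustration.

```shell
#!/bin/sh
# versions_only is a hypothetical wrapper; it prints only lines where the
# substitution matched (-n suppresses default output, p prints on success).
versions_only() {
    sed -n 's/.*version:\([^,}]*\).*/\1/p' "$@"
}

printf '%s\n' \
    '{name:x, version:1.0, info:"test"}' \
    '{name:y, info:"no such field"}' \
    '{name:z, info:"test over", version:1.1}' |
    versions_only
# prints:
# 1.0
# 1.1
```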

sed again:
sed 's/^.*version://; s/[,}].*//' < file


1.0
0.1
1.1
1.2
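This assumes that version numbers contain only dots and digits, and that no value contains the text version: internally.

This works for me: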
[root@giam20 ~]# cut -f2 -d "," sample.txt | cut -f2 -d ":"
1.0
0.1
1.1
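
Note that this works only as long as version is the second comma-separated field. A quick check, using a made-up line in which the field order differs, reproduces the incorrect output the question describes. second_field is a hypothetical wrapper name around the same cut pipeline.

```shell
#!/bin/sh
# second_field is a hypothetical wrapper around the cut pipeline above:
# take the 2nd comma-separated field, then the part after the first colon.
second_field() {
    cut -f2 -d "," "$@" | cut -f2 -d ":"
}

printf '%s\n' \
    '{name:x, version:1.0, info:"test"}' \
    '{name:x, info:"test", version:1.2}' |
    second_field
# prints:
# 1.0
# "test"
```

The second line yields the info value rather than the version, which is why the other answers match on the field name instead.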

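Extracting field data with grep and PCRE
If you have pcregrep installed, or your grep was compiled with PCRE support, you can grep for the field you want. For example: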
# grep with PCRE support
$ grep -Po 'version:\K[^,}]+' /tmp/corpus
1.0
0.1
1.1
1.2

# pcregrep doesn't need the -P flag
$ pcregrep -o 'version:\K[^,}]+' /tmp/corpus
1.0
0.1
1.1
1.2

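Either way, the match starts by finding the version field, \K then discards everything consumed so far so that the match holds only the field data, and the rest of the pattern matches anything other than a comma or a closing brace. The -o flag tells grep to print only the resulting match rather than the whole line.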
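
As a sketch of how the pattern might be tightened (an assumption about your data, not part of the original answer): \b keeps oldversion from matching, and \s* tolerates spaces around the colon. It still requires a grep built with PCRE support, and it does not skip text inside quoted strings. strict_version is a hypothetical helper name.

```shell
#!/bin/sh
# strict_version is a hypothetical helper name.
# \b       : "version" must start at a word boundary (rejects oldversion)
# \s*:\s*  : optional whitespace around the colon
# \K       : drop everything matched so far from the reported match
strict_version() {
    grep -Po '\bversion\s*:\s*\K[^,}\s]+' "$@"
}

printf '%s\n' \
    '{name: a, oldversion: 0.1, version: 1.1}' \
    '{name:b, version:1.2, oldversion:0.2}' |
    strict_version
# prints:
# 1.1
# 1.2
```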
No PCRE in grep? Just use Perl
If Perl-Compatible Regular Expressions (PCRE) were not compiled into your grep, you should still be able to use Perl itself. Using Perl:
# NB: Avoid speed penalty for $& when perl > 5.10.0 && perl < 5.20.0.
# Use $& and remove the /p flag if you don't have (or need) the
# ${^MATCH} variable.
$ perl -ne 'print "${^MATCH}\n" if /version:\K[^,}]+/p' /tmp/corpus
1.0
0.1
1.1
1.2

# Use the $& special variable when ${^MATCH} isn't available, or when
# using a version without the speed penalty.
$ perl -ne 'print "$&\n" if /version:\K[^,}]+/' /tmp/corpus 
1.0
0.1
1.1
1.2

This Perl one-liner:
perl -nE 'say $3 if m/^\s*{ (([^"]|"[^"]*")*)* \bversion\s*:\s* ([\d.]*)/x' 

will

not match a version: 2.2 that appears inside quotes

not match strings such as oldversion:1.2

So, for the following input:
{name: a, version: 1.1, info: "the version: 9.1 is better", oldversion: 0.1}
{name: b, version: 1.2, oldversion: 0.2, info: "the version: 9.2 is better"}
{name: c, info: "the version: 9.3 is better", version: 1.3, oldversion: 0.3}
{name: d, info: "the version: 9.4 is better", oldversion: 0.4, version: 1.4}

it will print:
1.1
1.2
1.3
1.4
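
If Perl is not an option, a similar result can be approximated with sed in two steps. This is a different technique from the regex above, sketched under two assumptions: quoted strings contain no escaped quotes, and the field name is always preceded by {, a comma or a space. It first deletes every quoted string, then extracts the version. unquoted_version is a hypothetical name.

```shell
#!/bin/sh
# unquoted_version is a hypothetical helper name.
# Step 1: remove every "..." string so text inside quotes cannot match.
# Step 2: capture digits and dots after a standalone "version" field name;
#         the required {, comma or space before it rejects oldversion.
unquoted_version() {
    sed -n \
        -e 's/"[^"]*"//g' \
        -e 's/.*[{, ]version *: *\([0-9.]*\).*/\1/p' "$@"
}

printf '%s\n' \
    '{name: a, version: 1.1, info: "the version: 9.1 is better", oldversion: 0.1}' \
    '{name: d, info: "the version: 9.4 is better", oldversion: 0.4, version: 1.4}' |
    unquoted_version
# prints:
# 1.1
# 1.4
```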

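Comments:

Do you have both : and = occurring randomly as field separators in your input?

@eleven, can you show the expected result?

Sorry, that was a typo; all of them should be ":".

Great, all the answers work.

@eleven, the answers should have 3 more.

It will also match oldversion:1.2.

@jm666 That is not part of the corpus. There are always edge cases. If you need to refine it further, you can add \b or \s at the start of the match, but that is just clutter for the example the OP gave. YMMV.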