Python: extracting values from the last row of a table in a shell script
I have a file (data.txt) with the content shown below. It has multiple rows separated by sequences of - characters, so it looks like a table drawn in text. The first row contains all the column names, and every other row holds the actual data for those columns.
Connecting to the ControlService endpoint
Found 3 rows.
Requests List:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Client ID | Client Type | Service Type | Status | Trust Domain | Data Instance Name | Data Version | Creation Time | Last Update | Scheduled Time |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
REFRESH_ROUTINGTIER_ARTIFACTS_1465901168866 | ROUTINGTIER_ARTIFACTS | SYSTEM | COMPLETED | RRA Bulk Client | soa_server1 | 18.2.2.0.0 | 2016-06-14 03:49:55 -07:00 | 2016-06-14 03:49:57 -07:00 | --- |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
500333443 | CREATE | [FA_GSI] | COMPLETED | holder | soa_server1 | 18.3.2.0.0 | 2018-08-07 11:59:57 -07:00 | 2018-08-07 12:04:37 -07:00 | --- |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
500333446 | CREATE | [FA_GSI] | COMPLETED | holder-test | soa_server1 | 18.3.2.0.0 | 2018-08-07 12:04:48 -07:00 | 2018-08-07 12:08:52 -07:00 | --- |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
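Now I want to parse the file above and extract values from the last row. Specifically, I want the values of the "Client ID" and "Trust Domain" columns in the last row, i.e.: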
Client ID: 500333446
Trust Domain: holder-test
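Can this be done in a shell script, Perl, or Python?
Yes, it can be done in Python. I suggest using the csv module with the delimiter set to "|":
import csv

# newline='' is what the csv module documentation recommends for files passed to csv.reader
with open('data.txt', 'r', newline='') as f:
    reader = csv.reader(f, delimiter='|')
    for row in reader:
        print(row)  # each row is a list of the "|"-separated fields
This gives the following lists: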
['Connecting to the ControlService endpoint']
[]
['Found 3 rows.']
['Requests List:']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
[' Client ID ', ' Client Type ', ' Service Type ', ' Status ', ' Trust Domain ', ' Data Instance Name ', ' Data Version ', ' Creation Time ', ' Last Update ', ' Scheduled Time ', ' ']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
[' REFRESH_ROUTINGTIER_ARTIFACTS_1465901168866 ', ' ROUTINGTIER_ARTIFACTS ', ' SYSTEM ', ' COMPLETED ', ' RRA Bulk Client ', ' soa_server1 ', ' 18.2.2.0.0 ', ' 2016-06-14 03:49:55 -07:00 ', ' 2016-06-14 03:49:57 -07:00 ', ' --- ', ' ']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
[' 500333443 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 11:59:57 -07:00 ', ' 2018-08-07 12:04:37 -07:00 ', ' --- ', ' ']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
[' 500333446 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder-test ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 12:04:48 -07:00 ', ' 2018-08-07 12:08:52 -07:00 ', ' --- ', ' ']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
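Each list above still carries the padding spaces around every value, plus one extra element at the end caused by the trailing "|". A minimal sketch of one way to tidy that up is below. The helper name `clean_rows` and the shortened `SAMPLE` text are my own assumptions for illustration, not part of the original answer.

```python
import csv
import io

# Shortened stand-in for data.txt (assumption: same layout, fewer columns)
SAMPLE = """Requests List:
-----
 Client ID | Trust Domain |
-----
 500333446 | holder-test |
-----
"""


def clean_rows(stream):
    """Yield the stripped fields of each table row, skipping other lines."""
    for row in csv.reader(stream, delimiter="|"):
        if len(row) < 2:
            # Banner lines, blank lines and dash separators contain no "|"
            continue
        cells = [cell.strip() for cell in row]
        if cells[-1] == "":
            cells.pop()  # drop the empty field after the trailing "|"
        yield cells


if __name__ == "__main__":
    for cells in clean_rows(io.StringIO(SAMPLE)):
        print(cells)
```

Filtering on the number of fields avoids having to count how many banner lines come before the table.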
You can easily skip the first 4 lines in the resulting list.
The solution provided by @paragbaxi is good; I would just add a condition to filter out the rows that contain only "----". Like this:
import csv

lines_to_skip = 4  # banner lines before the table starts
with open('data.txt', 'r', newline='') as f:
    reader = csv.reader(f, delimiter='|')
    for i in range(lines_to_skip):
        next(reader)  # skip the banner lines
    data = []
    for line in reader:
        # Keep only non-empty rows that do not start with the "---" separator
        if line and not line[0].startswith("---"):
            print(line)
            data.append(line)
print("Last row:\n{}".format(data[-1]))
# replace() just removes the unnecessary spaces
print("Client ID:{} Domain:{}".format(data[-1][0].replace(" ", ""), data[-1][4].replace(" ", "")))
Output:
[' Client ID ', ' Client Type ', ' Service Type ', ' Status ', ' Trust Domain ', ' Data Instance Name ', ' Data Version ', ' Creation Time ', ' Last Update ', ' Scheduled Time ', ' ']
[' REFRESH_ROUTINGTIER_ARTIFACTS_1465901168866 ', ' ROUTINGTIER_ARTIFACTS ', ' SYSTEM ', ' COMPLETED ', ' RRA Bulk Client ', ' soa_server1 ', ' 18.2.2.0.0 ', ' 2016-06-14 03:49:55 -07:00 ', ' 2016-06-14 03:49:57 -07:00 ', ' --- ', ' ']
[' 500333443 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 11:59:57 -07:00 ', ' 2018-08-07 12:04:37 -07:00 ', ' --- ', ' ']
[' 500333446 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder-test ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 12:04:48 -07:00 ', ' 2018-08-07 12:08:52 -07:00 ', ' --- ', ' ']
Last row:
[' 500333446 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder-test ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 12:04:48 -07:00 ', ' 2018-08-07 12:08:52 -07:00 ', ' --- ', ' ']
Client ID:500333446 Domain:holder-test
Process finished with exit code 0
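The snippet above picks the columns by position (0 and 4). If the column order might change, one alternative is to pair the header row with the last data row and look the values up by name. This is only a sketch: `last_record` and the shortened `SAMPLE` are hypothetical names and data I made up for the example, and it assumes the first row containing "|" is the header.

```python
import csv
import io

# Shortened stand-in for data.txt (assumption: same layout, fewer columns)
SAMPLE = """Connecting to the ControlService endpoint
Requests List:
----------
 Client ID | Client Type | Service Type | Status | Trust Domain |
----------
 500333443 | CREATE | [FA_GSI] | COMPLETED | holder |
----------
 500333446 | CREATE | [FA_GSI] | COMPLETED | holder-test |
----------
"""


def last_record(stream):
    """Return the last data row as a dict keyed by column name."""
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(stream, delimiter="|")
        if len(row) > 1  # only table rows contain "|"
    ]
    if len(rows) < 2:
        raise ValueError("no data rows found")
    header, last = rows[0], rows[-1]
    # Skip the empty column produced by the trailing "|"
    return {name: value for name, value in zip(header, last) if name}


if __name__ == "__main__":
    record = last_record(io.StringIO(SAMPLE))
    print("Client ID:", record["Client ID"])
    print("Trust Domain:", record["Trust Domain"])
```

With a real file you would pass the open file object instead of `io.StringIO(SAMPLE)`.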
One in awk:
awk 'BEGIN{FS="|"}!/^-+/{c=$1;t=$5}END{print "Client ID:" c ORS "Trust Domain:" t}' file
Explanation:
$ awk '
BEGIN { FS="|" }       # use the pipe as field separator
!/^-+/ {               # process the line if it does not start with dashes
    c=$1               # client ID value
    t=$5               # trust domain value
}
END {                  # after the last line
    print "Client ID:" c ORS "Trust Domain:" t   # output the last value pair
}' file
Output:
Client ID: 500333446
Trust Domain: holder-test
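For comparison, the same single-pass idea can be sketched in Python: read line by line, ignore the dash separators, and keep overwriting two variables so that only the last row's values survive. The function name `last_id_and_domain` and the `SAMPLE` text are assumptions for illustration. Unlike the awk version, this sketch also skips lines with fewer than five fields and strips the padding spaces.

```python
import io

# Shortened stand-in for data.txt (assumption: same layout, fewer columns)
SAMPLE = """Found 2 rows.
Requests List:
----------
 Client ID | Client Type | Service Type | Status | Trust Domain |
----------
 500333443 | CREATE | [FA_GSI] | COMPLETED | holder |
----------
 500333446 | CREATE | [FA_GSI] | COMPLETED | holder-test |
----------
"""


def last_id_and_domain(lines):
    """Return (client_id, trust_domain) taken from the last table row."""
    client_id = trust_domain = None
    for line in lines:
        if line.startswith("-"):
            continue  # separator row, same role as !/^-+/ in awk
        fields = line.split("|")
        if len(fields) >= 5:
            client_id = fields[0].strip()     # awk's $1
            trust_domain = fields[4].strip()  # awk's $5
    return client_id, trust_domain


if __name__ == "__main__":
    cid, domain = last_id_and_domain(io.StringIO(SAMPLE))
    print("Client ID: {}".format(cid))
    print("Trust Domain: {}".format(domain))
```

Because nothing is stored except the two variables, this works the same way on large files.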
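If the problem is solved, please mark one of the answers as correct.
Please avoid "give me the code" questions. Instead, show the script you are working on and explain where the problem is.
Yes, that makes sense, because now I can see what the last row is. Now, how do I extract the values of Client ID and Trust Domain from the last row? Basically, it should give me 500333446 and holder-test.
@flash Yes, check my edited answer. In that snippet, .replace(" ","") removes the spaces so the data looks more compact. You could also remove them with substrings.
That works, thanks for your help. Also, if I only need to print the domain value without its key, would it be something like print(".".format(data[-1][4].replace(" ",""))) ? I tried it and it didn't print the value. Basically I want to print holder-test and nothing else.
You are using print incorrectly. It should be print("{}".format(data[-1][4].replace(" ",""))), where {} marks the position at which the value is inserted into the string.
Yes, {} is a placeholder. I fixed it after checking again.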
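To illustrate the point about the {} placeholder, here is a small self-contained sketch. The variable names are mine. It also shows a side effect of replace(" ", "") that is worth knowing about: it removes spaces inside a value as well, whereas strip() only trims the ends.

```python
value = " holder-test "

# Without a placeholder, format() has nowhere to put the argument,
# so the template string comes back unchanged.
no_placeholder = ".".format(value.replace(" ", ""))
print(repr(no_placeholder))  # '.'

# With {} the argument is inserted at that position.
with_placeholder = "{}".format(value.replace(" ", ""))
print(with_placeholder)  # holder-test

# replace(" ", "") also removes inner spaces; strip() only trims the ends.
padded = " RRA Bulk Client "
print(padded.replace(" ", ""))  # RRABulkClient
print(padded.strip())           # RRA Bulk Client
```

For the two columns asked about here both methods give the same result, since neither value contains a space.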