Using awk/sed to extract values from a JSON object, but can't get it to work
I have a file returned by a curl statement, formatted as JSON. Each object has a set of values, but the keys for those values have the same names in every object. Please see the code below. These objects are part of a larger object called workflow. The cleanup object is the last process that runs in the workflow. A JSON file in this format is created for every video that goes through the workflow. (There are more than these three objects; this is just for illustration.)
I want to get the completed value of the object with "description": "Cleaning" and store it in the variable $end_time. Then I want to get the completed value of another object and store it in the variable $start_time. Subtracting the two gives an integer time in milliseconds, so I can calculate how long the video took to get through this part of the process. I'm fine with the math part and know how to do it. What I'm struggling with is extracting the values.
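I hope this makes sense? Any help would be greatly appreciated. Thanks in advance.
Edit: I had to remove the original code from the post because of the character limit.
Here is a proper example of the file I have to work with: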
{
"workflows": {
"count": "20",
"searchTime": "1",
"startPage": "0",
"totalCount": "1",
"workflow": {
"configurations": {
"configuration": [
{
"$": "1409750880000",
"key": "schedule.start"
},
{
"$": "1409755980000",
"key": "schedule.stop"
},
{
"$": "Capture_agent",
"key": "schedule.location"
},
{
"$": "false",
"key": "trimHold"
},
{
"$": "true",
"key": "archiveOp"
},
{
"$": "false",
"key": "captionHold"
},
{
"$": "false",
"key": "videoPreview"
}
]
},
"creator": {
"organization": "mh_default_org",
"roles": [
"76b1bdde-a080-40a4-b929-bde89af6a0a8_Instructor",
"ROLE_ADMIN",
"ROLE_ANONYMOUS",
"ROLE_USER"
],
"userName": "user_name"
},
"description": "This workflow definition defines the steps involved in scheduling a recording, capturing it, and\n ingesting it, after which processing operations may be added.\n ",
"errors": "",
"id": "15518",
"mediapackage": {
"attachments": "",
"creators": {
"creator": "Name"
},
"id": "2d25ed19-2978-458d-a4a0-c9c56d791c68",
"license": "Creative Commons 3.0: Attribution-NonCommercial-NoDerivs",
"media": "",
"metadata": "",
"publications": {
"publication": {
"channel": "engage-player",
"id": "b7b68f91-2c33-4673-ba7c-2e9b891788f9",
"mimetype": "text/html",
"tags": "",
"url": "http://some.url.com:80/engage/ui/watch.html?id=2d25ed19-2978-458d-a4a0-c9c56d791c68"
}
},
"series": "76b1bdde-a080-40a4-b929-bde89af6a0a8",
"seriestitle": "Recording_Title_user_name",
"start": "2014-09-03T13:28:00Z",
"title": "Recording_Title"
},
"operations": {
"operation": [
{
"abortable": "false",
"completed": 1409750882092,
"configurations": {
"configuration": [
{
"$": "1409750880000",
"key": "schedule.start"
},
{
"$": "1409755980000",
"key": "schedule.stop"
},
{
"$": "Capture_agent",
"key": "schedule.location"
}
]
},
"continuable": "false",
"description": "Scheduled",
"execution-history": "",
"execution-host": "http://some.url.com:8080",
"fail-on-error": "true",
"failed-attempts": "0",
"hold-action-title": "View schedule",
"holdurl": "/workflow/hold/org.opencastproject.workflow.handler.scheduleworkflowoperationhandler",
"id": "schedule",
"job": "15519",
"max-attempts": "1",
"retry-strategy": "none",
"started": 1409750881745,
"state": "SUCCEEDED",
"time-in-queue": 0
},
{
"abortable": "false",
"configurations": "",
"continuable": "false",
"description": "Capture",
"execution-history": "",
"execution-host": "http://some.url.com:8080",
"fail-on-error": "true",
"failed-attempts": "0",
"hold-action-title": "Monitor capture",
"holdurl": "/workflow/hold/org.opencastproject.workflow.handler.captureworkflowoperationhandler",
"id": "capture",
"job": "42894",
"max-attempts": "1",
"retry-strategy": "none",
"started": 1409750884085,
"state": "SKIPPED",
"time-in-queue": 0
},
{
"completed": 1409756171224,
"configurations": "",
"description": "Ingest",
"execution-history": "",
"fail-on-error": "true",
"failed-attempts": "0",
"id": "ingest",
"max-attempts": "1",
"retry-strategy": "none",
"state": "SUCCEEDED"
},
{
"completed": 1409854379552,
"configurations": {
"configuration": {
"key": "preserve-flavors"
}
},
"description": "Cleaning up",
"execution-history": "",
"execution-host": "http://some.url.com:8080",
"fail-on-error": "false",
"failed-attempts": "0",
"id": "cleanup",
"job": "45113",
"max-attempts": "1",
"retry-strategy": "none",
"started": 1409854378128,
"state": "SUCCEEDED",
"time-in-queue": 0
}
]
},
"organization": {
"adminRole": "ROLE_ADMIN",
"anonymousRole": "ROLE_ANONYMOUS",
"id": "mh_default_org",
"name": "Opencast Project",
"properties": {
"property": [
{
"$": "true",
"key": "adminui.i18n_tab_episode.enable"
},
{
"$": "false",
"key": "adminui.i18n_tab_users.enable"
},
{
"$": "/engage/ui/img/mh_logos/OpencastLogo.png",
"key": "logo_small"
},
{
"$": "http://opencast.org/matterhorn/",
"key": "engageui.link_mobile_redirect.url"
},
{
"$": "false",
"key": "engageui.annotations.enable"
},
{
"$": "true",
"key": "engageui.links_media_module.enable"
},
{
"$": "2024",
"key": "adminui.chunksize"
},
{
"$": "false",
"key": "adminui.series_prepopulate.enable"
},
{
"$": "true",
"key": "engageui.link_download.enable"
},
{
"$": "false",
"key": "engageui.link_mobile_redirect.enable"
},
{
"$": "For more information have a look at the official site.",
"key": "engageui.link_mobile_redirect.description"
},
{
"$": "/engage/ui/img/mh_logos/MatterhornLogo_large.png",
"key": "logo_large"
}
]
},
"servers": {
"server": {
"name": "localhost",
"port": "8080"
}
}
},
"parent": {
"nil": "true"
},
"state": "SUCCEEDED",
"template": "full",
"title": "Scheduled Workflow"
}
}
}
Here is a jq example that should point you toward what you want:
#!/bin/bash
# Assuming the json is in a file workflow.json
end_time=$( jq '.workflows.workflow.operations.operation[] | select(.description == "Cleaning up") | .completed' < workflow.json )
start_time=$( jq '.workflows.workflow.operations.operation[] | select(.description == "Ingest") | .completed' < workflow.json )
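For comparison, the same lookup can be sketched with Python's standard json module. This is only an illustration, not part of the original answer: the inline SAMPLE string is a trimmed-down stand-in for workflow.json that keeps just the fields used here, and completed_at is a hypothetical helper name.

```python
import json

# Trimmed-down stand-in for workflow.json; only the fields used here are kept.
SAMPLE = """
{"workflows": {"workflow": {"operations": {"operation": [
  {"description": "Scheduled", "completed": 1409750882092},
  {"description": "Capture"},
  {"description": "Ingest", "completed": 1409756171224},
  {"description": "Cleaning up", "completed": 1409854379552}
]}}}}
"""


def completed_at(operations, description):
    """Return the 'completed' timestamp (ms) of the first operation
    with the given description, or None if there is no such value."""
    for op in operations:
        if op.get("description") == description:
            return op.get("completed")
    return None


data = json.loads(SAMPLE)
# Same path as the jq filter: .workflows.workflow.operations.operation[]
operations = data["workflows"]["workflow"]["operations"]["operation"]

start_time = completed_at(operations, "Ingest")
end_time = completed_at(operations, "Cleaning up")
elapsed_ms = end_time - start_time
print(start_time, end_time, elapsed_ms)
```

With the timestamps from the sample file, elapsed_ms comes out as 98208328. To read the real file instead of the inline string, json.load on an open file handle could replace json.loads(SAMPLE).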
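Try a JSON parser rather than awk or sed. awk and sed are built around regular expressions and, as you have found, you cannot parse a JSON structure with regular expressions. For more complex structures such as XML and JSON, you need something like Python or Perl, which have modules that can handle these data structures.
Thank you very much. I'll test this tomorrow and let you know!
I'm struggling a bit here. I've tried your command, but I keep getting an error. Command:
end_time=$(jq '.workflow[] | select(.description == "Cleaning up") | .completed'
Error: jq: error: Cannot iterate over null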
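I've added the full JSON file to the original post, because there weren't enough characters in this comment.
I updated the answer based on the new JSON, but I don't see "Extracting text…" in the example. If you search for that, you'll get an empty result. The "Cleaning up" example, however, works with the JSON you provided. The thing to remember is that the first part of the filter is the array you are looking in, and the original example didn't have the full path. I would try the jq command on the command line first to verify that it gives you what you need.
Thanks! I apologize for the confusion! I'm still learning what to post and what not to post. Sorry again! I'll try it tomorrow and get back to you. Thanks for your time and effort!
This works great on the sample file! Thank you very much. I saw the mistake I made, and it's fixed. That file is a sample of a much larger JSON file. Whenever I run this command on the larger file, which contains multiple "workflow" instances inside the parent "workflows", I get the following error: jq: error: Cannot index array with string. Googling led me to a post that didn't help. I know I have to adjust the command, but I don't know how. Sorry if this sounds stupid.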
$ jq '.workflows.workflow.operations.operation[] | select(.description == "Ingest") | .completed' < workflow.json
1406051539118
$ jq '.workflows.workflow.operations.operation[] | select(.description == "Cleaning up") | .completed' < workflow.json
1406051695440
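On the "Cannot index array with string" error in the last comment: the message suggests that in the larger file "workflow" is an array of workflow objects rather than a single object, so the path has to iterate over it (in jq that would presumably mean writing .workflows.workflow[] before .operations). The Python sketch below handles both shapes by wrapping a lone object in a list. It rests on assumptions: the real large file was not posted, the second workflow's id and timestamps are invented for illustration, and as_list and durations_ms are hypothetical helper names.

```python
def as_list(value):
    """Wrap a lone object in a list so single and multiple cases
    can be iterated the same way; treat None and "" as empty."""
    if value is None or value == "":
        return []
    return value if isinstance(value, list) else [value]


def durations_ms(data, start_desc="Ingest", end_desc="Cleaning up"):
    """Map workflow id -> (end 'completed' - start 'completed') in ms.
    Workflows missing either timestamp are skipped."""
    result = {}
    for wf in as_list(data["workflows"].get("workflow")):
        completed = {}
        # The sample uses "" for empty containers, so fall back to {}.
        operations = wf.get("operations") or {}
        for op in as_list(operations.get("operation")):
            if "completed" in op:
                completed[op.get("description")] = op["completed"]
        if start_desc in completed and end_desc in completed:
            result[wf.get("id")] = completed[end_desc] - completed[start_desc]
    return result


# Two workflows: the first uses values from the sample file,
# the second is made up purely for illustration.
many = {"workflows": {"workflow": [
    {"id": "15518", "operations": {"operation": [
        {"description": "Ingest", "completed": 1409756171224},
        {"description": "Cleaning up", "completed": 1409854379552}]}},
    {"id": "99999", "operations": {"operation": [
        {"description": "Ingest", "completed": 1000},
        {"description": "Cleaning up", "completed": 4500}]}},
]}}
# Same data shaped like the single-workflow sample file.
single = {"workflows": {"workflow": many["workflows"]["workflow"][0]}}

print(durations_ms(many))
print(durations_ms(single))
```

Normalizing with as_list means the same code runs on a file with one workflow and on a file with many, which avoids keeping two separate command variants.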