Extracting values from a JSON object with awk/sed, but I can't get it to work


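I have a file containing the response from a curl call, formatted as JSON. Each object has a set of values, but the keys have the same names in every object. See the code below.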

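These objects are part of a larger object called workflow. The cleanup object is the last process to run in the workflow. A JSON file in this format is created for every video that goes through the workflow. (There are more than these three objects; this is just for illustration.)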

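I want to get the completed value of the object with "description": "Cleaning" and store it in the variable $end_time. Then I want to get the completed value of another object and store it in the variable $start_time. I will then subtract the two values to get an integer time in milliseconds, so that I can work out how long the video took to get through this part of the process. I'm fine with the math part and know how to do it. What I'm struggling with is extracting the values.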

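I hope this makes sense. Any help would be greatly appreciated. Thanks in advance.

Edit: I had to remove the original code from the post because of the character limit.

Here is a proper example of the file I have to work with: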

{
    "workflows": {
        "count": "20", 
        "searchTime": "1", 
        "startPage": "0", 
        "totalCount": "1", 
        "workflow": {
            "configurations": {
                "configuration": [
                    {
                        "$": "1409750880000", 
                        "key": "schedule.start"
                    }, 
                    {
                        "$": "1409755980000", 
                        "key": "schedule.stop"
                    }, 
                    {
                        "$": "Capture_agent", 
                        "key": "schedule.location"
                    }, 
                    {
                        "$": "false", 
                        "key": "trimHold"
                    }, 
                    {
                        "$": "true", 
                        "key": "archiveOp"
                    }, 
                    {
                        "$": "false", 
                        "key": "captionHold"
                    }, 
                    {
                        "$": "false", 
                        "key": "videoPreview"
                    }
                ]
            }, 
            "creator": {
                "organization": "mh_default_org", 
                "roles": [
                    "76b1bdde-a080-40a4-b929-bde89af6a0a8_Instructor", 
                    "ROLE_ADMIN", 
                    "ROLE_ANONYMOUS", 
                    "ROLE_USER"
                ], 
                "userName": "user_name"
            }, 
            "description": "This workflow definition defines the steps involved in scheduling a recording, capturing it, and\n    ingesting it, after which processing operations may be added.\n  ", 
            "errors": "", 
            "id": "15518", 
            "mediapackage": {
                "attachments": "", 
                "creators": {
                    "creator": "Name"
                }, 
                "id": "2d25ed19-2978-458d-a4a0-c9c56d791c68", 
                "license": "Creative Commons 3.0: Attribution-NonCommercial-NoDerivs", 
                "media": "", 
                "metadata": "", 
                "publications": {
                    "publication": {
                        "channel": "engage-player", 
                        "id": "b7b68f91-2c33-4673-ba7c-2e9b891788f9", 
                        "mimetype": "text/html", 
                        "tags": "", 
                        "url": "http://some.url.com:80/engage/ui/watch.html?id=2d25ed19-2978-458d-a4a0-c9c56d791c68"
                    }
                }, 
                "series": "76b1bdde-a080-40a4-b929-bde89af6a0a8", 
                "seriestitle": "Recording_Title_user_name", 
                "start": "2014-09-03T13:28:00Z", 
                "title": "Recording_Title"
            }, 
            "operations": {
                "operation": [
                    {
                        "abortable": "false", 
                        "completed": 1409750882092, 
                        "configurations": {
                            "configuration": [
                                {
                                    "$": "1409750880000", 
                                    "key": "schedule.start"
                                }, 
                                {
                                    "$": "1409755980000", 
                                    "key": "schedule.stop"
                                }, 
                                {
                                    "$": "Capture_agent", 
                                    "key": "schedule.location"
                                }
                            ]
                        }, 
                        "continuable": "false", 
                        "description": "Scheduled", 
                        "execution-history": "", 
                        "execution-host": "http://some.url.com:8080", 
                        "fail-on-error": "true", 
                        "failed-attempts": "0", 
                        "hold-action-title": "View schedule", 
                        "holdurl": "/workflow/hold/org.opencastproject.workflow.handler.scheduleworkflowoperationhandler", 
                        "id": "schedule", 
                        "job": "15519", 
                        "max-attempts": "1", 
                        "retry-strategy": "none", 
                        "started": 1409750881745, 
                        "state": "SUCCEEDED", 
                        "time-in-queue": 0
                    }, 
                    {
                        "abortable": "false", 
                        "configurations": "", 
                        "continuable": "false", 
                        "description": "Capture", 
                        "execution-history": "", 
                        "execution-host": "http://some.url.com:8080", 
                        "fail-on-error": "true", 
                        "failed-attempts": "0", 
                        "hold-action-title": "Monitor capture", 
                        "holdurl": "/workflow/hold/org.opencastproject.workflow.handler.captureworkflowoperationhandler", 
                        "id": "capture", 
                        "job": "42894", 
                        "max-attempts": "1", 
                        "retry-strategy": "none", 
                        "started": 1409750884085, 
                        "state": "SKIPPED", 
                        "time-in-queue": 0
                    }, 
                    {
                        "completed": 1409756171224, 
                        "configurations": "", 
                        "description": "Ingest", 
                        "execution-history": "", 
                        "fail-on-error": "true", 
                        "failed-attempts": "0", 
                        "id": "ingest", 
                        "max-attempts": "1", 
                        "retry-strategy": "none", 
                        "state": "SUCCEEDED"
                    },                     
                    {
                        "completed": 1409854379552, 
                        "configurations": {
                            "configuration": {
                                "key": "preserve-flavors"
                            }
                        }, 
                        "description": "Cleaning up", 
                        "execution-history": "", 
                        "execution-host": "http://some.url.com:8080", 
                        "fail-on-error": "false", 
                        "failed-attempts": "0", 
                        "id": "cleanup", 
                        "job": "45113", 
                        "max-attempts": "1", 
                        "retry-strategy": "none", 
                        "started": 1409854378128, 
                        "state": "SUCCEEDED", 
                        "time-in-queue": 0
                    }
                ]
            }, 
            "organization": {
                "adminRole": "ROLE_ADMIN", 
                "anonymousRole": "ROLE_ANONYMOUS", 
                "id": "mh_default_org", 
                "name": "Opencast Project", 
                "properties": {
                    "property": [
                        {
                            "$": "true", 
                            "key": "adminui.i18n_tab_episode.enable"
                        }, 
                        {
                            "$": "false", 
                            "key": "adminui.i18n_tab_users.enable"
                        }, 
                        {
                            "$": "/engage/ui/img/mh_logos/OpencastLogo.png", 
                            "key": "logo_small"
                        }, 
                        {
                            "$": "http://opencast.org/matterhorn/", 
                            "key": "engageui.link_mobile_redirect.url"
                        }, 
                        {
                            "$": "false", 
                            "key": "engageui.annotations.enable"
                        }, 
                        {
                            "$": "true", 
                            "key": "engageui.links_media_module.enable"
                        }, 
                        {
                            "$": "2024", 
                            "key": "adminui.chunksize"
                        }, 
                        {
                            "$": "false", 
                            "key": "adminui.series_prepopulate.enable"
                        }, 
                        {
                            "$": "true", 
                            "key": "engageui.link_download.enable"
                        }, 
                        {
                            "$": "false", 
                            "key": "engageui.link_mobile_redirect.enable"
                        }, 
                        {
                            "$": "For more information have a look at the official site.", 
                            "key": "engageui.link_mobile_redirect.description"
                        }, 
                        {
                            "$": "/engage/ui/img/mh_logos/MatterhornLogo_large.png", 
                            "key": "logo_large"
                        }
                    ]
                }, 
                "servers": {
                    "server": {
                        "name": "localhost", 
                        "port": "8080"
                    }
                }
            }, 
            "parent": {
                "nil": "true"
            }, 
            "state": "SUCCEEDED", 
            "template": "full", 
            "title": "Scheduled Workflow"
        }
    }
}

Here is a jq example that should point you toward what you want:

#!/bin/bash
# Assuming the json is in a file workflow.json
end_time=$( jq '.workflows.workflow.operations.operation[] | select(.description == "Cleaning up") | .completed' < workflow.json )
start_time=$( jq '.workflows.workflow.operations.operation[] | select(.description == "Ingest") | .completed' < workflow.json )
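
As a cross-check on the jq filters above, the same lookup can be sketched with Python's standard json module. This is only an illustration under stated assumptions: the helper name completed_time is made up for this example, and the inline sample is a trimmed-down stand-in for workflow.json that keeps only the description and completed fields from the file shown in the question.

```python
import json

def completed_time(doc, description):
    """Return the `completed` timestamp (ms) of the operation with this description."""
    operations = doc["workflows"]["workflow"]["operations"]["operation"]
    for op in operations:
        if op.get("description") == description:
            # Operations that never completed (e.g. skipped ones) have no such key.
            return op.get("completed")
    return None

# Trimmed-down stand-in for workflow.json (assumption: only these fields matter here).
sample = json.loads("""
{"workflows": {"workflow": {"operations": {"operation": [
    {"description": "Scheduled",   "completed": 1409750882092},
    {"description": "Capture"},
    {"description": "Ingest",      "completed": 1409756171224},
    {"description": "Cleaning up", "completed": 1409854379552}
]}}}}
""")

start_time = completed_time(sample, "Ingest")
end_time = completed_time(sample, "Cleaning up")
elapsed_ms = end_time - start_time  # both values are epoch milliseconds
print(elapsed_ms)  # 98208328
```

With a real file, json.load(open("workflow.json")) would replace the inline sample. Note that the match is on the exact description string, so "Cleaning" alone would not find the "Cleaning up" operation.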

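Try a JSON parser rather than awk or sed. awk and sed are line-oriented tools built around regular expressions and, as you have found, you can't parse a JSON structure with regular expressions. For more complex structures such as XML and JSON you need something like Python or Perl, which have modules that can handle these data structures. Take a look at the related help.

Thank you very much. I'll test this tomorrow and let you know!

Thanks a lot! I'm struggling a bit here. I've tried your command but I keep getting an error. Command:
end_time=$(jq '.workflow[] | select(.description == "Cleaning up") | .completed'
Error:
jq: error: Cannot iterate over null
I've added the full JSON file to the original post because there weren't enough characters in this comment.

I've updated the answer based on the new JSON, but I don't see "Extracting text..." in the example. If you look for that, you'll get a blank result. The "Cleaning up" example, however, works with the JSON you provided. The thing to remember is that the first part is the path to the array you are looking for, and the original example didn't have the full path. I would try the jq command on the command line first to verify that it gives you what you need.

Thank you! I apologize for the confusion! I'm still learning what to post and what not to post. Sorry again! I'll try it tomorrow and get back to you. Thanks for your time and effort!

This works great on the sample file! Thank you very much. I saw the mistake I made and it's fixed. The file is a sample of a much larger JSON file. Whenever I run this command on the larger file, which contains multiple "workflow" instances inside the parent "workflows", I get the following error:
jq: error: Cannot index array with string
Googling led me to a post that didn't help. I know I have to adjust the command, but I don't know how. Sorry if this sounds silly.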
$ jq '.workflows.workflow.operations.operation[] | select(.description == "Ingest") | .completed' < workflow.json
1406051539118
$ jq '.workflows.workflow.operations.operation[] | select(.description == "Cleaning up") | .completed' < workflow.json
1406051695440
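
The "Cannot index array with string" error mentioned in the comments is what jq reports when a key lookup such as .operations is applied to an array. That would be consistent with "workflow" being a list of objects in the larger file rather than a single object, although this is an assumption, since the larger file is not shown. A minimal Python sketch that tolerates both shapes follows; the helper names as_list and completed_by_workflow, and the ids and timestamps in the sample data, are invented for illustration.

```python
def as_list(value):
    """Wrap a lone object in a list so both shapes can be iterated the same way."""
    return value if isinstance(value, list) else [value]

def completed_by_workflow(doc, description):
    """Map each workflow id to the `completed` time of the matching operation."""
    result = {}
    for wf in as_list(doc["workflows"]["workflow"]):
        for op in as_list(wf["operations"]["operation"]):
            if op.get("description") == description:
                result[wf["id"]] = op.get("completed")
    return result

# Hypothetical two-workflow document; ids and timestamps are made up.
multi = {"workflows": {"workflow": [
    {"id": "1", "operations": {"operation": [
        {"description": "Ingest", "completed": 1000},
        {"description": "Cleaning up", "completed": 4500}]}},
    {"id": "2", "operations": {"operation": [
        {"description": "Ingest", "completed": 2000},
        {"description": "Cleaning up", "completed": 2750}]}},
]}}

starts = completed_by_workflow(multi, "Ingest")
ends = completed_by_workflow(multi, "Cleaning up")

# Elapsed milliseconds per workflow, for workflows that have both timestamps.
durations = {wf_id: ends[wf_id] - starts[wf_id] for wf_id in ends if wf_id in starts}
print(durations)  # {'1': 3500, '2': 750}
```

The same idea carries over to jq: iterate over the workflow entries first, then apply the operations filter to each one, instead of indexing the array directly with a key.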