JSON: get all results in the same array
I've been struggling with this for hours, and I'm pretty sure I'm missing something. Consider this input:
[
{
"LAST_JOB_POD":"gitlab-web-65-gwwwh",
"STARTED_AT":"31-05-2018-18:18:48",
"FINISHED":"false",
"FIRST_INDEXED":"0",
"LAST_INDEXED":"3143",
"failed_projects":{
"1082": "4:Deadline Exceeded, trace",
"1273": "/opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/elasticsearch-transport-5.0.3/lib/elasticsearch/transport/transport/base.rb:201:in `__raise_transport_error'",
"2492": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"3060": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)"
}
},
{
"LAST_JOB_POD":"gitlab-web-65-gwwwh",
"STARTED_AT":"31-05-2018-18:18:48",
"FINISHED":"false",
"FIRST_INDEXED":"0",
"LAST_INDEXED":"3143",
"failed_projects":{
"5570": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6103": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6188": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6695": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6721": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6728": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6747": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)"
}
},
{
"LAST_JOB_POD":"gitlab-web-65-gwwwh",
"STARTED_AT":"31-05-2018-18:18:48",
"FINISHED":"false",
"FIRST_INDEXED":"0",
"LAST_INDEXED":"3143",
"failed_projects":{
"6760": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6939": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6941": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6942": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"6947": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)",
"7201": "/opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/elasticsearch-transport-5.0.3/lib/elasticsearch/transport/transport/base.rb:201:in `__raise_transport_error'",
"7707": ", trace - [\"/opt/gitlab/embedded/service/gitlab-rails/ee/lib/gitlab/elastic/indexer.rb:64:in `run_indexer!'\"",
"7787": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)"
}
}
]
I'm currently using jq to extract the failed_projects entries, but with
.[] | select(.failed_projects != null) | . as $object | {"failed_projects"}[]
I get the results in separate groups:
{
"1082": "...",
...
}
{
"5570": "...",
...
}
{
"6760": "...",
...
}
What I'm trying to accomplish is to group the IDs that share the same exception. For example:
[{
"Exception": "ReadTimeout",
[{
"ID": 2492,
"ID": 3060
}]
},
{
"Exception": "Deadline Exceeded",
[{
"ID": 1082
}]
}]
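The illustrative output is not valid JSON, and it contains objects with duplicate keys, which is probably not what you actually want, but the following jq program produces output that fits the general problem description. Since you don't seem to have specified the exact grouping criterion, I have used the error message text after the last ":" as the criterion. (If, for example, you want the text after the first ":" instead, use "^[^:]*: *" as the regex.)
The first step gathers all the .failed_projects objects together and applies to_entries, so that we can easily access the IDs and the error message text: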
[.[] | .failed_projects | to_entries[]]
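To see what this first step yields, here is a minimal sketch run on a hypothetical two-record sample. The IDs "1", "2", "3" and the short messages are made up for illustration and are not taken from the question's data.

```shell
# Hypothetical sample: two records, invented IDs and messages.
sample='[{"failed_projects":{"1":"a: X","2":"b: Y"}},{"failed_projects":{"3":"c: X"}}]'

# Flatten every failed_projects object into a single array of {key, value} entries.
entries=$(printf '%s' "$sample" | jq -c '[.[] | .failed_projects | to_entries[]]')
echo "$entries"
```

Each entry carries the ID in .key and the message in .value, and all records now live in one array, which is what the later grouping step needs.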
Next, we extract the grouping criterion and use it to form the groups:
| map(.value |= sub("^.*: *";""))
| group_by(.value)
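The sketch below tries both regexes on one message and then shows the nested arrays that group_by produces. The message is a shortened stand-in for the question's ReadTimeout lines (path and quoting trimmed), and the entries are hypothetical.

```shell
# Shortened stand-in for one of the question's messages (assumption: trimmed for readability).
msg='"protocol.rb:176:in rbuf_fill: Net::ReadTimeout (Faraday::TimeoutError)"'

# Greedy match: everything up to the LAST ":" is removed.
after_last=$(printf '%s' "$msg" | jq -r 'sub("^.*: *"; "")')
# Non-colon prefix only: everything up to the FIRST ":" is removed.
after_first=$(printf '%s' "$msg" | jq -r 'sub("^[^:]*: *"; "")')
echo "$after_last"
echo "$after_first"

# Hypothetical entries: group_by(.value) returns one inner array per distinct value.
entries='[{"key":"1","value":"a: X"},{"key":"2","value":"b: Y"},{"key":"3","value":"c: X"}]'
groups=$(printf '%s' "$entries" | jq -c 'map(.value |= sub("^.*: *"; "")) | group_by(.value)')
echo "$groups"
```

The groups come back sorted by the grouping value. The order of entries inside a group may depend on the jq version, so it is safer not to rely on it.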
Finally, we convert the groups into JSON objects of the form:
{GROUP: ARRAY_OF_IDs}
| map( .[0].value as $key
| [.[] | .key] as $value
| {($key): $value} )
Put the fragments above in a file, program.jq, and invoke jq with:
jq -f program.jq input.json
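For a quick end-to-end check, the following sketch assembles the three fragments into program.jq and runs them on a trimmed input. The scratch directory and the trimmed input.json (three IDs copied from the question, all other fields omitted) are assumptions made to keep the example short.

```shell
# Use a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# The three fragments from the answer, concatenated unchanged.
cat > "$workdir/program.jq" <<'EOF'
[.[] | .failed_projects | to_entries[]]
| map(.value |= sub("^.*: *";""))
| group_by(.value)
| map( .[0].value as $key
       | [.[] | .key] as $value
       | {($key): $value} )
EOF

# Trimmed input: IDs and messages copied from the question, other fields left out.
cat > "$workdir/input.json" <<'EOF'
[
  {"failed_projects": {
    "1082": "4:Deadline Exceeded, trace",
    "2492": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)"}},
  {"failed_projects": {
    "5570": "/opt/gitlab/embedded/lib/ruby/2.3.0/net/protocol.rb:176:in `rbuf_fill': Net::ReadTimeout (Faraday::TimeoutError)"}}
]
EOF

result=$(jq -c -f "$workdir/program.jq" "$workdir/input.json")
echo "$result"
```

On this input the program should yield two groups, one keyed "Deadline Exceeded, trace" and one keyed "TimeoutError)", with the IDs from both records merged into the second group.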
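This produces the output shown below. Obviously, you will need to adjust the grouping criterion. You may also want to convert the ID strings to JSON numbers, which can be done with tonumber or, more cautiously, with (tonumber? // .), which leaves a key unchanged if it cannot be parsed as a number.
To understand program.jq, you may want to start with the first fragment and then add the others one at a time.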
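As a sketch of the cautious ID conversion, the snippet below applies (tonumber? // .) to the keys of one hypothetical group. The key "n/a" is invented to show the fallback and does not occur in the question's data.

```shell
# Hypothetical group: one numeric-looking key and one made-up non-numeric key.
group='[{"key":"1082","value":"X"},{"key":"n/a","value":"X"}]'

# tonumber? yields nothing when parsing fails, so "// ." falls back to the original string.
converted=$(printf '%s' "$group" | jq -c '[.[] | .key | (tonumber? // .)]')
echo "$converted"
```

In program.jq this expression would take the place of `[.[] | .key]` in the final step. Plain tonumber would instead abort with an error on the non-numeric key.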
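.[].failed_projects | select(. != null) is a simpler version of what you have now. It's not clear what you want as the final output. Do you want to merge all the objects into a single object?
@chepner I updated the question with the desired output, grouping the matches by value. Anything that splits the values and keeps them matched to their IDs is fine.
The illustrative output is not valid JSON, and it contains objects with duplicate keys. Would it be acceptable to produce valid JSON output without such objects? If so, could you give an example of the desired JSON output?
Output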
[
{
"Deadline Exceeded, trace": [
"1082"
]
},
{
"TimeoutError)": [
"6728",
"6747",
"6939",
"5570",
"6103",
"6188",
"6695",
"6721",
"2492",
"6760",
"3060",
"6941",
"6942",
"6947",
"7787"
]
},
{
"in `__raise_transport_error'": [
"1273",
"7201"
]
},
{
"in `run_indexer!'\"": [
"7707"
]
}
]