Extracting data from unusual JSON

Is there a way to turn the following JSON into a nice CSV?

{
    "cod:e1!!@23" : {
        "typeA" : {
            "lsk:d##fjd" : {
                "title" : "slkdfjlkdjfd",
                "year" : "2014"
            },
            "sdfdsfsd" : {
                "title" : "slkdfjlkdjfddewfsdfd",
                "year" : "2015"
            }
        },
        "Ct@ype" : {
            "sd$!!fs:$dfds" : {
                "title" : "slkdfjsdfsdfdsfsd",
                "year" : "2012"
            }
        }
    }
}
Here are my attempts with jq:

jq -rc 'keys[] as $x 
  | .[]|keys[] as $y
  | .[]|keys[] as $z
  |.[]
  |[$x,$y,$z,.year] | @csv'

jq -rc 'keys_unsorted[] as $x
  | .[]|keys_unsorted[] as $y
  | .[]|keys_unsorted[] as $z
  | .[]|[$x,$y,$z,.year] | @csv'
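But the output is incorrect: when there is more than one such record, the keys are sorted and combined in every permutation. I also tried keys_unsorted, but that did not solve the problem.

Fixing how the original JSON is generated is not an option at this point, so any help would be much appreciated.

Ideally, I would get: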

"cod:e1!!@23","typeA","lsk:d##fjd","slkdfjlkdjfd","2014"
"cod:e1!!@23","typeA","sdfdsfsd","slkdfjlkdjfddewfsdfd","2015"
"cod:e1!!@23","Ct@ype","sd$!!fs:$dfds","slkdfjsdfsdfdsfsd","2012"
You can use json2csv with its flatten option. This produces column names like
cod:e1@23.typeA.lsk:d##fjd.title

cat input.json | json2csv -F >> output.csv

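EDIT: this is not quite what you wanted.

Here is a jq script that walks the "leaf elements" of the input and emits every key it passes through as a CSV column: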

jq -r 'leaf_paths as $path | $path + [getpath($path)] | @csv'
Note that this is not exactly what you wanted, since title and year end up on separate rows:

"cod:e1!!@23","typeA","lsk:d##fjd","title","slkdfjlkdjfd"
"cod:e1!!@23","typeA","lsk:d##fjd","year","2014"
"cod:e1!!@23","typeA","sdfdsfsd","title","slkdfjlkdjfddewfsdfd"
"cod:e1!!@23","typeA","sdfdsfsd","year","2015"
"cod:e1!!@23","Ct@ype","sd$!!fs:$dfds","title","slkdfjsdfsdfdsfsd"
"cod:e1!!@23","Ct@ype","sd$!!fs:$dfds","year","2012"

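A small modification to the script in your original post makes it work. Instead of using .[], I index by the specific key that keys_unsorted saved in a variable. For convenience, I also added a header to the CSV: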

jq -r '["x", "y", "z", "title", "year"],
  (keys_unsorted[] as $x
   | .[$x] | keys_unsorted[] as $y
   | .[$y] | keys_unsorted[] as $z
   | .[$z] | [$x, $y, $z, .title, .year]) | @csv'
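This does give the output you are looking for (with a header):

"x","y","z","title","year"
"cod:e1!!@23","typeA","lsk:d##fjd","slkdfjlkdjfd","2014"
"cod:e1!!@23","typeA","sdfdsfsd","slkdfjlkdjfddewfsdfd","2015"
"cod:e1!!@23","Ct@ype","sd$!!fs:$dfds","slkdfjsdfsdfdsfsd","2012"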
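For readers who want to cross-check the reasoning outside jq, here is a rough Python sketch (not part of the original answer) of the two traversals. The function names `rows_by_key`, `rows_all_branches` and `to_csv` are made up for illustration. It assumes the sample document from the question and relies on Python dicts keeping insertion order, which plays the role of keys_unsorted.

```python
import csv
import io
import json

# Sample document from the question.
doc = json.loads("""
{
  "cod:e1!!@23": {
    "typeA": {
      "lsk:d##fjd": {"title": "slkdfjlkdjfd", "year": "2014"},
      "sdfdsfsd": {"title": "slkdfjlkdjfddewfsdfd", "year": "2015"}
    },
    "Ct@ype": {
      "sd$!!fs:$dfds": {"title": "slkdfjsdfsdfdsfsd", "year": "2012"}
    }
  }
}
""")


def rows_by_key(data):
    """Descend through the key bound at each level, like .[$x], .[$y], .[$z]."""
    for x, level_y in data.items():
        for y, level_z in level_y.items():
            for z, leaf in level_z.items():
                yield [x, y, z, leaf["title"], leaf["year"]]


def rows_all_branches(data):
    """Mimic the original attempt: each .[] restarts over every sibling branch."""
    for x in data:
        for level_y in data.values():              # .[] ignores $x
            for y in level_y:
                for level_z in level_y.values():   # .[] ignores $y
                    for z in level_z:
                        for leaf in level_z.values():  # .[] ignores $z
                            yield [x, y, z, leaf["year"]]


def to_csv(rows):
    """Quote every field; all values here are strings, so this matches @csv."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(rows_by_key(doc)), end="")
    print(len(list(rows_all_branches(doc))), "rows from the original attempt")
```

The second function pairs every key with every sibling value, which is where the extra permuted rows come from; the first one only ever visits the value that belongs to the key it just bound.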


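The following is a generic solution for regularly structured nested objects (loosely speaking, they can be thought of as "babushka objects", like nesting dolls); in addition, the keys within the objects may appear in any order.

The key concept is the "scalar object": an object all of whose keys have scalar values.

A template for extracting information from the "scalar objects" is passed as an argument to the "emit" filter, and is used to ensure that the proper column order is maintained when producing the CSV rows.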

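def emit(template):

  # A scalar object is an object whose values are all scalars
  # (neither objects nor arrays).
  def is_scalar_object:
    def is_scalar: type | ((. != "object") and (. != "array"));
    . as $in | (type == "object") and all($in[] | is_scalar);

  . as $in
  | paths as $path
  | select(getpath($path) | is_scalar_object)
  # Adding the object to the template keeps the template's key order.
  | $path + [ template + ($in | getpath($path)) | .[]]
  ;


# The program reads the JSON document given on the command line.
emit( {title,  year} ) | @csv
Usage:

 jq -r -f emit.jq input.json
Output: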

"cod:e1!!@23","typeA","lsk:d##fjd","slkdfjlkdjfd","2014"
"cod:e1!!@23","typeA","sdfdsfsd","slkdfjlkdjfddewfsdfd","2015"
"cod:e1!!@23","Ct@ype","sd$!!fs:$dfds","slkdfjsdfsdfdsfsd","2012"

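As a cross-check of the idea rather than a port of the jq program, the same "scalar object" walk can be sketched in Python. The names below are illustrative only, and the sketch differs from the jq version in two small ways that are assumptions on my part: it drops keys that are not named in the template, and it would also emit a row if the top-level document itself were a scalar object.

```python
def is_scalar(value):
    """A scalar is anything that is neither a dict nor a list."""
    return not isinstance(value, (dict, list))


def is_scalar_object(value):
    """True for a dict whose values are all scalars."""
    return isinstance(value, dict) and all(is_scalar(v) for v in value.values())


def emit(data, template, path=()):
    """Yield path + values in template order for each scalar object under data."""
    if is_scalar_object(data):
        # Looking values up by template key fixes the column order,
        # whatever order the keys had in the source document.
        yield list(path) + [data.get(key) for key in template]
        return
    if isinstance(data, dict):
        children = data.items()
    elif isinstance(data, list):
        children = enumerate(data)
    else:
        children = ()
    for key, child in children:
        yield from emit(child, template, path + (key,))


if __name__ == "__main__":
    sample = {
        "k1": {
            "k2": {"k3": {"year": "2014", "title": "t"}},  # keys in "wrong" order
            "k4": {"k5": {"title": "u", "year": "2012"}},
        }
    }
    for row in emit(sample, ("title", "year")):
        print(row)
```

Because the values are pulled out by template key, the title column always precedes the year column, which is the property the template argument is there to guarantee.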
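Comments:

Could you describe what the "nice" output for your example would be?

Thanks, I have corrected the question!

The JSON looks a bit unusual, but it is perfectly fine. There is nothing "badly structured" about it.

Yes, "unusual" is more appropriate.

Thanks for the json2csv reference. The output is not what I wanted, but the program may come in handy in the future.

As you wrote, it is a bit different from what I wanted, but very workable. I can now see why I was getting all those permutations!

Thanks, basically I should reuse the variables instead of starting a new [] branch. :)

Thanks a lot! :) I have already accepted an answer and hesitate to change it, but your answer is certainly more general. This will be handy in the future!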