How to process nested JSON with Apache Spark


json apache-spark


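I have a complex JSON file that contains aggregate (nested object) and array-type values. I want to parse this file and, if possible, load it into case classes or some other class representation of the JSON.

My JSON file: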

[
{
  "organization": {
    "id": "897330837",

    /*  Aggregate, array*/
    "tradeStyleNames":[
      { "name":"Letra Ativa",
        "priority":1
      }
    ],

    //required Aggregate, array
    "telephone": [
      {
        "telephoneNumber":"9836673812",
        "isdCode":"US, '44' for the UK, '91",
        "isUnreachable":true

      },
      {
        "telephoneNumber":"909812345",
        "isdCode":"US, '44' for the UK, '91"

      }
    ],

    //Required  Aggregate, array or atomic
    "primaryAddress": [
        {"line1": "xyz"},
        {"line2": "xyz"},

    ]


}

},
{
  "organization": {
    "id": "1234",

    /*  Aggregate, array no data under this array*/

    "tradeStyleNames":[],


    //required Aggregate, array
    "telephone": [
      {
        "telephoneNumber":"9836673812"
        //isdCode and isUnreachable are not here

      },
    ],
    // Aggregate, array or atomic
    // instead of array only assign string type 
    "primaryAddress": "abcd"

}
}
]
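One way the sample above could be modelled is sketched below. This is only an illustration under stated assumptions: the field names come from the JSON keys, but the class layout, the `PrimaryAddress` trait, and the `addressAsText` helper are hypothetical names introduced here. Fields that are absent in some records (`isdCode`, `isUnreachable`) are modelled as `Option`, and the two shapes of `primaryAddress` (array of objects or plain string) are modelled as a sealed trait. It is plain Scala and does not depend on Spark.

```scala
// Hypothetical model of the sample records. Field names follow the JSON keys;
// the class layout itself is an assumption made for illustration.
final case class TradeStyleName(name: String, priority: Int)

// isdCode and isUnreachable are missing from some records, hence Option.
final case class Telephone(
  telephoneNumber: String,
  isdCode: Option[String] = None,
  isUnreachable: Option[Boolean] = None
)

// primaryAddress is either an array of single-key objects or a plain string.
sealed trait PrimaryAddress
final case class AddressLines(lines: Seq[(String, String)]) extends PrimaryAddress
final case class AddressText(value: String) extends PrimaryAddress

final case class Organization(
  id: String,
  tradeStyleNames: Seq[TradeStyleName],
  telephone: Seq[Telephone],
  primaryAddress: PrimaryAddress
)

object OrganizationExample {
  // The two sample records, built by hand (no JSON parsing happens here).
  val first: Organization = Organization(
    id = "897330837",
    tradeStyleNames = Seq(TradeStyleName("Letra Ativa", 1)),
    telephone = Seq(
      Telephone("9836673812", Some("US, '44' for the UK, '91"), Some(true)),
      Telephone("909812345", Some("US, '44' for the UK, '91"))
    ),
    primaryAddress = AddressLines(Seq("line1" -> "xyz", "line2" -> "xyz"))
  )

  val second: Organization = Organization(
    id = "1234",
    tradeStyleNames = Seq.empty,
    telephone = Seq(Telephone("9836673812")),
    primaryAddress = AddressText("abcd")
  )

  // Flatten either address shape into one string, e.g. for a single
  // denormalized column.
  def addressAsText(address: PrimaryAddress): String = address match {
    case AddressLines(lines) => lines.map { case (k, v) => s"$k=$v" }.mkString(", ")
    case AddressText(value)  => value
  }
}
```

For example, `OrganizationExample.addressAsText(OrganizationExample.second.primaryAddress)` returns `"abcd"`, so both address shapes end up as one string value.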

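I am using Apache Spark 1.6 with Scala. The JSON file will be 30-40 GB in size.

Please help me: how can I process this file in the most efficient way?

I have also looked at two related links, but I am not sure how they would apply to my JSON file.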

Do you prefer Python or Scala? (Or, I guess, SS, R, or Java.)

Apache Spark with Scala, because I have to store the JSON objects in a denormalized table in HBase.
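A minimal sketch of one possible approach with the Spark 1.6 DataFrame API follows. It rests on several assumptions that are not in the question: the file has first been converted to valid JSON with one record per line (no comments, no trailing commas), which is the layout `sqlContext.read.json` expects in Spark 1.6; the path `hdfs:///data/organizations.jsonl` is hypothetical; and the object and column names are illustrative only. Writing to HBase is not shown.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.explode

object NestedJsonSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("nested-json-sketch")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hypothetical path. Assumes one valid JSON record per line, so the
    // file can be split across executors instead of read as one blob.
    val orgs = sqlContext.read.json("hdfs:///data/organizations.jsonl")

    // Inspect the inferred schema before relying on any column, in
    // particular how primaryAddress (array in some records, string in
    // others) was typed.
    orgs.printSchema()

    // One output row per telephone entry. Fields that are absent in a
    // record (isdCode, isUnreachable) come back as null.
    val phones = orgs
      .select(
        $"organization.id".as("id"),
        explode($"organization.telephone").as("tel")
      )
      .select($"id", $"tel.telephoneNumber", $"tel.isdCode", $"tel.isUnreachable")

    phones.show()
    sc.stop()
  }
}
```

Note that `explode` produces no row for a record whose array is empty, so applying the same pattern to `tradeStyleNames` would drop organization `1234` unless that case is handled separately.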