Converting a large JSON to CSV -- Python

python json csv

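I am trying to convert a .json file to a .csv file so that I can run an analysis in R. I followed the steps suggested in other answers, but I am still running into problems (possibly because the JSON file is so large). First, I fetch the data from the web by URL: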

import urllib

#first open html as .json
response = urllib.request.urlopen("http://trends.vera.org/data/county/1003/")
input = response.read()
print(input)

The following function, which I took from the linked question, flattens the JSON file:

#function to flatten .json file
def flattenjson( b, delim ):
    val = {}
    for i in b.keys():
        if isinstance( b[i], dict ):
            get = flattenjson( b[i], delim )
            for j in get.keys():
                val[ i + delim + j ] = get[j]
        else:
            val[i] = b[i]

    return val

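The following lines take the list and generate the column names for the CSV. This is where the problem occurs. Does anyone know how to fix it?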

#find column names
input = map( lambda x: flattenjson( x ), input )
columns = map( lambda x: x.keys(), input )
columns = reduce( lambda x,y: x+y, columns )
columns = list( set( columns ) )
print(columns)

Finally, I write the JSON data to a .csv file:

#write to .csv file
with open( fname, 'wb' ) as out_file:
    csv_w = csv.writer( out_file )
    csv_w.writerow( columns )

    for i_r in input:
        csv_w.writerow( map( lambda x: i_r.get( x, "" ), columns ) )

Thanks in advance for your help.

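First, you need to decode the response: response.read() returns raw bytes, not a parsed JSON object. Always use the requests library for HTTP requests. It can decode JSON for you: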

import requests
response = requests.get("http://trends.vera.org/data/county/1003/")
data = response.json()
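
If you would rather stay with the standard library, the same decoding step can be done by hand. This is a minimal sketch and not part of the original answer: the helper name decode_json_bytes and the sample payload are hypothetical, and it assumes the server returns UTF-8 encoded JSON.

```python
import json

def decode_json_bytes(raw: bytes, encoding: str = "utf-8"):
    """Decode raw HTTP response bytes and parse them as JSON."""
    return json.loads(raw.decode(encoding))

# Hypothetical payload standing in for the bytes returned by response.read()
sample = b'{"name": "example", "yearlyData": {"a": {"1970": 1}}}'
data = decode_json_bytes(sample)
print(type(data).__name__)
```

With urllib, the raw bytes would come from urllib.request.urlopen(url).read(). Note that this needs import urllib.request; a bare import urllib is not guaranteed to load the request submodule.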

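The second part of your code has another error. flattenjson takes 2 arguments, and you only pass one. The second is the delimiter used to join nested keys into the flattened column names. This code works: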

print(flattenjson(data, ';'))

If you do not need all of the data, you can pass in just the key you want:

flattenjson(data['yearlyData'], ';')
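
Putting the pieces together in Python 3, the column discovery and CSV writing from the question could look roughly like the sketch below. It is an illustration only: the sample records list is made up, flatten_json is a restatement of the question's flattenjson, records_to_csv is a hypothetical helper, and it assumes the input is a list of (possibly nested) dicts with string keys, which may not match the actual structure returned by the URL.

```python
import csv
import io

def flatten_json(obj, delim):
    """Flatten nested dicts, joining nested keys with `delim`."""
    flat = {}
    for key, value in obj.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flatten_json(value, delim).items():
                flat[key + delim + sub_key] = sub_value
        else:
            flat[key] = value
    return flat

def records_to_csv(records, out_file, delim="."):
    """Write a list of nested dicts to `out_file` as CSV; return the column names."""
    rows = [flatten_json(record, delim) for record in records]
    # Union of all keys across rows, sorted for a stable column order
    columns = sorted(set().union(*(row.keys() for row in rows)))
    # restval fills in cells for keys that a given row does not have
    writer = csv.DictWriter(out_file, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns

# Made-up sample data (hypothetical, not taken from the real endpoint)
records = [
    {"id": 1, "pop": {"total": 10, "jail": 2}},
    {"id": 2, "pop": {"total": 20}},
]
buffer = io.StringIO()
columns = records_to_csv(records, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to a real file instead of an in-memory buffer, open it in text mode with open(fname, "w", newline=""). The 'wb' mode used in the question is a Python 2 idiom and makes csv.writer fail in Python 3.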

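This is much easier to do in R. Only one item in that list holds tabular data, and all of it is numeric. It also has some odd formatting, though, hence the need for the grab_column() function. Result contains the data in tabular form.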

library(rjson)    

tmp <- rjson::fromJSON(file = "http://trends.vera.org/data/county/1003/") 

grab_column <- function(x) {
  tmp <- as.character(x)
  if (length(tmp) == 0) tmp <- NA
  else tmp[tmp == "NULL"] <- NA
  as.numeric(tmp)
}

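# NOTE: `foo` is never defined in this answer; it presumably refers to the one
# tabular item of `tmp` (possibly tmp$yearlyData). That is an assumption.
Result <- as.data.frame(lapply(foo, FUN = grab_column))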
Year <- data.frame(year = as.numeric(names(foo[[1]])))
Result <- cbind(Year, Result)
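
For comparison, the same reshaping can be sketched in Python. This is a rough analogue of the R code above, not a tested replacement: it assumes the tabular item is a dict mapping each column name to a dict of year to value (as the use of names(foo[[1]]) suggests), and the sample data and the helper names to_number and year_table are hypothetical.

```python
def to_number(value):
    """Mirror grab_column: map missing or "NULL" values to None, else to float."""
    if value is None or value == "NULL":
        return None
    return float(value)

def year_table(yearly):
    """Turn {column: {year: value}} into a list of row dicts, one per year."""
    columns = list(yearly)
    # Collect every year seen in any column (the R code uses only the first column)
    years = sorted({year for col in yearly.values() for year in col}, key=int)
    rows = []
    for year in years:
        row = {"year": int(year)}
        for col in columns:
            row[col] = to_number(yearly[col].get(year))
        rows.append(row)
    return rows

# Made-up sample in the assumed shape (hypothetical values)
sample = {
    "total_pop": {"1970": 100, "1971": 110},
    "jail_pop": {"1970": None, "1971": "5"},
}
rows = year_table(sample)
print(rows)
```

Each row dict can then be passed to csv.DictWriter, with "year" followed by the column names as the field names.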

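If you are getting an error, you should post it in your question. One more suggestion: try working with the JSON directly in R. It is a more convenient data representation format than CSV.

You are right. It turns out that doing this in R is much easier.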