How do I store nested JSON from a list into a text file using Python?

Tags: python, json, python-3.x, python-jsonschema

I am building a nested JSON structure and storing it in a list object. Below is my code, which produces the correct hierarchical JSON as expected.

Sample data:

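datasource,datasource_cnt,category,category_cnt,subcategory,subcategory_cnt
Bureau of Labor Statistics,44,Employment and wages,44,Employment and wages,44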

import pandas as pd
df=pd.read_csv('queryhive16273.csv')
def split_df(df):
   for (vendor, count), df_vendor in df.groupby(["datasource", "datasource_cnt"]):
       yield {
           "vendor_name": vendor,
           "count": count,
           "categories": list(split_category(df_vendor))
       }

def split_category(df_vendor):
   for (category, count), df_category in df_vendor.groupby(
       ["category", "category_cnt"]
   ):
       yield {
           "name": category,
           "count": count,
           "subCategories": list(split_subcategory(df_category)),
       }

def split_subcategory(df_category):
   for (subcategory, count), df_subcategory in df_category.groupby(
       ["subcategory", "subcategory_cnt"]
   ):
       yield {
           "count": count,
           "name": subcategory,
       }


abc=list(split_df(df))
abc contains the data shown below. This is the expected result.

[{
    'count': 44,
    'vendor_name': 'Bureau of Labor Statistics',
    'categories': [{
        'count': 44,
        'name': 'Employment and wages',
        'subCategories': [{
            'count': 44,
            'name': 'Employment and wages'
        }]
    }]
}]
Now I try to store it in a JSON file:

with open('your_file2.json', 'w') as f:
    for item in abc:
        f.write("%s\n" % item)
        #f.write(abc)
Here is the problem. This writes the data as shown below, which is not valid JSON. If I try json.dumps instead, I get a JSON serialization error.

Can you please help me?

{
    'count': 44,
    'vendor_name': 'Bureau of Labor Statistics',
    'categories': [{
        'count': 44,
        'name': 'Employment and wages',
        'subCategories': [{
            'count': 44,
            'name': 'Employment and wages'
        }]
    }]
}
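As a small illustration of why the output above is not valid JSON: formatting a dict with "%s" writes its Python repr, which uses single quotes, while JSON requires double quotes. The sketch below uses a shortened stand-in for one element of abc (the `item` dict is an assumption for illustration, not the asker's full data).

```python
import json

# Shortened stand-in for one element of the list `abc` in the question.
item = {"count": 44, "vendor_name": "Bureau of Labor Statistics"}

as_repr = "%s" % item       # Python repr: single-quoted keys and strings
as_json = json.dumps(item)  # JSON text: double-quoted keys and strings

print(as_repr)
print(as_json)

# A JSON parser rejects the repr form but accepts the json.dumps form.
try:
    json.loads(as_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False

print(repr_is_valid_json)
```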
Expected result:

[{
    "count": 44,
    "vendor_name": "Bureau of Labor Statistics",
    "categories": [{
        "count": 44,
        "name": "Employment and wages",
        "subCategories": [{
            "count": 44,
            "name": "Employment and wages"
        }]
    }]
}]
Using your data and the standard-library json module gives me:

TypeError: Object of type 'int64' is not JSON serializable
This simply means that some numpy objects live in your nested structure, and the json encoder has no built-in way to serialize them.
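A minimal reproduction of that error, assuming numpy is installed. The value np.int64(44) stands in for the counts that pandas returns from an integer column; the variable names are illustrative only.

```python
import json
import numpy as np

count = np.int64(44)  # the kind of scalar pandas yields for an int64 column

# The standard encoder does not know how to serialize numpy integers.
try:
    json.dumps({"count": count})
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(type(count).__name__, "->", error_message)

# A plain Python int serializes without trouble.
plain = json.dumps({"count": int(count)})
print(plain)
```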

Forcing the encoder to fall back on string conversion whenever it cannot serialize an object itself is enough to make the code work:

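import io
import json
import pandas as pd

# Same CSV content as the sample data in the question
d = io.StringIO("datasource,datasource_cnt,category,category_cnt,subcategory,subcategory_cnt\nBureau of Labor Statistics,44,Employment and wages,44,Employment and wages,44")
df = pd.read_csv(d)

# split_df is the generator defined in the question
abc = list(split_df(df))

# default=str is called for every object json cannot serialize
json.dumps(abc, default=str)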
It returns valid JSON, but the integers are converted to strings (for example, "count": "44").

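If that does not suit your needs, use a dedicated encoder (a json.JSONEncoder subclass passed through the cls argument) that converts the numpy integers to int.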
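One possible shape for such an encoder is sketched below. The class name NumpyEncoder is hypothetical, and the `abc` list is a hand-built stand-in with np.int64 counts rather than the output of split_df; it assumes numpy is installed.

```python
import json
import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """Hypothetical encoder: turns numpy scalars into Python builtins."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        # Anything else: let the base class raise the usual TypeError.
        return super().default(obj)


# Stand-in for the question's `abc`, with numpy integers as counts.
abc = [{
    "vendor_name": "Bureau of Labor Statistics",
    "count": np.int64(44),
    "categories": [{
        "name": "Employment and wages",
        "count": np.int64(44),
        "subCategories": [{"count": np.int64(44), "name": "Employment and wages"}],
    }],
}]

text = json.dumps(abc, cls=NumpyEncoder)
print(text)
```

With the question's own abc list, the call would be the same: json.dumps(abc, cls=NumpyEncoder).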

This returns the requested JSON:

'[{"vendor_name": "Bureau of Labor Statistics", "count": 44, "categories": [{"name": "Employment and wages", "count": 44, "subCategories": [{"count": 44, "name": "Employment and wages"}]}]}]'
Another option is to convert the data directly before encoding:

def split_category(df_vendor):
   for (category, count), df_category in df_vendor.groupby(
       ["category", "category_cnt"]
   ):
       yield {
           "name": category,
           "count": int(count), # Cast here before encoding
           "subCategories": list(split_subcategory(df_category)),
       }
Writing the list with json.dump produces a valid JSON file:

[
  {
    "count": 44,
    "vendor_name": "Bureau of Labor Statistics",
    "categories": [
      {
        "count": 44,
        "name": "Employment and wages",
        "subCategories": [
          {
            "count": 44,
            "name": "Employment and wages"
          }
        ]
      }
    ]
  }
]
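The two ideas in this thread (writing with json.dump and converting numpy scalars) can be combined into a single step. The sketch below is one way to do it, not the only one: to_builtin is a hypothetical helper passed through json's default= hook, the `abc` list is a hand-built stand-in with np.int64 counts, and the file is written to a temporary directory so the example does not touch your working directory.

```python
import json
import os
import tempfile

import numpy as np


def to_builtin(obj):
    """Hypothetical helper for json's default= hook."""
    if isinstance(obj, np.generic):
        return obj.item()  # numpy scalar -> matching Python scalar
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Stand-in for the question's `abc`, with numpy integers as counts.
abc = [{
    "vendor_name": "Bureau of Labor Statistics",
    "count": np.int64(44),
    "categories": [{
        "name": "Employment and wages",
        "count": np.int64(44),
        "subCategories": [{"count": np.int64(44), "name": "Employment and wages"}],
    }],
}]

path = os.path.join(tempfile.mkdtemp(), "your_file2.json")

# Write the whole list in one call so the file holds a single JSON array.
with open(path, "w") as f:
    json.dump(abc, f, indent=2, default=to_builtin)

# Read it back to confirm the file parses as JSON.
with open(path) as f:
    loaded = json.load(f)

print(loaded)
```

Unlike default=str, this keeps the counts as JSON numbers rather than strings.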

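Comments:

Do not print JSON yourself (DRTW); it is not a good idea. Use an encoder instead. In this case you break the standard because you print single quotes instead of double quotes.

Does any of the answers fit your requirement? If so, you should mark it as accepted.

This will not work for the trial dataset, because its structure contains numpy.int64 instead of int. You skipped that part because you entered the data as a Python structure instead of reading it from the file.

@jlandercy I am using what the OP provided in their post as the value of abc. They say "abc contains the data shown below. This is the expected result." I don't see where you get any other trial dataset. Clearly their problem is that they iterate over the list, write each element as text into a plain txt file, and so produce invalid JSON.

Have a look at my answer; I did find the relevant data in the OP. Your solution will not handle their dataset: copy and paste the StringIO and you will be able to reproduce the problem. That does not mean your answer is wrong, it just does not solve the OP's problem.

@jlandercy, I see now: it's because they use pandas to read the CSV into a DataFrame, and that is where numpy.int64 comes from. Ultimately they can convert count to int when yielding, which should solve the problem, as already suggested. Have a nice day.

How do I write these to a text file? Can you write the line with json.dump?

@ShankarPanda Just use json.dump instead of json.dumps, as @Buran did in their answer.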
The json.dump approach from @Buran's answer, referred to in the comments:

import json

data = [{
    'count': 44,
    'vendor_name': 'Bureau of Labor Statistics',
    'categories': [{
        'count': 44,
        'name': 'Employment and wages',
        'subCategories': [{
            'count': 44,
            'name': 'Employment and wages'
        }]
    }]
}]

with open('your_file2.json', 'w') as f:
    json.dump(data, f, indent=2)