Python CSV to nested JSON?

Tags: python, json, csv

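I am trying to convert a flat CSV into a nested JSON structure. The CSV is generated by SQL, which produces multiple rows for each primary id. The CSV is structured as follows: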

PrimaryId,FirstName,LastName,City,CarName,DogName
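100,John,Smith,NewYork,Toyota,Spike
100,John,Smith,NewYork,BMW,Spike
100,John,Smith,NewYork,Toyota,Rusty
100,John,Smith,NewYork,BMW,Rusty
101,Ben,Swan,Sydney,Volkswagen,Buddy
101,Ben,Swan,Sydney,Ford,Buddy
101,Ben,Swan,Sydney,Audi,Buddy
101,Ben,Swan,Sydney,Volkswagen,Max
101,Ben,Swan,Sydney,Ford,Max
101,Ben,Swan,Sydney,Audi,Max
102,Julia,Brown,London,Mini,Lucy

The desired JSON output is: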

{
    "data": [
        {
            "City": "NewYork", 
            "FirstName": "John", 
            "PrimaryId": 100, 
            "LastName": "Smith", 
            "CarName": [
                "Toyota", 
                "BMW"
            ], 
            "DogName": [
                "Spike", 
                "Rusty"
            ]
        }, 
        {
            "City": "Sydney", 
            "FirstName": "Ben", 
            "PrimaryId": 101, 
            "LastName": "Swan", 
            "CarName": [
                "Volkswagen", 
                "Ford", 
                "Audi"
            ], 
            "DogName": [
                "Buddy", 
                "Max"
            ]
        }, 
        {
            "City": "London", 
            "FirstName": "Julia", 
            "PrimaryId": 102, 
            "LastName": "Brown", 
            "CarName": [
                "Mini"
            ], 
            "DogName": [
                "Lucy"
            ]
        }
    ]
}

Both were helpful, but I have not yet managed to produce the right structure.

Your data, converted to valid CSV and saved in data.csv:

PrimaryId,FirstName,LastName,City,CarName,DogName
100,John,Smith,NewYork,Toyota,Spike
100,John,Smith,NewYork,BMW,Spike
100,John,Smith,NewYork,Toyota,Rusty
100,John,Smith,NewYork,BMW,Rusty
101,Ben,Swan,Sydney,Volkswagen,Buddy
101,Ben,Swan,Sydney,Ford,Buddy
101,Ben,Swan,Sydney,Audi,Buddy
101,Ben,Swan,Sydney,Volkswagen,Max
101,Ben,Swan,Sydney,Ford,Max
101,Ben,Swan,Sydney,Audi,Max
102,Julia,Brown,London,Mini,Lucy

Using pandas to do the heavy lifting, and assuming this CSV file is valid, here is one way to get what you want:

import json
import pandas as pd

df = pd.read_csv('data.csv')

def get_nested_rec(key, grp):
    rec = {}
    rec['PrimaryId'] = int(key[0])  # groupby keys can be numpy integers, which json cannot serialize
    rec['FirstName'] = key[1]
    rec['LastName'] = key[2]
    rec['City'] = key[3]

    for field in ['CarName','DogName']:
        rec[field] = grp[field].unique().tolist()  # unique values, in order of first appearance

    return rec

records = []
for key, grp in df.groupby(['PrimaryId','FirstName','LastName','City']):
    rec = get_nested_rec(key, grp)
    records.append(rec)

records = dict(data = records)

print(json.dumps(records, indent=4))

The result (key order may differ depending on the Python version):

{
    "data": [
        {
            "City": "NewYork", 
            "FirstName": "John", 
            "PrimaryId": 100, 
            "LastName": "Smith", 
            "CarName": [
                "Toyota", 
                "BMW"
            ], 
            "DogName": [
                "Spike", 
                "Rusty"
            ]
        }, 
        {
            "City": "Sydney", 
            "FirstName": "Ben", 
            "PrimaryId": 101, 
            "LastName": "Swan", 
            "CarName": [
                "Volkswagen", 
                "Ford", 
                "Audi"
            ], 
            "DogName": [
                "Buddy", 
                "Max"
            ]
        }, 
        {
            "City": "London", 
            "FirstName": "Julia", 
            "PrimaryId": 102, 
            "LastName": "Brown", 
            "CarName": [
                "Mini"
            ], 
            "DogName": [
                "Lucy"
            ]
        }
    ]
}

Here is a general approach using csv.DictReader.

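Start by loading the data:

import csv
import itertools
# Python 3: open in text mode with newline='', as the csv module recommends
with open('stuff.csv', newline='') as csvfile:
    # skipinitialspace=True drops spaces after commas, so header names have no leading blank
    all_ = list(csv.DictReader(csvfile, skipinitialspace=True))

Now you can use itertools.groupby to group the rows and process each group. For example,

d = []
# groupby only merges consecutive rows, so all_ must already be sorted by this key
for k, g in itertools.groupby(
        all_,
        key=lambda r: (r['PrimaryId'], r['LastName'])):
    d.append({
        'PrimaryId': k[0],
        'LastName': k[1],
        # dict.fromkeys drops repeated car names while keeping first-seen order
        'CarName': list(dict.fromkeys(e['CarName'] for e in g))
        })

groups by primary id and last name and builds the list of cars for each group.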
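
One caveat with itertools.groupby is that it only merges rows that are adjacent, so unsorted input produces split groups. The sketch below illustrates this with three made-up rows (the helper name group_cars and the sample rows are assumptions for illustration, not part of the original answer):

```python
import itertools

# Hypothetical rows: the two rows for id 100 are deliberately not adjacent.
rows = [
    {'PrimaryId': '100', 'CarName': 'Toyota'},
    {'PrimaryId': '101', 'CarName': 'Ford'},
    {'PrimaryId': '100', 'CarName': 'BMW'},
]


def group_cars(rows):
    """Return (PrimaryId, [CarName, ...]) pairs, one per run of equal ids."""
    return [
        (key, [r['CarName'] for r in group])
        for key, group in itertools.groupby(rows, key=lambda r: r['PrimaryId'])
    ]


# Without sorting, id 100 shows up as two separate groups.
unsorted_groups = group_cars(rows)

# Sorting by the same key first brings equal ids together (sorted() is stable).
sorted_groups = group_cars(sorted(rows, key=lambda r: r['PrimaryId']))

print(unsorted_groups)
print(sorted_groups)
```

If the CSV comes from an SQL query with an ORDER BY on the primary id, the rows are likely already in the right order; otherwise sorting first is the safer choice.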


Once you have a structure like this, you can serialize it to JSON directly.
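
To tie the pieces together, here is a minimal standard-library sketch that builds the full structure from the question, including both CarName and DogName. It is one possible way to do it, not the method of either answer: it groups with a plain dict instead of itertools.groupby, so the rows do not need to be sorted. The names SAMPLE, KEY_FIELDS, LIST_FIELDS and nest_rows are assumptions for illustration, and io.StringIO stands in for a real file.

```python
import csv
import io
import json

# A few rows copied from the question; io.StringIO stands in for a real file.
SAMPLE = """PrimaryId,FirstName,LastName,City,CarName,DogName
100,John,Smith,NewYork,Toyota,Spike
100,John,Smith,NewYork,BMW,Spike
100,John,Smith,NewYork,Toyota,Rusty
100,John,Smith,NewYork,BMW,Rusty
102,Julia,Brown,London,Mini,Lucy
"""

KEY_FIELDS = ['PrimaryId', 'FirstName', 'LastName', 'City']
LIST_FIELDS = ['CarName', 'DogName']


def nest_rows(rows):
    """Collapse flat rows into one record per key, collecting unique list values."""
    grouped = {}  # key tuple -> record; dicts keep insertion order in Python 3.7+
    for row in rows:
        key = tuple(row[f] for f in KEY_FIELDS)
        rec = grouped.get(key)
        if rec is None:
            rec = {f: row[f] for f in KEY_FIELDS}
            rec['PrimaryId'] = int(rec['PrimaryId'])  # csv yields strings
            for f in LIST_FIELDS:
                rec[f] = []
            grouped[key] = rec
        for f in LIST_FIELDS:
            if row[f] not in rec[f]:  # keep only the first occurrence
                rec[f].append(row[f])
    return {'data': list(grouped.values())}


reader = csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=True)
nested = nest_rows(reader)
print(json.dumps(nested, indent=4))
```

For a real file, replace io.StringIO(SAMPLE) with a file opened via open(path, newline=''). The membership test on a list is fine for short lists like these; for very long lists a companion set would be faster.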

Please post your code here, i.e. what you have tried. Also, your CSV appears to contain extra spaces, and your JSON is definitely not valid JSON.