Python: one JSON attribute split into two separate columns

python json csv

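I am currently stuck on how to split values into two columns in a CSV file, as shown below:

I would like the standard price and the convertible price in separate columns. However, both sit under a single attribute named "aws:offerTermOfferingClass". Do you know how to get separate columns with the convertible price and the standard price for one instance type? I tried the if statements below, but the script stops with an error. Thank you very much in advance for your help.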

import requests
import warnings
import pandas as pd
import numpy as np

warnings.filterwarnings("ignore")

regions = ['ap-northeast-1', 'ap-south-1', 'ap-south-1', 'ap-south-2', 'eu-central-1', 'eu-west-1', 'eu-west-2', 'us-east-2', 'us-west-1', 'us-west-2']
os = ['linux', 'rhel', 'windows']

# Build one pricing URL per region and operating system.
links = []
for region in regions:
    for system in os:
        links.append("https://a0.p.awsstatic.com/pricing/1.0/ec2/region/" + region + "/reserved-instance/" + system + "/index.json?")

# Download every pricing document.
superdict = []
for link in links:
    print("Downloading data from: " + link)
    res = requests.get(link, verify=False).json()
    superdict.append(res)

# Dict of lists, one list per CSV column.
df = {"Region": [], "System": [], "Type": [], "Standard": [], "Convertible": [], "On demand": []}
for res in superdict:
    for item in res['prices']:
        if item['attributes']['aws:offerTermLeaseLength'] == "3yr" \
                and item['attributes']['aws:offerTermPurchaseOption'] == "No Upfront":
            # Note: the '.large' filter applies only to this Linux branch.
            if item['attributes']['aws:ec2:operatingSystem'] == "Linux" \
                    and item['attributes']['aws:ec2:instanceType'].endswith('.large'):
                df["Region"].append(item['attributes']['aws:region'])
                df["System"].append("Linux/UNIX")
                df["Type"].append(item['attributes']['aws:ec2:instanceType'])
                df["On demand"].append(item['calculatedPrice']['onDemandRate']['USD'])
                if item['attributes']['aws:offerTermOfferingClass'] == "standard":
                    df["Standard"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))
                    df["Convertible"].append(np.nan)
                elif item['attributes']['aws:offerTermOfferingClass'] == "convertible":
                    df["Convertible"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))
                    df["Standard"].append(np.nan)
            elif item['attributes']['aws:ec2:operatingSystem'] == "RHEL":
                df["Region"].append(item['attributes']['aws:region'])
                df["System"].append("Red Hat Enterprise Linux")
                df["Type"].append(item['attributes']['aws:ec2:instanceType'])
                df["On demand"].append(item['calculatedPrice']['onDemandRate']['USD'])
                if item['attributes']['aws:offerTermOfferingClass'] == "standard":
                    df["Standard"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))
                    df["Convertible"].append(np.nan)
                elif item['attributes']['aws:offerTermOfferingClass'] == "convertible":
                    df["Convertible"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))
                    df["Standard"].append(np.nan)
            elif item['attributes']['aws:ec2:operatingSystem'] == "Windows":
                df["Region"].append(item['attributes']['aws:region'])
                df["System"].append("Windows")
                df["Type"].append(item['attributes']['aws:ec2:instanceType'])
                df["On demand"].append(item['calculatedPrice']['onDemandRate']['USD'])
                if item['attributes']['aws:offerTermOfferingClass'] == "standard":
                    df["Standard"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))
                    df["Convertible"].append(np.nan)
                elif item['attributes']['aws:offerTermOfferingClass'] == "convertible":
                    df["Convertible"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))
                    df["Standard"].append(np.nan)

data = pd.DataFrame.from_dict(df)
data.to_csv(r'path_to_file.csv', index=False)

This is what I have now:

This is what I want:

Your problem is the following lines inside the if:
if item['attributes']['aws:offerTermOfferingClass'] =="standard":
    df["Standard"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))
elif item['attributes']['aws:offerTermOfferingClass'] =="convertible":
    df["Convertible"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))

This way you append an element to only one of the two lists. That means that after one iteration your dict df might look like this:
{"Region":["EU"],"System":["Windows"], \
  "Type":[3],"Standard":[12.00],"Convertible":[],"On demand":[8]}

After two iterations:
{"Region":["EU", "JAP"],"System":["Windows", "Linux/UNIX"], \
  "Type": [3,4],"Standard":[12.00],"Convertible":[18.00],"On demand":[8,13]}

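So what should the DataFrame look like? Standard and Convertible each have only one element, while all the other columns have two. You cannot build a DataFrame from that. That is what the error tells you: ValueError: arrays must all be same length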
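To make the length mismatch concrete, here is a minimal sketch that feeds a dict shaped like the two-iteration state to pandas. The values are illustrative, and the names `columns` and `error_message` are introduced only for this example. The exact wording of the message can differ between pandas versions, so the sketch only checks that a ValueError is raised.

```python
import pandas as pd

# Illustrative state after two iterations: two price lists are shorter.
columns = {
    "Region": ["EU", "JAP"],
    "System": ["Windows", "Linux/UNIX"],
    "Type": [3, 4],
    "Standard": [12.00],
    "Convertible": [18.00],
    "On demand": [8, 13],
}

try:
    pd.DataFrame.from_dict(columns)
    error_message = None
except ValueError as exc:
    # pandas refuses to build a frame from columns of unequal length.
    error_message = str(exc)

print(error_message)
```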

So basically the fix looks like this:
if item['attributes']['aws:offerTermOfferingClass'] == "standard":
    df["Standard"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))
    df["Convertible"].append(np.nan)  # or another default value
elif item['attributes']['aws:offerTermOfferingClass'] == "convertible":
    df["Convertible"].append(float(item['calculatedPrice']['effectiveHourlyRate']['USD']))
    df["Standard"].append(np.nan)
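Since the same pair of appends is needed in every branch, it could be wrapped in a small helper. The sketch below is one possible way to do that; `append_price` and `columns` are hypothetical names, not part of the original code. It always appends to both lists, so an unexpected offering class produces NaN in both columns instead of a length mismatch.

```python
import numpy as np

def append_price(columns, offering_class, hourly_rate):
    """Append exactly one value to each price list so their lengths stay equal."""
    rate = float(hourly_rate)
    columns["Standard"].append(rate if offering_class == "standard" else np.nan)
    columns["Convertible"].append(rate if offering_class == "convertible" else np.nan)

# Illustrative values only.
columns = {"Standard": [], "Convertible": []}
append_price(columns, "standard", "12.00")
append_price(columns, "convertible", "18.00")
print(columns)
```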

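Merging rows when some values are empty
After creating the DataFrame, you can try the following (here df is the DataFrame itself, not the dict of lists):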
df_ = df.replace('', np.nan).ffill().bfill()
pd.concat([
        df_[df_.duplicated()],
        df.loc[df_.drop_duplicates(keep=False).index]
    ])

Groupby
You can also solve this with groupby:
data = data.groupby(["Region","System","Type","On demand"]).sum().replace(0,np.nan)
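As a rough illustration of the groupby idea, the sketch below merges two rows for the same instance into one. The instance name and prices are made up. It uses a small variation on the line above: `sum(min_count=1)` leaves a column as NaN when a group has no value for it, so a genuine price of 0 would not be turned into NaN by a later `replace(0, np.nan)`.

```python
import numpy as np
import pandas as pd

# Made-up rows: the first two describe the same instance, one price class each.
data = pd.DataFrame({
    "Region": ["EU", "EU", "EU"],
    "System": ["Linux/UNIX", "Linux/UNIX", "Windows"],
    "Type": ["m5.large", "m5.large", "m5.large"],
    "On demand": ["0.10", "0.10", "0.19"],
    "Standard": [0.04, np.nan, 0.08],
    "Convertible": [np.nan, 0.05, np.nan],
})

# Rows sharing the same key columns collapse into one row; NaN is skipped
# while summing, and a group with no value at all stays NaN.
merged = (
    data.groupby(["Region", "System", "Type", "On demand"], as_index=False)
        .sum(min_count=1)
)
print(merged)
```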

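Can you post the error you get?
python ValueError: arrays must all be same length
I had something similar. Thank you for the reply. It almost works, but I would like to have these values under one instance type. Right now they are still split, with empty values in either the Standard or the Convertible column.
@Lukasz I don't know what "under one instance type" means.
@Lukasz What should the DataFrame look like? What do you expect instead of the empty cells? You should provide sample input and the desired output.
With the code you sent, I now get one row with a specific instance name and a price in only one column. In the next row I have the same instance type, but the price is only in the "Convertible" column. I would like one row, instead of two, that holds both prices.
@Lukasz That is what your code does. Without an example of input and output it is hard to understand what you actually want.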
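If the goal is one row per instance type with both prices side by side, one option is to key the rows on the instance while collecting them, instead of merging afterwards. The following is only a sketch under assumptions: `collect_rows` and `make_item` are hypothetical helpers, the sample items are made up to match the shape the question's code reads, and filtering by lease length and purchase option is assumed to have happened already (otherwise later items would overwrite earlier prices).

```python
import numpy as np
import pandas as pd

def collect_rows(items):
    """Build one row per (region, system, type) holding both price classes."""
    rows = {}
    for item in items:
        attrs = item["attributes"]
        key = (attrs["aws:region"], attrs["aws:ec2:operatingSystem"],
               attrs["aws:ec2:instanceType"])
        # Create the row on first sight, with both prices defaulting to NaN.
        row = rows.setdefault(key, {
            "Region": key[0], "System": key[1], "Type": key[2],
            "Standard": np.nan, "Convertible": np.nan,
            "On demand": item["calculatedPrice"]["onDemandRate"]["USD"],
        })
        column = attrs["aws:offerTermOfferingClass"].capitalize()
        if column in ("Standard", "Convertible"):
            row[column] = float(item["calculatedPrice"]["effectiveHourlyRate"]["USD"])
    return pd.DataFrame(list(rows.values()))

def make_item(offering_class, hourly_rate):
    # Hypothetical item shaped like the entries the question's code reads.
    return {
        "attributes": {
            "aws:region": "eu-west-1",
            "aws:ec2:operatingSystem": "Linux",
            "aws:ec2:instanceType": "m5.large",
            "aws:offerTermOfferingClass": offering_class,
        },
        "calculatedPrice": {
            "effectiveHourlyRate": {"USD": hourly_rate},
            "onDemandRate": {"USD": "0.107"},
        },
    }

table = collect_rows([make_item("standard", "0.045"),
                      make_item("convertible", "0.052")])
print(table)
```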