Python 3: loading a variable into a dictionary

Tags: python, python-3.x, dictionary, web-scraping, yaml

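I am trying (so far unsuccessfully) to get the output of the print command below into a dictionary, so that I can then export it to CSV.

How can I get parseddata (printed below) into a dictionary?

Sample input file: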

<html>
<body>
<p>{ success:true ,results:3,rows:[{ISIN:"INE134E01011",Ind:"-",Audited:"Un-Audited",Cumulative:"Non-cumulative",Consolidated:"Non-Consolidated",FilingDate:"14-Aug-2015 15:39",SeqNumber:"1001577"},{ISIN:"INE134E01011",Ind:"-",Audited:"Un-Audited",Cumulative:"Non-cumulative",Consolidated:"Non-Consolidated",FilingDate:"30-May-2015 14:37",SeqNumber:"129901"},{ISIN:"INE134E01011",Ind:"-",Audited:"Un-Audited",Cumulative:"Non-cumulative",Consolidated:"Non-Consolidated",FilingDate:"17-Feb-2015 14:57",SeqNumber:"126171"}]}</p>
</body>
</html>

The output of print(parseddata) is:

{ISIN:"INE134E01011",Ind:"-",Audited:"Un-Audited",Cumulative:"Non-cumulative",Consolidated:"Non-Consolidated",FilingDate:"14-Aug-2015 15:39",SeqNumber:"1001577"},{ISIN:"INE134E01011",Ind:"-",Audited:"Un-Audited",Cumulative:"Non-cumulative",Consolidated:"Non-Consolidated",FilingDate:"30-May-2015 14:37",SeqNumber:"129901"},{ISIN:"INE134E01011",Ind:"-",Audited:"Un-Audited",Cumulative:"Non-cumulative",Consolidated:"Non-Consolidated",FilingDate:"17-Feb-2015 14:57",SeqNumber:"126171"}]}

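Aside from the stray closing bracket/brace at the end, that is valid YAML. (I made a mistake in my initial answer and called it valid JSON: JavaScript objects can be declared without quoting the attribute names, but the portable JSON format does not allow that, whereas YAML does.)

Use PyYAML to parse the data. The manual split-ing and lstrip-ing is hurting you and making this harder than it needs to be. Just get the text and parse it with yaml, a third-party module that must be installed separately.

You can read more about it in the PyYAML documentation.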
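
The idea of parsing the text of the <p> element in one go, rather than splitting it by hand, can also be sketched with the standard library alone. This is a swapped-in alternative to the PyYAML call, not the answerer's code: it assumes every key is a bare identifier that directly follows a { or a comma, quotes those keys with a regular expression, and hands the result to json.loads. The names sample_html and quote_keys are illustrative, and the sample is a shortened version of the input file.

```python
import json
import re

# Shortened stand-in for the sample input file from the question.
sample_html = (
    '<html><body><p>{ success:true ,results:2,rows:['
    '{ISIN:"INE134E01011",FilingDate:"14-Aug-2015 15:39",SeqNumber:"1001577"},'
    '{ISIN:"INE134E01011",FilingDate:"30-May-2015 14:37",SeqNumber:"129901"}'
    ']}</p></body></html>'
)


def quote_keys(text):
    """Wrap bare identifier keys in double quotes so json can read them.

    Assumption: a key always follows '{' or ',' and no string value
    contains text of the form ',word:' or '{word:'.
    """
    return re.sub(r'([{,]\s*)([A-Za-z_]\w*)\s*:', r'\1"\2":', text)


# Take the text of the <p> element as-is; no split() or lstrip() needed.
p_text = re.search(r'<p>(.*?)</p>', sample_html, re.S).group(1)
parsed = json.loads(quote_keys(p_text))
rows = parsed['rows']  # a list of dicts, one per filing
print(rows[0]['FilingDate'])
```

Because the whole object is parsed, the list under rows arrives with both of its brackets, so there is no stray ] or } to clean up afterwards.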

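This looks like a key-value mapping, with ISIN as a key and "INE134E01011" as a value. But it is not JSON, because the keys are not quoted, and it is not YAML either, because plain scalar keys (i.e. strings without quotes) must be followed by a colon and a space.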
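
The JSON half of that claim is easy to check with the standard json module. A minimal sketch, using a one-key excerpt of the data (the variable names are illustrative):

```python
import json

unquoted = '{ISIN:"INE134E01011"}'    # key written the way the page delivers it
quoted = '{"ISIN":"INE134E01011"}'    # same mapping with a quoted key

try:
    json.loads(unquoted)
    unquoted_ok = True
except json.JSONDecodeError:
    unquoted_ok = False  # the bare key ISIN is rejected by the JSON parser

print(unquoted_ok)
print(json.loads(quoted))
```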

If you split your output string up into parts¹:

test_str = (
    '{ISIN:"INE134E01011",Ind:"-",'
    'Audited:"Un-Audited",'
    'Cumulative:"Non-cumulative",'
    'Consolidated:"Non-Consolidated",'
    'FilingDate:"14-Aug-2015 15:39",'
    'SeqNumber:"1001577"},'
    '{ISIN:"INE134E01011",'  # new mapping starts
    'Ind:"-",'
    'Audited:"Un-Audited",'
    'Cumulative:"Non-cumulative",'
    'Consolidated:"Non-Consolidated",'
    'FilingDate:"30-May-2015 14:37",'
    'SeqNumber:"129901"},'
    '{ISIN:"INE134E01011",'    # new mapping starts
    'Ind:"-",'
    'Audited:"Un-Audited",'
    'Cumulative:"Non-cumulative",'
    'Consolidated:"Non-Consolidated",'
    'FilingDate:"17-Feb-2015 14:57",'
    'SeqNumber:"126171"}]}'
)

it is still the same as your input:

test_org = '{ISIN:"INE134E01011",Ind:"-",Audited:"Un-Audited",Cumulative:"Non-cumulative",Consolidated:"Non-Consolidated",FilingDate:"14-Aug-2015 15:39",SeqNumber:"1001577"},{ISIN:"INE134E01011",Ind:"-",Audited:"Un-Audited",Cumulative:"Non-cumulative",Consolidated:"Non-Consolidated",FilingDate:"30-May-2015 14:37",SeqNumber:"129901"},{ISIN:"INE134E01011",Ind:"-",Audited:"Un-Audited",Cumulative:"Non-cumulative",Consolidated:"Non-Consolidated",FilingDate:"17-Feb-2015 14:57",SeqNumber:"126171"}]}'
assert test_str == test_org
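
Splitting it up shows that there are actually 3 mappings, followed by a ] and a }. The ] indicates that there is a list, which is consistent with the 3 mappings being separated by commas. The matching [ got lost because, after splitting on ':[', you call lstrip().

You can easily manipulate the string so that YAML can parse it², but the result is a list: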

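import ruamel.yaml
# restore the leading '[', add a space after each key's colon, drop the stray '}'
test_str = '[' + test_str.replace(':"', ': "').rstrip('}')

# newer ruamel.yaml releases dropped the module-level load(); use the YAML object
yaml = ruamel.yaml.YAML(typ='safe')
data = yaml.load(test_str)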
print(type(data))

which prints:

<class 'list'>

But if your final goal is a CSV file, I see no reason to go from the list to a dict. Given the output of the YAML parser, you can do:

import csv
with open('output.csv', 'w', newline='') as fp:
    csvwriter = csv.writer(fp)
    csvwriter.writerow(data[0].keys())  # header of common dict keys
    for elem in data:
        csvwriter.writerow(elem.values())  # values

to get a CSV file containing:

ISIN,Ind,Consolidated,Cumulative,Audited,FilingDate
INE134E01011,-,Non-Consolidated,Non-cumulative,Un-Audited,14-Aug-2015 15:39
INE134E01011,-,Non-Consolidated,Non-cumulative,Un-Audited,30-May-2015 14:37
INE134E01011,-,Non-Consolidated,Non-cumulative,Un-Audited,17-Feb-2015 14:57
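
The column order in that file follows whatever order the dict keys happen to come out in. If a fixed column order matters, csv.DictWriter with an explicit fieldnames list is one way to pin it down. A minimal sketch, not part of the original answer: the data list is a shortened stand-in for the parser output, and an in-memory io.StringIO buffer stands in for the open file so the example is self-contained.

```python
import csv
import io

# Shortened stand-in for the list of dicts produced by the parser.
data = [
    {'ISIN': 'INE134E01011', 'Ind': '-', 'FilingDate': '14-Aug-2015 15:39'},
    {'ISIN': 'INE134E01011', 'Ind': '-', 'FilingDate': '30-May-2015 14:37'},
]
fieldnames = ['ISIN', 'Ind', 'FilingDate']  # explicit column order

buffer = io.StringIO()  # stands in for open('output.csv', 'w', newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator='\n')
writer.writeheader()       # header row taken from fieldnames
writer.writerows(data)     # one row per dict, values looked up by key
csv_text = buffer.getvalue()
print(csv_text)
```

Looking values up by key also means a row cannot silently shift columns if one dict was built in a different key order than the first.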
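
¹ Instead of escaping the newlines with \, I used parentheses to make the multi-line definition a single string, which makes it easier to add comments on the lines.

² Rather than adding the [ back, you should of course not strip it off in the first place.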
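
One way to keep the bracket, sketched under the assumption that the list is stored under a key literally named rows, as in the sample input file: split once on the key name only, so the [ stays attached to the list, and then drop the single } that closes the outer object. The variable names and the shortened raw string are illustrative.

```python
raw = '{ success:true ,results:2,rows:[{ISIN:"A"},{ISIN:"B"}]}'

# Split once on the key name only, so the '[' stays with the list.
_, _, tail = raw.partition('rows:')

# Drop the single '}' that closes the outer object, if it is there.
rows_text = tail[:-1] if tail.endswith('}') else tail
print(rows_text)
```

Unlike lstrip() and rstrip(), which remove every leading or trailing character from the given set, the slice removes exactly one character, so the } that closes the last mapping is left alone.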

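
But what does parseddata look like?

尤里布, I edited the post to show what parseddata looks like. Thanks.

@zs_python: Could you provide a sample input file to process, so that people can run test cases against it?

The sample input file has been added to the question above, thanks.

Thanks ShadowRanger, I think the problem is the "stray close bracket/brace at the end". How do I get rid of it?

@zs_python: Anticipated that and added an example before you asked. :-) Odds are the original data is valid json, as long as the object you are interested in is the sole entry in the array attribute of an object with only one attribute (holding an array of elements). You could probably just json.loads the whole thing, then access and assign data_as_dict = whole_thing_as_dict['name_of_singleton_key'][0] and avoid the explicit split and lstrip-ing.

Thanks for the help with removing the stray bracket, ShadowRanger. The example above throws an error for me: JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1). I have just posted the sample input file in the question to make it clearer what I am trying to parse.

Thanks Anthon. That is perfect, it did exactly the job for me! Thank you very much for all the effort you put into explaining it.

Thanks @ShadowRanger, your efforts have added to my Python learning and were very helpful. As a noob, I am overwhelmed by the effort you both put into helping me learn. Thanks, and onward!

@zs_python, if this solved your problem, please consider accepting the answer (click the check mark next to the answer). That shows others that your question has been solved (they may not read all the way down to your comments) and marks it as such in the database.

Thanks @anthon for the help, I have accepted the answer as instructed. See you around :)
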
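To go from the list to a dict anyway, the elements can be keyed on SeqNumber. Note that pop() also removes that key from each element of data:

ddata = {}
for elem in data:
    k = elem.pop('SeqNumber')  # use SeqNumber as the key and remove it from the entry
    ddata[k] = elem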