Writing multiple JSON objects to a single file as one object using Python

python json

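I'm using Python to access the Foreman API to gather facts about all the hosts Foreman knows about. Unfortunately, the v1 Foreman API has no get-all-hosts-facts call (or anything similar), so I have to loop through all the hosts and fetch the information one host at a time. Doing so has led me to an annoying problem. Each call for a given host returns a JSON object like this: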

{
  "host1.com": {
    "apt_update_last_success": "1452187711", 
    "architecture": "amd64", 
    "augeasversion": "1.2.0", 
    "bios_release_date": "06/03/2015", 
    "bios_vendor": "Dell Inc."
   }
}

This is fine on its own; the problem arises when I append the next host's information. I then end up with a JSON file that looks like this:

{
  "host1.com": {
    "apt_update_last_success": "1452187711", 
    "architecture": "amd64", 
    "augeasversion": "1.2.0", 
    "bios_release_date": "06/03/2015", 
    "bios_vendor": "Dell Inc."
}
}{
"host2.com": {
    "apt_update_last_success": "1452703454", 
    "architecture": "amd64", 
    "augeasversion": "1.2.0", 
    "bios_release_date": "06/03/2015", 
    "bios_vendor": "Dell Inc."
   }
}
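
To make the problem concrete: the appended file is two JSON documents back to back, which `json.loads` rejects. The sketch below reproduces that with two small made-up payloads and, as one possible way to salvage a file that was already written this way, walks the text with `json.JSONDecoder.raw_decode`. The helper name `merge_concatenated` and the sample data are illustrative assumptions, not part of the original script.

```python
import json

def merge_concatenated(text):
    """Merge back-to-back JSON objects found in `text` into a single dict."""
    decoder = json.JSONDecoder()
    merged = {}
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)  # returns (object, end index)
        merged.update(obj)
    return merged

# Two objects written one after another, as the append-in-a-loop code produces.
broken = (
    json.dumps({"host1.com": {"architecture": "amd64"}}, indent=4)
    + json.dumps({"host2.com": {"architecture": "amd64"}}, indent=4)
)

try:
    json.loads(broken)
except json.JSONDecodeError as e:
    # The second object is reported as unexpected trailing data.
    print("json.loads fails:", e.msg)

print(sorted(merge_concatenated(broken)))
```

This only repairs existing output; the answers below avoid producing it in the first place.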

Here is the code that does this:

for i in hosts_data:
    log.info("Gathering host facts for host: {}".format(i['host']['name']))
    try:
        facts = requests.get(foreman_host+api+"hosts/{}/facts".format(i['host']['id']), auth=(username, password))
        if hosts.status_code != 200:
            log.error("Unable to connect to Foreman! Got retcode '{}' and error message '{}'"
            .format(hosts.status_code, hosts.text))
            sys.exit(1)
    except requests.exceptions.RequestException as e:
        log.error(e)
    facts_data = json.loads(facts.text)
    log.debug(facts_data)
    with open(results_file, 'a') as f:
        f.write(json.dumps(facts_data, sort_keys=True, indent=4))

Here is what I need the file to look like:

{
  "host1.com": {
    "apt_update_last_success": "1452187711",
    "architecture": "amd64",
    "augeasversion": "1.2.0",
    "bios_release_date": "06/03/2015",
    "bios_vendor": "Dell Inc."
  },
  "host2.com": {
    "apt_update_last_success": "1452703454",
    "architecture": "amd64",
    "augeasversion": "1.2.0",
    "bios_release_date": "06/03/2015",
    "bios_vendor": "Dell Inc."
  }
}

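Instead of writing JSON inside the loop, insert the data into a dict with the correct structure, then write that dict out as JSON once the loop has finished.

This assumes your data set fits in memory.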
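
If the data set does not fit in memory, one option is to stream the entries into a single JSON object as they arrive, writing the braces and commas by hand in one pass. The following is only a sketch under that assumption: `write_merged` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the per-host Foreman request.

```python
import json

def write_merged(path, payloads):
    """Write an iterable of {hostname: facts} dicts to `path` as one JSON object."""
    with open(path, "w") as f:
        f.write("{")
        first = True
        for payload in payloads:
            for host, facts in payload.items():
                if not first:
                    f.write(",")  # separator between entries
                first = False
                # Keys and values are serialized by json, so escaping stays correct.
                f.write("\n" + json.dumps(host) + ": " + json.dumps(facts, sort_keys=True))
        f.write("\n}\n")

def fake_fetch():
    # Hypothetical stand-in for one Foreman API call per host.
    yield {"host1.com": {"architecture": "amd64"}}
    yield {"host2.com": {"architecture": "amd64"}}

write_merged("facts_streamed.json", fake_fetch())
```

Unlike the in-memory dict, this does not sort hosts or drop duplicate host names, so it is a trade-off rather than a drop-in replacement.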

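It is better to collect all the data into one dict and then write it out in a single pass, rather than writing on every iteration of the loop: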

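import json
import sys

import requests

# hosts_data, log, foreman_host, api, username, password and results_file
# come from the rest of the script.
d = {}
for i in hosts_data:
    log.info("Gathering host facts for host: {}".format(i['host']['name']))
    try:
        facts = requests.get(foreman_host+api+"hosts/{}/facts".format(i['host']['id']), auth=(username, password))
    except requests.exceptions.RequestException as e:
        log.error(e)
        continue  # no response to parse for this host; move on to the next one
    # Check the response for this host's facts request
    if facts.status_code != 200:
        log.error("Unable to connect to Foreman! Got retcode '{}' and error message '{}'"
                  .format(facts.status_code, facts.text))
        sys.exit(1)
    facts_data = json.loads(facts.text)
    log.debug(facts_data)
    d.update(facts_data)  # add this host's facts to the combined dict
# Write everything once at the end; 'w' replaces any previous contents,
# so the file always holds exactly one JSON object
with open(results_file, 'w') as f:
    f.write(json.dumps(d, sort_keys=True, indent=4))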

For safety/consistency, you need to load the old data, mutate it, and then write it back out.

Change the current with block and write call to:
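# Assumes results_file already exists and contains a JSON object (e.g. "{}").
# 'r+' is used instead of 'a+': in append mode, writes typically go to the end
# of the file regardless of seek(), so the old contents would not be overwritten.
with open(results_file, 'r+') as f:
    combined_facts = json.load(f)
    combined_facts.update(facts_data)
    f.seek(0)
    json.dump(combined_facts, f, sort_keys=True, indent=4)
    f.truncate()  # In case the new JSON encoding is smaller, e.g. due to a replaced key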

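Note: if possible, collect everything first and write once (as the other answers suggest) to minimize unnecessary I/O. If the data has to be retrieved piecemeal, with the file updated as soon as each item becomes available, then this is how you should do it.
FYI, the unsafe approach is basically to seek to the trailing curly brace, delete it, and then write out a comma followed by the new JSON (with the leading curly brace removed from its JSON representation). It is much less I/O intensive, but it is also less safe: it does not clean out duplicates, does not sort the hosts, does not validate the input file at all, and so on. So don't do that.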
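
For the load, mutate, write-back approach described in this answer, a slightly more defensive variant can cope with a results file that does not exist yet and avoid leaving a half-written file behind if the process dies mid-write. This is a sketch, not the answer's own code: `update_json_file` is a hypothetical helper, and it relies on writing to a temporary file in the same directory and swapping it in with `os.replace`.

```python
import json
import os
import tempfile

def update_json_file(path, new_items):
    """Merge `new_items` into the JSON object stored at `path` (created if missing)."""
    try:
        with open(path) as f:
            combined = json.load(f)
    except FileNotFoundError:
        combined = {}  # first call: nothing to merge with yet
    combined.update(new_items)

    # Write to a temporary file next to the target, then swap it into place.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(combined, tmp, sort_keys=True, indent=4)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # do not leave the temporary file behind
        raise
    return combined
```

Inside the loop this would be called as `update_json_file(results_file, facts_data)`. It still rereads and rewrites the whole file for every host, so the I/O cost noted above applies unchanged.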
Thanks @ShadowRanger! Good information about the safety and I/O implications. Much appreciated!
NP.
The only caveat is that dict.update() replaces the content when the key already exists (e.g. if hosts_data contains duplicates).
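
The caveat about dict.update() can be checked with a few lines. The sketch below first shows the overwrite, then one possible alternative that keeps the first value seen and reports the duplicate host names; `merge_without_overwrite` is a hypothetical helper and the payloads are made up.

```python
def merge_without_overwrite(combined, facts_data):
    """Add entries from facts_data to combined, keeping existing values.

    Returns the list of host names that were already present.
    """
    duplicates = [host for host in facts_data if host in combined]
    for host, facts in facts_data.items():
        combined.setdefault(host, facts)  # only inserts when the key is new
    return duplicates

# Plain update(): the second payload silently replaces the first.
d = {"host1.com": {"architecture": "amd64"}}
d.update({"host1.com": {"architecture": "i386"}})
print(d["host1.com"]["architecture"])

# Keeping the first value and surfacing the duplicate instead.
combined = {"host1.com": {"architecture": "amd64"}}
dups = merge_without_overwrite(combined, {"host1.com": {"architecture": "i386"}})
print(dups, combined["host1.com"]["architecture"])
```

Whether the first or the last value should win depends on the data; logging the returned duplicates at least makes the collision visible.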