Python: How do I iterate over an entire directory?

python json

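I am trying to iterate over a directory containing about 4000 JSON files in order to build one combined JSON file that holds all elements of the individual files. When I try this, only about half of the JSON files end up in the result. How can I make sure that every JSON file is iterated over?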

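import json
import os

# Collect the names of all .json files in the directory.
json_files = [x for x in os.listdir(profile_directory_1) if x.endswith('.json')]
company_profiles_1 = dict()
for json_file in json_files:
    json_file_path = os.path.join('some/path', json_file)
    with open(json_file_path, 'r', encoding='utf-8') as f:
        # Merge the top-level keys of this file into the combined dict.
        company_profiles_1.update(json.load(f))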

I expected len(company_profiles_1) to be above 4000, since the directory contains more than 4000 JSON files, but I only get 2161.

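I have been working with many JSON files in a single directory, and this is how I did it. I processed more than 55,000 JSON files; it took 298 seconds to go through all of them and build a DataFrame.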

import json
import os
import time

import pandas as pd

start_time = time.time()
json_dir = 'C:\\Users\\Username\\Documents\\Jsons'
fields = ['date', 'action', 'account', 'flag', 'day', 'month', 'year']
d = {key: [] for key in fields + ['reqid']}
for filename in os.listdir(json_dir):
    path = os.path.join(json_dir, filename)
    with open(path, encoding="Latin-1") as w:
        data = json.load(w)
    # As in the original loop, iteration starts at index 1,
    # so the first 'aer' entry of each file is skipped.
    for entry in data['variables']['aer'][1:]:
        for key in fields:
            d[key].append(entry[key])
        d['reqid'].append(data['reqid'])

df = pd.DataFrame(d)
print(f"Finished in {time.time() - start_time:.0f} seconds")

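In addition, you can wrap the body of the loop in try: with except ValueError: and except KeyError: handlers, so that a malformed file or a missing key does not stop the whole run.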
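A minimal sketch of that error handling, under a few assumptions: the function name load_records, the required_key parameter, and the choice to record skipped files instead of silently ignoring them are illustrative and not part of the original answer. It relies on json.load raising json.JSONDecodeError, which is a subclass of ValueError.

```python
import json
import os


def load_records(directory, required_key='reqid'):
    """Read every .json file in `directory` and pull out `required_key`.

    Returns (records, skipped). `records` is a list of (file name, value)
    pairs; `skipped` is a list of (file name, reason) pairs for files that
    could not be parsed or that lack the key.
    """
    records = []
    skipped = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith('.json'):
            continue
        path = os.path.join(directory, name)
        try:
            with open(path, encoding='utf-8') as f:
                data = json.load(f)
            records.append((name, data[required_key]))
        except ValueError:
            # json.JSONDecodeError is a ValueError: the file is not valid JSON.
            skipped.append((name, 'invalid JSON'))
        except KeyError as exc:
            # The file parsed, but the expected key is absent.
            skipped.append((name, f'missing key {exc}'))
    return records, skipped
```

Comparing len(records) + len(skipped) with the number of .json files in the directory shows whether any file was dropped on the way.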

If you want to check how many JSON files were actually processed, you can of course build a list that collects the file names:

json_dir = 'C:\\Users\\Username\\Documents\\Jsons'
fields = ['date', 'action', 'account', 'flag', 'day', 'month', 'year']
d = {key: [] for key in fields + ['reqid']}
num_of_jsons = []  # name of every file visited by the loop
for filename in os.listdir(json_dir):
    num_of_jsons.append(filename)
    path = os.path.join(json_dir, filename)
    with open(path, encoding="Latin-1") as w:
        data = json.load(w)
    # Starts at index 1, so the first 'aer' entry of each file is skipped.
    for entry in data['variables']['aer'][1:]:
        for key in fields:
            d[key].append(entry[key])
        d['reqid'].append(data['reqid'])

print(len(num_of_jsons))  # number of files processed

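Maybe you have duplicate keys? Another suggestion: you may want to use the glob module, since your path pattern is almost fixed.

@Selcuk So the solution is to make company_profiles_1 a list instead of a dict, is that right?

@PedroLobito We can't comment on that without knowing the OP's exact requirements. Keeping the same keys and merging the duplicate values could also be acceptable.

Oh, I think making company_profiles_1 a list would work. I will test it further.

Hi, thanks for your answer. I tried your method, and the number of JSON files I go through looks correct: it matches the number of files in the directory. But I still end up with only half of them after company_profiles_1.update(data), as shown in the code above.

Keep in mind that when two dictionaries share a key, update replaces the existing value; otherwise it adds a new entry. For example, are you sure that JSON file number 2000 cannot contain a key that also exists in JSON file number 3430?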
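The point about shared keys can be checked directly. The following is a sketch rather than the OP's actual code: it assumes each file holds a JSON object at the top level, and the function name merge_json_dicts and its return shape are made up for illustration. It uses glob to list the files, as suggested in the comments, and records which keys occur in more than one file.

```python
import glob
import json
import os


def merge_json_dicts(directory):
    """Merge the top-level dicts of all *.json files in `directory`.

    Returns (merged, collisions, file_count). `collisions` maps every key
    that occurs in more than one file to the list of files containing it.
    """
    merged = {}
    seen_in = {}  # key -> names of the files in which the key appeared
    paths = sorted(glob.glob(os.path.join(directory, '*.json')))
    for path in paths:
        with open(path, encoding='utf-8') as f:
            data = json.load(f)
        for key, value in data.items():
            seen_in.setdefault(key, []).append(os.path.basename(path))
            merged[key] = value  # same effect as dict.update: the last file wins
    collisions = {k: names for k, names in seen_in.items() if len(names) > 1}
    return merged, collisions, len(paths)
```

If collisions is non-empty, len(merged) will be smaller than the total number of keys read, which would explain a count such as 2161 from more than 4000 files. When every file has to be kept separately, appending each parsed object to a list instead of calling update preserves all of them.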