Python: creating a nested statistics dictionary from a list of nested dictionaries

Tags: python, python-2.7, dictionary, nested

I have a list of many nested dictionaries, each representing a Windows operating system, like this:

windows1 = {"version": "windows 10", 
            "installed apps": {"chrome": "installed",
                               "python": {"python version": "2.7", 
                                          "folder": "c:\\python27"},
                               "minecraft": "not installed"}}

windows2 = {"version": "windows XP", 
            "installed apps": {"chrome": "not installed",
                               "python": {"python version": "not installed", 
                                          "folder": "c:\\python27"},
                               "minecraft": "not installed"}}
My goal is to create one final nested dictionary that stores the statistics for the whole list, like this:

stats_dic = {"version": {"windows 10": 20,
                         "windows 7": 4, 
                         "windows XP": 11},
             "installed apps": {"chrome": {"installed": 12, 
                                           "not installed": 6},
                                "python": {"python version": {"2.7": 4, "3.6": 8, "3.7": 2}}, 
                                "minecraft": {"installed": 15, 
                                              "not installed": 2}}}
(The snippet below is the counting step of the DataFrame-based answer further down: df has one column per dot-separated key path, and build_nested is the function defined in that answer.)

# Statistics
stats = {}

for c in df.columns:
    # Leave the folder column out of the statistics
    if 'folder' in c:
        continue

    # Count how many times each value appears in this column.
    # An exact comparison avoids regex pitfalls such as the '.' in "2.7".
    counts = {u: int((df[c] == u).sum()) for u in df[c].unique()}

    # Recreate the structure of the nested dictionary
    build_nested(stats, c, counts)

stats
>>> {'version': {'windows 10': 1, 'windows XP': 1},
     'installed apps': {'chrome': {'installed': 1, 'not installed': 1},
                        'minecraft': {'not installed': 2},
                        'python': {'python version': {'2.7': 1, 'not installed': 1}}}}

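As you can see, I am trying to take every value in each windows dict in the list, except the python folder, and use it as a key in the final nested statistics dict. The value of each of these keys is its counter, and the keys must stay nested the same way as before.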

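After some reading I understood that this can be done with a recursive function, and I have tried several, without success. The closest I got, leaving the python folder aside, is: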

stats_dic = {}
windows_list = [s1, s2.....]

def update_recursive(s,d):
    for k, v in s.iteritems():
        if isinstance(v, dict):
            update_recursive(v, d)
        else:
            if v in d.keys():
                d[v] += 1
            else:
                d.update({v: 1})
    return d

for window in windows_list:
    stats_dic = update_recursive(window, stats_dic)

For windows1 and windows2 this gives me:

{'windows XP': 1, 'windows 10': 1, '2.7': 1, 'not installed': 2, 'c:\\python27': 1, 'installed': 1}
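As you can see, it does not keep the nested form, and it lumps together the identical "not installed" values of chrome and minecraft.
The other approaches I tried either did not increment the counters or kept the nesting only one level deep. I know I am not far from the goal, but what am I missing?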

Here is a recursive function that does what I think you want it to do:

from pprint import pp # Skip if you're not running Python >= 3.8

def combiner(inp, d=None):
    # d accumulates the counts; a fresh dict is created on the first call
    if d is None:
        d = {}
    for key, value in inp.items():
        if isinstance(value, str):
            # Leaf value: count it under its own key
            x = d.setdefault(key, {})
            x.setdefault(value, 0)
            x[value] += 1
        elif isinstance(value, dict):
            # Nested dict: recurse into the matching sub-dict of d
            x = d.setdefault(key, {})
            combiner(value, x)
        else:
            raise TypeError("Unexpected type '{}' for 'value'".format(type(value)))
    return d

windows1 = {"version": "windows 10", 
            "installed apps": {"chrome": "installed",
                               "python": {"python version": "2.7", 
                                          "folder": "c:\\python27"},
                               "minecraft": "not installed"}}
windows2 = {"version": "windows XP", 
            "installed apps": {"chrome": "not installed",
                               "python": {"python version": "not installed", 
                                          "folder": "c:\\python27"},
                               "minecraft": "not installed"}}
windowsList = [windows1, windows2]

x = {}
for comp in windowsList:
    combiner(comp, x)
pp(x) # Use print if you're not running Python >= 3.8
Output:

{'version': {'windows 10': 1, 'windows XP': 1},
 'installed apps': {'chrome': {'installed': 1, 'not installed': 1},
                    'python': {'python version': {'2.7': 1, 'not installed': 1},
                               'folder': {'c:\\python27': 2}},
                    'minecraft': {'not installed': 2}}}
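
The output above still counts the `folder` entries, which the question wanted left out. A minimal sketch of one way to exclude them, assuming a hypothetical `skip` parameter (it is not part of the answer's `combiner`) that lists keys to ignore at any depth:

```python
def combiner_skipping(inp, d=None, skip=("folder",)):
    """Like combiner, but ignores every key named in skip (hypothetical parameter)."""
    if d is None:
        d = {}
    for key, value in inp.items():
        if key in skip:
            continue  # leave this key out of the statistics entirely
        if isinstance(value, dict):
            # Nested dict: recurse into the matching sub-dict of d
            combiner_skipping(value, d.setdefault(key, {}), skip)
        elif isinstance(value, str):
            # Leaf value: increment its counter under this key
            counts = d.setdefault(key, {})
            counts[value] = counts.get(value, 0) + 1
        else:
            raise TypeError("Unexpected type '{}' for 'value'".format(type(value)))
    return d

windows1 = {"version": "windows 10",
            "installed apps": {"chrome": "installed",
                               "python": {"python version": "2.7",
                                          "folder": "c:\\python27"},
                               "minecraft": "not installed"}}
windows2 = {"version": "windows XP",
            "installed apps": {"chrome": "not installed",
                               "python": {"python version": "not installed",
                                          "folder": "c:\\python27"},
                               "minecraft": "not installed"}}

stats = {}
for comp in [windows1, windows2]:
    combiner_skipping(comp, stats)
print(stats)
```

Passing `skip=()` should reproduce the behaviour of the original function, `folder` counts included.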

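Here is another solution to your request.

The answer is divided into three parts:

1. Flatten the input dictionaries
2. Create a table DataFrame
3. Compute the statistics and organize the output

To see the whole code without the step-by-step explanation, scroll to the very bottom.

Explanation

What do I mean by flattening the input dictionaries? The answer is simple: a flat dictionary is not nested, so it has only one level of key-value pairs.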

# Flat dictionary vs. nested dictionary
flat = {'a':1, 'b':2, 'c':3}
nested = {'a':1, 'b':{'c':2, 'd':3}} # 'b' has another dictionary as value
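1. Flatten the input dictionaries. Keeping a reference to the nested structure inside the keys will be very useful later, when the structure of the original dictionary is recreated.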
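
As a rough illustration of this step, here is a minimal sketch of a flattening helper. The name `flatten` and the `.` separator are assumptions (the separator is chosen to match the `split('.')` used by `build_nested` later in this answer); the answer's own flattening code may look different.

```python
def flatten(nested, parent_key="", sep="."):
    """Return a one-level dict whose keys are sep-joined paths into nested."""
    flat = {}
    for key, value in nested.items():
        # Prefix the key with the path of its parents, if any
        path = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Descend and merge the flattened sub-dict
            flat.update(flatten(value, path, sep))
        else:
            flat[path] = value
    return flat

nested = {'a': 1, 'b': {'c': 2, 'd': 3}}
print(flatten(nested))  # {'a': 1, 'b.c': 2, 'b.d': 3}
```

A list of such flat dictionaries maps naturally onto table rows, with one column per key path.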

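2. Create a table DataFrame.

3. Compute the statistics and organize the output. From here we can recreate the structure of the original dictionary, because the DataFrame columns carry those references. The following function builds a nested dictionary whose structure mirrors the original and that holds the statistics described above: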

# Recreate structured dictionary
def build_nested(struct, tree, res):
    # Split off the first path component: 'a.b.c' -> head 'a', rest 'b.c'
    head, _, rest = tree.partition('.')

    # Reuse the sub-dict for this component, creating it if needed
    node = struct.setdefault(head, {})

    if rest:
        # More components left: keep descending, one level per component
        build_nested(node, rest, res)
    else:
        # Last component: store the counts here
        node.update(res)

    return struct
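
To see how counts and key paths combine without a DataFrame, here is a small sketch. The flattened records are written out by hand, `collections.Counter` stands in for the per-column counting (an assumption, not the answer's method), and the block carries its own compact `build_nested` so that it runs standalone.

```python
from collections import Counter

def build_nested(struct, tree, res):
    """Store the counts res in struct under the dot-separated path tree."""
    head, _, rest = tree.partition('.')
    node = struct.setdefault(head, {})
    if rest:
        build_nested(node, rest, res)
    else:
        node.update(res)
    return struct

# Hand-written flattened records (assumed shape: one dotted key per leaf value)
records = [
    {"version": "windows 10",
     "installed apps.chrome": "installed",
     "installed apps.python.python version": "2.7",
     "installed apps.python.folder": "c:\\python27",
     "installed apps.minecraft": "not installed"},
    {"version": "windows XP",
     "installed apps.chrome": "not installed",
     "installed apps.python.python version": "not installed",
     "installed apps.python.folder": "c:\\python27",
     "installed apps.minecraft": "not installed"},
]

stats = {}
for column in records[0]:
    if 'folder' in column:
        continue  # the folder path is excluded from the statistics
    # Count each value of this key across all records
    counts = Counter(record[column] for record in records)
    build_nested(stats, column, dict(counts))

print(stats)
```

This sketch assumes every record has the same keys; records with missing keys would need `record.get(column)` or a similar guard.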
Full code
Comments:

You never do anything with k. You need to add nested dicts keyed by k to stats_dic as you recurse in order to get anything close to your desired output.

@acushner Hi, do you mean d[k] = update_recursive(v, d) in the first if?

Exactly what I needed, thank you very much. Learned about d.setdefault and pp, wow!

I learned a lot of new things reading this. It is a very clever way to solve the problem and will help me in the future. Thanks for the detailed answer, much appreciated!

Glad to hear this helped. Thanks for the feedback.
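
The suggestion in the first comment (use `k` while recursing) can be sketched as a small change to the question's `update_recursive`. This is one possible reading of the comment, written for Python 3 (`items()` instead of `iteritems()`), with a shortened two-entry `windows_list` made up for the example:

```python
def update_recursive(s, d):
    """Count the leaf values of s into d, mirroring the nesting of s."""
    for k, v in s.items():
        if isinstance(v, dict):
            # Recurse into the sub-dict stored under the same key k
            update_recursive(v, d.setdefault(k, {}))
        else:
            # Leaf: count v under its own key k instead of at the top level
            counts = d.setdefault(k, {})
            counts[v] = counts.get(v, 0) + 1
    return d

# Shortened sample data, for illustration only
windows_list = [
    {"version": "windows 10",
     "installed apps": {"chrome": "installed", "minecraft": "not installed"}},
    {"version": "windows XP",
     "installed apps": {"chrome": "not installed", "minecraft": "not installed"}},
]

stats_dic = {}
for window in windows_list:
    update_recursive(window, stats_dic)

print(stats_dic)
```

Because each leaf is counted under its own key, the "not installed" values of chrome and minecraft no longer share one counter.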