Python: return **unique** counts from each row of JSON data


python python-3.x


I have been working on this for a while and cannot seem to get past it. I have a chunk of JSON data that looks like this:

0    [{'code': '8', 'name': 'Human development'}, {'code': '8', 'name': 'Human development'}
1    [{'code': '1', 'name': 'Economic management'},{'code': '8', 'name': 'Human development'}
2    [{'code': '5', 'name': 'Trade and integration'},{'code': '1', 'name': 'Economic management'}
3    [{'code': '7', 'name': 'Social dev/gender/inclusion'}]

I am trying to produce a count for each value, ending up with something like this:

Human development : 2
Economic management : 2
Trade and integration : 1
Social dev/gender/inclusion : 1

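Note: some rows are coded twice (like the first row) and should be counted only once.

I have tried a lot of different things; the closest I have come is this: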

for i in range(0, len(wbp['code'])):
    # create a counter for the next step, counting the number of values of each subdict
    number = len(wbp['code'][i]) - 1

    # create empty values
    dictd = dict()
    lis = []

    # iterate across the sublist
    for j in range(0, number):
        temp_list = []
        temp_list.append(wbp['code'][i][int(j)]['name'])
        # using set to return only unique values
        lis = tuple(set(temp_list))
        if lis in dictd.keys():
            dictd[lis] += 1
        else:
            dictd[lis] = 1
        #lis.append(temp_list)
        #value=[[x,lis.count(x)] for x in lis]
    print(dictd)

{('Human development',): 1}
{('Economic management',): 1}
{('Trade and integration',): 1, ('Public sector governance',): 1, ('Environment and natural resources management',): 1}
{('Social dev/gender/inclusion',): 1}
{('Trade and integration',): 1}
{('Social protection and risk management',): 1}
{('Public sector governance',): 1}
{('Environment and natural resources management',): 1}
{('Rural development',): 1}
{('Public sector governance',): 2}
{('Rural development',): 1}
{('Rural development',): 1, ('Social protection and risk management',): 2}
{}
{('Trade and integration',): 1, ('Environment and natural resources management',): 1}
{('Social protection and risk management',): 2}
{('Rural development',): 1, ('Environment and natural resources management',): 1}
{('Rural development',): 1}
{('Human development',): 1}

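This is not right: the counter does not work outside the inner dict, so the counts never accumulate across rows, which is not what I want. All I can think is that there must be a better way to do this.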

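Edit: It seems I did a poor job of being clear. The dataset itself contains errors: entries like row 0 have duplicates, and these should not be counted twice. The expected count for Human development is 2, not 3, because the first row is in error.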

I don't understand the second loop: temp_list is created as an empty list on every iteration, so why do you need

lis = tuple(set(temp_list))

Instead, read the name into a variable:
name = wbp['code'][i][int(j)]['name']

if name in dictd.keys():
    dictd[name]+=1
else:
    dictd[name]=1
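
To show how that snippet might fit into a complete loop, here is a minimal sketch. It assumes `wbp['code']` behaves like a list of rows, each a list of dicts (a plain dict stands in for the real object here). `dictd` is created once, before the outer loop, so the counts accumulate across rows, and a per-row `seen` set (a name introduced only for illustration) keeps a name that is repeated within one row from being counted twice. It also iterates over the items directly, because `range(0, len(row) - 1)` would skip the last item of every row.

```python
# Sample data copied from the question; the plain dict wrapper is an assumption
# standing in for whatever object `wbp` really is.
wbp = {'code': [
    [{'code': '8', 'name': 'Human development'}, {'code': '8', 'name': 'Human development'}],
    [{'code': '1', 'name': 'Economic management'}, {'code': '8', 'name': 'Human development'}],
    [{'code': '5', 'name': 'Trade and integration'}, {'code': '1', 'name': 'Economic management'}],
    [{'code': '7', 'name': 'Social dev/gender/inclusion'}],
]}

dictd = {}  # created once, outside both loops, so counts accumulate across rows

for row in wbp['code']:
    seen = set()  # names already counted for this row (hypothetical helper)
    for item in row:
        name = item['name']
        if name in seen:
            continue  # duplicate within the same row: skip it
        seen.add(name)
        if name in dictd:
            dictd[name] += 1
        else:
            dictd[name] = 1

for name, count in dictd.items():
    print(name, ':', count)
```

With the sample rows above, Human development and Economic management each come out as 2, and the other two names as 1.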

Since the details of the input are unclear, I am assuming your input looks like the following, together with this code:

wbp = [[{'code': '8', 'name': 'Human development'}, {'code': '8', 'name': 'Human development'}],
       [{'code': '1', 'name': 'Economic management'}, {'code': '8', 'name': 'Human development'}],
       [{'code': '5', 'name': 'Trade and integration'}, {'code': '1', 'name': 'Economic management'}],
       [{'code': '7', 'name': 'Social dev/gender/inclusion'}]]

dictd = dict()

for record in wbp:
    names = set([item['name'] for item in record])  # Remove duplicate names using set
    for name in names:
        dictd[name] = dictd.get(name, 0) + 1  # If name not found, then 0 + 1, else count + 1

print(dictd)

This results in:
{
"Economic management": 2,
"Social dev/gender/inclusion": 1,
"Human development": 2,
"Trade and integration": 1
}
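
The same per-row deduplication could also be written with `collections.Counter` from the standard library. This is only a sketch under the same assumption as the answer, namely that `wbp` is a list of rows and each row is a list of dicts with a `name` key; the variable name `counts` is introduced here for illustration.

```python
from collections import Counter

# Assumed input shape, copied from the answer: one list of dicts per row.
wbp = [
    [{'code': '8', 'name': 'Human development'}, {'code': '8', 'name': 'Human development'}],
    [{'code': '1', 'name': 'Economic management'}, {'code': '8', 'name': 'Human development'}],
    [{'code': '5', 'name': 'Trade and integration'}, {'code': '1', 'name': 'Economic management'}],
    [{'code': '7', 'name': 'Social dev/gender/inclusion'}],
]

counts = Counter()
for record in wbp:
    # The set comprehension collapses duplicates within a row,
    # so each name adds at most 1 per row.
    counts.update({item['name'] for item in record})

for name, count in counts.most_common():
    print(name, ':', count)
```

`Counter.update` adds one for every element of the iterable it receives, so passing a set per row gives the "once per row" behaviour without an explicit `dict.get` call.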
I don't understand your example. Give me an input and an expected output. Should the count for 'Human development' be 3? Is your input JSON like this: {[{'code': '8', 'name': 'Human development'}, {'code': '8', 'name': 'Human development'}, {'code': '1', 'name': 'Economic management'}, {'code': '8', 'name': 'Human development'}, {'code': '1', 'name': 'Economic management'}, {'[{'code': '7', 'name': 'Social dev/gender/inclusion'}]}

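@SwadhikarC, no, the count should be 2, because the first row is wrong. No entry may count a value twice. And yes, the input looks like that (a substring of it, anyway).

Ike, glad it worked, and thanks for the credit.

Yes on the data structure. The issue is that Human development should be 2, because the first row is an error (each row can contain an item only once).

OK, I have modified the input and filtered out the duplicate entries using set. The result now matches what you expected.

This seems to be a reworked approach, and it worked well.

The problem is the duplication error, which is what I was trying to explain. The first row should correctly be [{'code': '8', 'name': 'Human development'}] rather than [{'code': '8', 'name': 'Human development'}, {'code': '8', 'name': 'Human development'}], but since that is how the data is, I have to build a unique set for each row. Surely there must be a simpler way to solve this?

See the comment above: this is close, but it ignores the duplication issue. The problem is that the duplicates are errors, as in the first row.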
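
Because the duplicates are described as errors in the data, it may help to locate the affected rows before counting. The following is a minimal sketch, assuming the same list-of-lists layout used in the answer; `duplicated_names` and `bad_rows` are hypothetical names introduced here, not part of the original code.

```python
from collections import Counter

# Assumed input shape, copied from the answer: one list of dicts per row.
wbp = [
    [{'code': '8', 'name': 'Human development'}, {'code': '8', 'name': 'Human development'}],
    [{'code': '1', 'name': 'Economic management'}, {'code': '8', 'name': 'Human development'}],
    [{'code': '5', 'name': 'Trade and integration'}, {'code': '1', 'name': 'Economic management'}],
    [{'code': '7', 'name': 'Social dev/gender/inclusion'}],
]


def duplicated_names(record):
    """Return the names that occur more than once within a single row."""
    per_row = Counter(item['name'] for item in record)
    return sorted(name for name, n in per_row.items() if n > 1)


# Map row index -> duplicated names, keeping only rows that have any.
bad_rows = {}
for index, record in enumerate(wbp):
    dupes = duplicated_names(record)
    if dupes:
        bad_rows[index] = dupes

print(bad_rows)  # {0: ['Human development']}
```

Only row 0 is reported for the sample data, which matches the row the question singles out as an error.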
