Python: grouping items in a list based on certain items in the list


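I have a list in which every entry has two elements: a company id and a group number. I want to split these companies into separate lists by group number, so that I can run regressions on each individual group. My list looks like this: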

59872004    0
74202004    0
1491772004  1
1476392004  1
309452004   1
1171452004  1
150842004   2
143592004   2
76202004    2
119232004   2
80492004    2
291732004   2
My current code is as follows:

list_of_variables = []
with open(str(csv_path) + "2004-297-100.csv", 'r') as csvFile:
    reader = csv.reader(csvFile)
    for row in reader:
        list_of_variables.append(row)
    del list_of_variables[0]

list_of_lists = []
counter = 0
counter_list = 0
one_cluster = []
variable = []
for line in list_of_variables:
    print('counter: ', counter_list)
    # for testing purposes
    if counter_list == 20:
        break
    # print("cluster: ", cluster)
    # append the first line from the list to the intermediary list
    if counter_list == 0:
        one_cluster.append(line)
    if counter_list >= 1:
        if line[1] == variable[1]:
            one_cluster.append(line)
    print("one cluster : ", one_cluster)
    variable = one_cluster[counter-1]
    # print('line : ', line[1])
    # print('variable : ', variable[1])
    counter += 1
    # if the grouped number changed put the list into the final list
    # clear the intermediary list and append the current element which was not part of the previous group
    if line[1] != variable[1]:
        list_of_lists.append(one_cluster.copy())
        # print("here", list_of_lists)
        one_cluster.clear()
        one_cluster.append(line)
        counter = 0
    # print('variable', variable)
    # print('one_cluster ', one_cluster)
    counter_list += 1


print(list_of_lists)
The output of this code is as follows:

[[['59872004', '0'], ['74202004', '0']], [['1491772004', '1'], ['309452004', '1'], ['1171452004', '1']], [['150842004', '2'], ['76202004', '2'], ['119232004', '2'], ['80492004', '2'], ['291732004', '2']]]

Expected output of the code:

[[['59872004', '0'], ['74202004', '0']], [['1491772004', '1'], ['1476392004', '1'], ['309452004', '1'], ['1171452004', '1']], [['150842004', '2'], ['143592004', '2'], ['76202004', '2'], ['119232004', '2'], ['80492004', '2'], ['291732004', '2']]]

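If you look closely, group 0 is handled correctly, but every other group is missing companies. For example, group 1 should have 4 elements, but my code outputs only 3, and so on. I looked around but did not find an easier way to do this. If you know how to solve this, or can point me in the right direction, I would be very grateful.

Thank you for your time and patience.


Update: I have changed the list from an image to something that can be copied, and added the expected output.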

I found the answer to my problem. If I cut the line

variable = one_cluster[counter-1]

and place it above

if counter_list >= 1:
    if line[1] == variable[1]:
        one_cluster.append(line)
so that the for loop reads as follows:

for line in list_of_variables:
    print('counter: ', counter_list)
    if counter_list == 50:
        break
    # print("cluster: ", cluster)
    if counter_list == 0:
        one_cluster.append(line)
    variable = one_cluster[counter - 1]
    if counter_list >= 1:
        if line[1] == variable[1]:
            one_cluster.append(line)
    print("one cluster : ", one_cluster)

    # print('line : ', line[1])
    # print('variable : ', variable[1])
    counter += 1
    if line[1] != variable[1]:
        list_of_lists.append(one_cluster.copy())
        # print("here", list_of_lists)
        one_cluster.clear()
        one_cluster.append(line)
        counter = 0
    # print('variable', variable)
    # print('one_cluster ', one_cluster)
    counter_list += 1

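then everything works as expected. I struggled with this for a long time before the idea suddenly came to me... However, if anyone has an easier way to do this, I am open to suggestions.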

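Your code is overly complicated. If your goal is to group all these companies by the second column of the csv file, just add the following code after reading the file: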

from collections import defaultdict

grouping = defaultdict(list)

for line in list_of_variables:
    grouping[line[1]].append(line[0])
Now, if you want to work with one group of elements, say group 1, just iterate over it:

for company in grouping['1']:  # csv.reader yields strings, so the keys are '0', '1', ...
    print(company)
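To make the defaultdict approach easy to try, here is a minimal end-to-end sketch. It inlines the sample rows from the question as CSV text instead of opening the real file; the header names `company_id` and `group` and the helper `group_companies` are assumptions made for illustration. It also converts the group number to `int`, so that integer keys such as `grouping[1]` work.

```python
import csv
import io
from collections import defaultdict

# Sample rows from the question, inlined as CSV text.
# The header names are assumed; the real file's header is not shown.
SAMPLE_CSV = """company_id,group
59872004,0
74202004,0
1491772004,1
1476392004,1
309452004,1
1171452004,1
150842004,2
143592004,2
76202004,2
119232004,2
80492004,2
291732004,2
"""


def group_companies(csv_file):
    """Return {group_number: [company_id, ...]} from a two-column CSV file object."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, like `del list_of_variables[0]`
    grouping = defaultdict(list)
    for company_id, group in reader:
        # csv.reader yields strings; convert so that grouping[1] works
        grouping[int(group)].append(company_id)
    return dict(grouping)


grouping = group_companies(io.StringIO(SAMPLE_CSV))
for company in grouping[1]:
    print(company)
```

Unlike the loop in the question, this does not depend on the rows being sorted by group number, because each row is filed under its own key.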

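Are these two separate lists?
Could you provide the list as a code block, so that people can easily copy the data sample and work with it? As an image it really isn't workable. It also wouldn't hurt to describe the logic you are after, since the first paragraph doesn't state it clearly, and if the logic in your code doesn't produce the output you expect, it probably can't be trusted... Your expected output would be nice too :)
Is the group number sorted?
@TeraBaapBC Yes, the data is sorted by group number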
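Since the data is confirmed to be sorted by group number, `itertools.groupby` is another option that yields the nested list-of-lists shape the question asks for. This is only a sketch under that sortedness assumption; the `rows` literal stands in for `list_of_variables` after the header has been removed.

```python
from itertools import groupby
from operator import itemgetter

# Rows as csv.reader would yield them: [company_id, group_number], both strings.
rows = [
    ['59872004', '0'], ['74202004', '0'],
    ['1491772004', '1'], ['1476392004', '1'], ['309452004', '1'], ['1171452004', '1'],
    ['150842004', '2'], ['143592004', '2'], ['76202004', '2'],
    ['119232004', '2'], ['80492004', '2'], ['291732004', '2'],
]

# groupby only merges adjacent rows with equal keys,
# so the input must already be sorted by group number.
list_of_lists = [list(cluster) for _, cluster in groupby(rows, key=itemgetter(1))]

print(list_of_lists)
```

If the rows were not sorted, they would need `rows.sort(key=itemgetter(1))` first, or the dictionary-based grouping instead.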