Organizing a dataset in Python


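I have a .csv dataset containing a large number of idioms. Each line holds three comma-separated elements that I would like to separate:

1) An index number (0, 1, 2, 3…)

2) The idiom itself

3) Whether the idiom is positive/negative/neutral

Here is a small sample of the .csv file: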

0,"I did touch them one time you see but of course there was nothing doing, he wanted me.",neutral

1,We find that choice theorists admit that they introduce a style of moral paternalism at odds with liberal values.,neutral

2,"Well, here I am with an olive branch.",positive

3,"Its rudder and fin were both knocked out, and a four-foot-long gash in the shell meant even repairs on the bank were out of the question.",negative
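As you can see, some idioms are wrapped in quotes and some are not. I don't think that should be hard to sort out, though.

I think the best way to organize this in Python is with a dictionary, like this: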

example_dict = {0: ['This is an idiom.', 'neutral']}
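So, how can I split each line into three separate strings (based on the commas), then use the first string as the key and the last two as the corresponding list items in the dict?

My first idea was to split on the commas with the following code: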

for line in file:    
    new_item = ','.join(line.split(',')[1:])
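But all this does is strip everything up to the first comma in a line, and I don't think running a series of iterations like this would be efficient.

I would appreciate some advice on the best way to organize the data like this.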

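Python has a module, csv, dedicated to handling csv files. In this case you can use it to create a list of lists from the file. For now, let's call your file idioms.csv.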

import csv
with open('idioms.csv', newline='') as idioms_file:
    reader = csv.reader(idioms_file, delimiter=',', quotechar='"')
    idioms_list = [line for line in reader]

# Now you have a list that looks like this (csv.reader returns every field as a string):
# [['0', 'I did touch them...', 'neutral'],
#  ['1', 'We find that choice...', 'neutral'],
#  ...
# ]
Now you can sort or organize the data however you like.
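To get the dictionary shape asked for in the question ({0: ['This is an idiom.', 'neutral']}), one option is to unpack each parsed row and convert the index to an int. The following is only a minimal sketch: the helper name build_idiom_dict is made up for illustration, and io.StringIO stands in for the real file so the snippet runs on its own. It assumes every non-empty row has exactly three fields.

```python
import csv
import io

# Sample rows copied from the question; io.StringIO stands in for the real file.
sample = '''0,"I did touch them one time you see but of course there was nothing doing, he wanted me.",neutral
1,We find that choice theorists admit that they introduce a style of moral paternalism at odds with liberal values.,neutral
2,"Well, here I am with an olive branch.",positive
'''


def build_idiom_dict(csv_file):
    """Map each integer index to [idiom, sentiment] (hypothetical helper)."""
    idioms = {}
    for row in csv.reader(csv_file):
        if not row:
            # csv.reader yields an empty list for blank lines; skip them.
            continue
        # Assumes exactly three fields per row; raises ValueError otherwise.
        index, idiom, sentiment = row
        idioms[int(index)] = [idiom, sentiment]
    return idioms


idiom_dict = build_idiom_dict(io.StringIO(sample))
print(idiom_dict[2])
```

With a real file, the same function can be called as build_idiom_dict(idioms_file) inside the with open('idioms.csv', newline='') block. Commas inside quoted idioms are left intact because the csv module handles the quoting, which a plain line.split(',') does not.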