Python: Converting CSV rows to a dictionary

python csv pandas


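I am analyzing a large data file of user reviews. I have been asked to convert each row into a dictionary whose keys are the words in the review and whose values are the number of times each word occurs in that row, so that word usage can be analyzed.

With the code below I can split the data, but I cannot convert it into a dictionary.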

import csv
import pandas as pd

products = pd.read_csv('product_comments.csv')
products['words_count'] = csv.DictReader(products['review'].str.lower().str.split())

Please help me solve this problem.

You can apply a Counter to the review column to get a dictionary of word frequencies.

A random example based on the Unix word list:
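import numpy as np
import pandas as pd

word_file = "/usr/share/dict/words"
words = open(word_file).read().splitlines()[10:50]
random_word_list = [[' '.join(np.random.choice(words, size=100, replace=True))] for i in range(50)]
df = pd.DataFrame(random_word_list, columns=['reviews'])  # one review string per row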

df.head()

                                             reviews
0  abaculus abacinate abalienate abaff abalone ab...
1  abalienation abacus abaction abacination abaca...
2  Ababdeh abalienate abaiser abaff abaca abactin...
3  abaction Aaru abandonee abalienate Aaronic aba...
4  abandon abampere abactor abactor abandon abacu...

Split each review on spaces, then use apply() with the built-in collections.Counter:
from collections import Counter
df.reviews.str.split(' ').apply(lambda x: Counter(x))

You will get:
0     {'Ababua': 5, 'abandon': 7, 'abaction': 3, 'ab...
1     {'Aaronical': 3, 'abandon': 1, 'abaction': 4, ...
2     {'Aaronical': 5, 'Ababua': 1, 'abaction': 1, '...
3     {'Aaronical': 3, 'abandon': 1, 'abaction': 7, ...
4     {'Aaronical': 4, 'abandon': 2, 'abaction': 2, ...
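
Applied to the data in the question, the same idea could look like the sketch below. The two-row `products` frame is a made-up stand-in for `product_comments.csv`, which is not shown, and `count_words` is a hypothetical helper name. It lowercases each review as the question's code does, and it returns an empty dict for missing reviews, which pandas reads as NaN rather than as strings.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-in for product_comments.csv; the real file is not shown.
products = pd.DataFrame({
    "name": ["Wipe pouch", "Quilt"],
    "review": ["Love it love the pouch", None],
})


def count_words(text):
    """Return a plain dict mapping each lowercase word to its count."""
    if not isinstance(text, str):  # missing reviews are not strings
        return {}
    return dict(Counter(text.lower().split()))


# One dictionary per row, stored in a new column.
products["words_count"] = products["review"].apply(count_words)
print(products["words_count"].tolist())
```

Calling `split()` with no argument splits on any run of whitespace, so repeated spaces do not produce empty-string keys the way `split(' ')` can.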

Show us the data you read from the csv file, and please edit your code properly. csv.DictReader is meant for operating on text files, not on pandas data structures.

I have two columns: name and review. The name column holds product names such as 1) Planetwise Wipe Pouch, 2) Annas Dream Full Quilt with 2 Shams, 3) Stop Pacifier Sucking without tears with Thumbuddy To Love's Binky Fairy Puppet and Adorable Book, and so on. The review column holds customer reviews such as 1) "it came early and was not disappointed. i love planet wise bags and now my wipe holder. it keps my osocozy wipes moist and does not leak. highly recommend it." I need to add a new column, products['words_count'], of dictionary type, with each word as a key and its count as the value.
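
To illustrate the comment about csv.DictReader: it reads from a file-like object and yields one dictionary per CSV row, keyed by the header names. It does not count words. A minimal sketch follows, using an in-memory `io.StringIO` with made-up sample rows in place of a real file:

```python
import csv
import io

# Hypothetical two-row sample standing in for a CSV file on disk.
sample = io.StringIO("name,review\nWipe pouch,Came early\nQuilt,Very soft\n")

# Each row becomes a dict keyed by the column names from the header line.
rows = list(csv.DictReader(sample))
print(rows[0])
```

Passing a pandas Series of word lists to csv.DictReader, as in the question, therefore does not produce word counts; that is what collections.Counter is for.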