Tokenizing words in Python

python list csv


I just want to tokenize the rows of a CSV file, but the code below raises the following error:

"list indices must be integers or slices, not str"

summaries = []
texts = []
with open("C:\\Users\\apandey\\Documents\\Reviews.csv","r",encoding="utf8") as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        clean_text = clean(row['Text'])
        clean_summary = clean(row['Summary'])
        summaries.append(word_tokenize(clean_summary))
        texts.append(word_tokenize(clean_text))

I believe your CSV file looks like this:

Id,ProductId,UserId,ProfileName,HelpfulnessNumerator,HelpfulnessDenominator,Score,Time,Summary,Text
1,'B001E4KFG0','A3SGXH7AUHU8GW','delmartian',1,1,5,1303862400,'Good Quality Dog Food','I have bought several of the Vitality canned dog food products and have found them all to be of good quality...'

csv.reader yields each row as a list of strings, so row['Text'] raises this error. Use csv.DictReader instead, as Peter Wood suggested in the comments; it yields each row as a dict keyed by the header line:

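import csv
from nltk.tokenize import word_tokenize  # requires NLTK and its tokenizer data

summaries = []
texts = []
with open("foo.csv", encoding="utf8", newline='') as csvfile:
    # DictReader maps each row to a dict keyed by the header line
    reader = csv.DictReader(csvfile)
    for row in reader:
        clean_text = row["Text"]
        clean_summary = row["Summary"]
        summaries.append(word_tokenize(clean_summary))
        texts.append(word_tokenize(clean_text))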

Output:

# texts
[["'I", 'have', 'bought', 'several', 'of', 'the', 'Vitality', 'canned', 'dog', 'food', 'products', 'and', 'have', 'found', 'them', 'all', 'to', 'be', 'of', 'good', 'quality', '.', 'The', 'product', 'looks', 'more', 'like', 'a', 'stew', 'than', 'a', 'processed', 'meat', 'and', 'it', 'smells', 'better', '.', 'My', 'Labrador', 'is', 'finicky', 'and', 'she', 'appreciates', 'this', 'product', 'better', 'than', 'most', '.', "'"]]

# summaries
[["'Good", 'Quality', 'Dog', 'Food', "'"]]

Comments:

I think row is a list. You can't access elements in a list using string indices. You may need csv.DictReader.

Hey, thanks.. But when I use DictReader, it gives me an error: "expected string or bytes-like object" @Peter Wood

Please paste a sample of Reviews.csv

['Id', 'ProductId', 'UserId', 'ProfileName', 'HelpfulnessNumerator', 'HelpfulnessDenominator', 'Score', 'Time', 'Summary', 'Text'] I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most."] This is just one row.
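The follow-up error mentioned in the comments, "expected string or bytes-like object", is what Python's re module raises when it is handed a non-string such as None. The thread does not say what caused it, so the following is only a sketch of one plausible cause: csv.DictReader fills missing fields of a short row with None, and tokenizing that value fails. The sample data, simple_tokenize (a regex stand-in for nltk's word_tokenize, used so the example runs without NLTK) and tokenize_column are all hypothetical names introduced for illustration.

```python
import csv
import io
import re

# Hypothetical sample: the second data row is short, so it has no "Text" field.
SAMPLE = (
    "Id,Summary,Text\n"
    "1,Good Quality Dog Food,I have bought several products\n"
    "2,Not as Advertised\n"
)


def simple_tokenize(text):
    """Stand-in for nltk's word_tokenize: splits into words and punctuation."""
    return re.findall(r"\w+|[^\w\s]", text)


def tokenize_column(csvfile, column):
    """Tokenize one column, skipping rows where the field is missing."""
    tokens = []
    for row in csv.DictReader(csvfile):
        value = row.get(column)
        if not isinstance(value, str):
            # DictReader fills missing fields with None (its default restval),
            # and passing None to a regex-based tokenizer raises TypeError.
            continue
        tokens.append(simple_tokenize(value))
    return tokens


texts = tokenize_column(io.StringIO(SAMPLE), "Text")
summaries = tokenize_column(io.StringIO(SAMPLE), "Summary")
print(texts)
print(summaries)
```

With the real data, the same isinstance check could be placed in front of each word_tokenize call; whether skipping such rows or substituting an empty string is more appropriate depends on how the lists are used afterwards.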