Python: reading a paragraph as a single list element

I have a text file containing thousands of reviews that look like this:

+1  This book is such a life saver.  It has been so helpful to be able to go back to track trends, answer pediatrician questions, or communicate with each other when you are up at different times of the night with a newborn.  I think it is one of those things that everyone should be required to have before they leave the hospital.  We went through all the pages of the newborn version, then moved to the infant version, and will finish up the second infant book (third total) right as our baby turns 1.  See other things that are must haves for baby at [...]
+1  I bought this a few times for my older son and have bought it again for my newborn. This is super easy to use and helps me keep track of his daily routine. When he started going to the sitter when I went back to work, it helped me know how his day went to better prepare me for how the evening would most likely go. When he was sick, it help me keep track of how many diapers a day he was producing to make sure he was getting dehydrated. The note sections to the side and bottom are useful too because his sitter writes in small notes about whether or not he liked his lunch or if the playtime included going for a walk, etc.Excellent for moms who are wanting to keep track of their kids daily routine even though they are at work. Excellent for dads to keep track as my husband can quickly forget what time he fed our son. LOL
+1  This is great for basics, but I wish the space to write things in was bigger. A lot times I need struggle trying to read what the caretaker wrote in because the spaces go together.
+1  This book is perfect!  I'm a first time new mom, and this book made it so easy to keep track of feedings, diaper changes, sleep.  Definitely would recommend this for new moms.  Plus it's small enough that I throw in the diaper back for doctor visits.
Each review is on its own line, and the sentiment is separated from the review text by a tab.

Here is my code, which correctly puts each sentiment and each review into its own array:

# read in training data, 18506 reviews
trainingFile = open(r"D:\Desktop\\1565964985_2925534_train_file.data", "r")

# arrays for the sentiments and reviews
sentiment = []
review = []

# for loop that reads each line
for line in trainingFile:
    # data field array separated by tab
    dataFields = line.split("\t")

    # sentiment holds the positive or negative sentiment of the review
    sentiment.append(dataFields[0])
    # review holds the text from the review
    review.append(dataFields[1])

# test print statement
for x in range(len(sentiment)):
    print(sentiment[x])

for x in range(len(review)):
    print(review[x])
Here is the problem: I am building a bag of words from these review paragraphs, and with the current code:

print(review[0])
print(type(review[0]))
count = CountVectorizer()
docs = numpy.array(review[0])
bag = count.fit_transform(docs)
print(bag.toarray())
I get this error:

TypeError: iteration over a 0-d array    
So I tried reading in the reviews a different way:

review.append(dataFields[1].split())
Here are my results:

['This', 'book', 'is', 'such', 'a', 'life', 'saver.', 'It', 'has', 'been', 'so', 'helpful', 'to', 'be', 'able', 'to', 'go', 'back', 'to', 'track', 'trends,', 'answer', 'pediatrician', 'questions,', 'or', 'communicate', 'with', 'each', 'other', 'when', 'you', 'are', 'up', 'at', 'different', 'times', 'of', 'the', 'night', 'with', 'a', 'newborn.', 'I', 'think', 'it', 'is', 'one', 'of', 'those', 'things', 'that', 'everyone', 'should', 'be', 'required', 'to', 'have', 'before', 'they', 'leave', 'the', 'hospital.', 'We', 'went', 'through', 'all', 'the', 'pages', 'of', 'the', 'newborn', 'version,', 'then', 'moved', 'to', 'the', 'infant', 'version,', 'and', 'will', 'finish', 'up', 'the', 'second', 'infant', 'book', '(third', 'total)', 'right', 'as', 'our', 'baby', 'turns', '1.', 'See', 'other', 'things', 'that', 'are', 'must', 'haves', 'for', 'baby', 'at', '[...]']
<class 'list'>
[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
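Output I want, obtained by hard-coding review[0] as a list holding the whole review as one string (see the full code below):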

<class 'list'>
['able', 'all', 'and', 'answer', 'are', 'as', 'at', 'baby', 'back', 'be', 'been', 'before', 'book', 'communicate', 'different', 'each', 'everyone', 'finish', 'for', 'go', 'has', 'have', 'haves', 'helpful', 'hospital', 'infant', 'is', 'it', 'leave', 'life', 'moved', 'must', 'newborn', 'night', 'of', 'one', 'or', 'other', 'our', 'pages', 'pediatrician', 'questions', 'required', 'right', 'saver', 'second', 'see', 'should', 'so', 'such', 'that', 'the', 'then', 'they', 'things', 'think', 'third', 'this', 'those', 'through', 'times', 'to', 'total', 'track', 'trends', 'turns', 'up', 'version', 'we', 'went', 'when', 'will', 'with', 'you']
[[1 1 1 1 2 1 2 2 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 1 1 1 1 2 1 3 1
  1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 6 1 1 2 1 1 1 1 1 1 5 1 1 1 1 2 2 1 1 1 1
  2 1]]

So my big question is: how do I read my text into the format I want?

Full code + outputs:

import re
from sklearn.feature_extraction.text import CountVectorizer
import numpy
import math

# def euclideanDistance


# read in training data, 18506 reviews
trainingFile = open(r"D:\Desktop\\1565964985_2925534_train_file.data", "r")

# arrays for the sentiments and reviews
sentiment = []
review = []

# for loop that reads each line
for line in trainingFile:
    # data field array separated by tab
    dataFields = line.split("\t")

    # sentiment holds the positive or negative sentiment of the review
    sentiment.append(dataFields[0])
    # review holds the text from the review
    review.append(dataFields[1].split())

# test print statement
#for x in range(len(sentiment)):
   # print(sentiment[x])

#for x in range(len(review)):
   # print(review[x])

print(review[0])
print(type(review[0]))
count = CountVectorizer()
docs = numpy.array(review[0])
bag = count.fit_transform(docs)
print(bag.toarray())
print("\n\n\n")

review[0] = ["This book is such a life saver.  It has been so helpful to be able to go back to track trends, answer"
             " pediatrician questions, or communicate with each other when you are up at different times of the night"
             " with a newborn.  I think it is one of those things that everyone should be required to have before they"
             " leave the hospital.  We went through all the pages of the newborn version, then moved to the infant"
             " version, and will finish up the second infant book (third total) right as our baby turns 1."
             "  See other things that are must haves for baby at [...]"]
print(type(review[0]))
count = CountVectorizer()
docs = numpy.array(review[0])
bag = count.fit_transform(docs)
print(count.get_feature_names())
print(bag.toarray())

trainingFile.close()

['This', 'book', 'is', 'such', 'a', 'life', 'saver.', 'It', 'has', 'been', 'so', 'helpful', 'to', 'be', 'able', 'to', 'go', 'back', 'to', 'track', 'trends,', 'answer', 'pediatrician', 'questions,', 'or', 'communicate', 'with', 'each', 'other', 'when', 'you', 'are', 'up', 'at', 'different', 'times', 'of', 'the', 'night', 'with', 'a', 'newborn.', 'I', 'think', 'it', 'is', 'one', 'of', 'those', 'things', 'that', 'everyone', 'should', 'be', 'required', 'to', 'have', 'before', 'they', 'leave', 'the', 'hospital.', 'We', 'went', 'through', 'all', 'the', 'pages', 'of', 'the', 'newborn', 'version,', 'then', 'moved', 'to', 'the', 'infant', 'version,', 'and', 'will', 'finish', 'up', 'the', 'second', 'infant', 'book', '(third', 'total)', 'right', 'as', 'our', 'baby', 'turns', '1.', 'See', 'other', 'things', 'that', 'are', 'must', 'haves', 'for', 'baby', 'at', '[...]']
<class 'list'>
[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]




<class 'list'>
['able', 'all', 'and', 'answer', 'are', 'as', 'at', 'baby', 'back', 'be', 'been', 'before', 'book', 'communicate', 'different', 'each', 'everyone', 'finish', 'for', 'go', 'has', 'have', 'haves', 'helpful', 'hospital', 'infant', 'is', 'it', 'leave', 'life', 'moved', 'must', 'newborn', 'night', 'of', 'one', 'or', 'other', 'our', 'pages', 'pediatrician', 'questions', 'required', 'right', 'saver', 'second', 'see', 'should', 'so', 'such', 'that', 'the', 'then', 'they', 'things', 'think', 'third', 'this', 'those', 'through', 'times', 'to', 'total', 'track', 'trends', 'turns', 'up', 'version', 'we', 'went', 'when', 'will', 'with', 'you']
[[1 1 1 1 2 1 2 2 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 1 1 1 1 2 1 3 1
  1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 6 1 1 2 1 1 1 1 1 1 5 1 1 1 1 2 2 1 1 1 1
  2 1]]

Assuming your sentiment values can be one of [+1, 0, -1], the code below should do it.

Data

I modified your sample data so that every sentiment value appears in it.

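data_string = """
-1  This book is such a life saver.  It has been so helpful to be able to go back to track trends, answer pediatrician questions, or communicate with each other when you are up at different times of the night with a newborn.  I think it is one of those things that everyone should be required to have before they leave the hospital.  We went through all the pages of the newborn version, then moved to the infant version, and will finish up the second infant book (third total) right as our baby turns 1.  See other things that are must haves for baby at [...]
+1  I bought this a few times for my older son and have bought it again for my newborn. This is super easy to use and helps me keep track of his daily routine. When he started going to the sitter when I went back to work, it helped me know how his day went to better prepare me for how the evening would most likely go. When he was sick, it help me keep track of how many diapers a day he was producing to make sure he was getting dehydrated. The note sections to the side and bottom are useful too because his sitter writes in small notes about whether or not he liked his lunch or if the playtime included going for a walk, etc.Excellent for moms who are wanting to keep track of their kids daily routine even though they are at work. Excellent for dads to keep track as my husband can quickly forget what time he fed our son. LOL
0  This is great for basics, but I wish the space to write things in was bigger. A lot times I need struggle trying to read what the caretaker wrote in because the spaces go together.
+1  This book is perfect!  I'm a first time new mom, and this book made it so easy to keep track of feedings, diaper changes, sleep.  Definitely would recommend this for new moms.  Plus it's small enough that I throw in the diaper back for doctor visits.
"""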
['-1  This book is such a life saver.  It has been so helpful to be able to go back to track trends, answer pediatrician questions, or communicate with each other when you are up at different times of the night with a newborn.  I think it is one of those things that everyone should be required to have before they leave the hospital.  We went through all the pages of the newborn version, then moved to the infant version, and will finish up the second infant book (third total) right as our baby turns 1.  See other things that are must haves for baby at [...]',
 '+1  I bought this a few times for my older son and have bought it again for my newborn. This is super easy to use and helps me keep track of his daily routine. When he started going to the sitter when I went back to work, it helped me know how his day went to better prepare me for how the evening would most likely go. When he was sick, it help me keep track of how many diapers a day he was producing to make sure he was getting dehydrated. The note sections to the side and bottom are useful too because his sitter writes in small notes about whether or not he liked his lunch or if the playtime included going for a walk, etc.Excellent for moms who are wanting to keep track of their kids daily routine even though they are at work. Excellent for dads to keep track as my husband can quickly forget what time he fed our son. LOL',
 '0  This is great for basics, but I wish the space to write things in was bigger. A lot times I need struggle trying to read what the caretaker wrote in because the spaces go together.',
 "+1  This book is perfect!  I'm a first time new mom, and this book made it so easy to keep track of feedings, diaper changes, sleep.  Definitely would recommend this for new moms.  Plus it's small enough that I throw in the diaper back for doctor visits."]
line:0 
     sentiment: -1 
     review: This book is such a life saver.  It has been so helpful to be able to go back to track trends, answer pediatrician questions, or communicate with each other when you are up at different times of the night with a newborn.  I think it is one of those things that everyone should be required to have before they leave the hospital.  We went through all the pages of the newborn version, then moved to the infant version, and will finish up the second infant book (third total) right as our baby turns 1.  See other things that are must haves for baby at [...]
line:1 
     sentiment: +1 
     review: I bought this a few times for my older son and have bought it again for my newborn. This is super easy to use and helps me keep track of his daily routine. When he started going to the sitter when I went back to work, it helped me know how his day went to better prepare me for how the evening would most likely go. When he was sick, it help me keep track of how many diapers a day he was producing to make sure he was getting dehydrated. The note sections to the side and bottom are useful too because his sitter writes in small notes about whether or not he liked his lunch or if the playtime included going for a walk, etc.Excellent for moms who are wanting to keep track of their kids daily routine even though they are at work. Excellent for dads to keep track as my husband can quickly forget what time he fed our son. LOL
line:2 
     sentiment: 0 
     review: This is great for basics, but I wish the space to write things in was bigger. A lot times I need struggle trying to read what the caretaker wrote in because the spaces go together.
line:3 
     sentiment: +1 
     review: This book is perfect!  I'm a first time new mom, and this book made it so easy to keep track of feedings, diaper changes, sleep.  Definitely would recommend this for new moms.  Plus it's small enough that I throw in the diaper back for doctor visits.
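
Output in the shape shown above (one sentiment and one whole review string per line) could be produced by something like the following. This is only a minimal sketch, not necessarily the code that generated that output: the function name parse_reviews and the shortened sample text are made up for illustration, and it assumes the sentiment is the first whitespace-delimited token on each line, which covers both the tab-separated file and data separated by spaces.

```python
# Shortened, illustrative sample in the "<sentiment><whitespace><review>" layout.
# The real file uses a tab; the split below accepts tabs or spaces.
data_string = """
-1\tThis book is such a life saver.
+1\tThis is super easy to use.
0\tThis is great for basics, but I wish the space was bigger.
"""


def parse_reviews(text):
    """Return parallel lists: sentiment labels and whole review strings."""
    sentiments, reviews = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # maxsplit=1 splits only at the first run of whitespace,
        # so the review itself stays one string instead of a word list.
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip lines that have no review text
        sentiment, review = parts
        sentiments.append(sentiment)
        reviews.append(review)
    return sentiments, reviews


sentiments, reviews = parse_reviews(data_string)
for i, (s, r) in enumerate(zip(sentiments, reviews)):
    print(f"line:{i} \n     sentiment: {s} \n     review: {r}")
```

Because each entry of reviews is a single string, the whole list can be handed to CountVectorizer.fit_transform, which takes an iterable of raw document strings and returns one row per document. Splitting a review into words first makes every word its own document, which would explain the many mostly-zero rows in the question's first result.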