Python: facing AttributeError: 'list' object has no attribute 'lower'
Tags: python, scikit-learn

I have posted my sample train data and test data along with my code. I am trying to use the Naive Bayes algorithm to train the model.
However, each review ends up as a list, so I think that is why my code raises the following error:
return lambda x: strip_accents(x.lower())
AttributeError: 'list' object has no attribute 'lower'
Can anyone help me solve this? I am new to Python.
train.txt:
test.txt:
My code:
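You need to iterate over each element in the list and lowercase it individually:
items = ["Some Text", "More TEXT"]  # example list of strings
items = [item.lower() for item in items]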
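If each review has already been split into a list of words (the error in the question suggests that CountVectorizer received lists instead of strings), a single loop is not enough. Either lowercase the words with a nested loop, or join the words back into one string per review. A minimal sketch with made-up data; the names `tokenized_reviews`, `lowered` and `reviews` are illustrative and do not come from the original post:

```python
# Hypothetical data: each review was tokenized with str.split(),
# so every element is a list of words rather than a single string.
tokenized_reviews = [["Great", "Movie"], ["Terrible", "Plot", "Twist"]]

# Calling .lower() on a whole review fails because a list has no such method.
try:
    tokenized_reviews[0].lower()
except AttributeError as exc:
    error_message = str(exc)

# Option 1: lowercase every word inside every review.
lowered = [[word.lower() for word in review] for review in tokenized_reviews]

# Option 2: join the words back into one string per review, which is the
# shape a text vectorizer expects (one string per document).
reviews = [" ".join(review) for review in tokenized_reviews]

print(error_message)
print(lowered)
print(reviews)
```

Option 2 is usually the simpler route when the strings are going to be passed to a vectorizer afterwards, since the vectorizer does its own lowercasing and splitting.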
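Note: this only works when you are iterating over a list of strings (dtype=str).

I made some changes to your code. The version posted below works; I added comments on how to debug the one you posted above.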
# These three are not used, so do not import them:
# from sklearn.preprocessing import MultiLabelBinarizer
# from sklearn.model_selection import train_test_split
# from sklearn.metrics import confusion_matrix

# This performs the classification task that you want with your input data in the format provided
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer


def load_data(filename):
    """This function works, but you have to modify the second-to-last line from
    reviews.append(line[0].split()) to reviews.append(line[0]).
    CountVectorizer will perform the splits by itself as it sees fit, trust it :)"""
    reviews = list()
    labels = list()
    with open(filename) as file:
        file.readline()  # skip the header line
        for line in file:
            line = line.strip().split(',')
            labels.append(line[1])
            reviews.append(line[0])
    return reviews, labels


X_train, y_train = load_data('train.txt')
X_test, y_test = load_data('test.txt')

vec = CountVectorizer()
# Notice: clf means classifier, not vectorizer.
# While it is syntactically correct, it's bad practice to give misleading names to your objects.
# Replace "clf" with "vec" or something similar.

# Important! You called only the fit method, but did not transform the data
# afterwards. The fit method does not return the transformed data by itself. You
# either have to call .fit() and then .transform() on your training data, or just fit_transform() once.
X_train_transformed = vec.fit_transform(X_train)
X_test_transformed = vec.transform(X_test)

clf = MultinomialNB()
clf.fit(X_train_transformed, y_train)
score = clf.score(X_test_transformed, y_test)
print("score of Naive Bayes algo is :", score)
The output of this code is:
score of Naive Bayes algo is : 0.5
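One caveat about load_data: split(',') cuts a review at its first comma, so a review that contains a comma is truncated and its label is read from the wrong field. Below is a hedged sketch of a variant that splits only on the last comma. It assumes every data line has the form `review,label` after a single header line. The contents of train.txt are not shown in the thread, so the sample text and the name `load_data_from` are invented for illustration:

```python
import io


def load_data_from(file):
    """Read `review,label` lines from an open text file, skipping the header.

    Assumes the label is the last comma-separated field, so commas inside
    the review text are preserved.
    """
    reviews, labels = [], []
    file.readline()  # skip the header line
    for line in file:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        review, label = line.rsplit(",", 1)  # split on the last comma only
        reviews.append(review)
        labels.append(label)
    return reviews, labels


# Invented sample standing in for train.txt.
sample = io.StringIO(
    "review,label\n"
    "Good acting, weak ending,neutral\n"
    "Loved it,positive\n"
)
reviews, labels = load_data_from(sample)
print(reviews)
print(labels)
```

With a real file this would be called as `load_data_from(open('train.txt'))` inside a `with` block. If the files use quoting, the standard `csv` module is likely the safer choice.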
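Comments:

lower() is not an attribute of a list. Try converting it to a numpy array of strings; numpy.char.lower() should then work.

Where in your code is return lambda x: strip_accents(x.lower())?

This is a simple piece of code. You can manage it yourself.

Can you edit my code here? I am new to Python. The reviews contain a list of lists.

Hi @Daniel R. How do we calculate precision and recall?

Mhmh, data in this format seems to allow computing the confusion matrix from sklearn.metrics, but not precision and recall. I will look into it, but for now you can print the confusion matrix and compute them manually from it, by adding:
y_pred = clf.predict(X_test_transformed)
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, y_pred))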
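The manual calculation mentioned in that comment can be sketched in plain Python. The labels below are invented stand-ins for y_test and the predictions. The matrix is laid out with true labels as rows and predicted labels as columns, the same layout sklearn.metrics.confusion_matrix uses. Precision for a class is its diagonal entry divided by its column sum, and recall is the diagonal entry divided by its row sum:

```python
from collections import Counter

# Invented labels for illustration; in the thread these would be
# y_test and the output of clf.predict(X_test_transformed).
y_true = ["positive", "negative", "positive", "neutral", "negative", "positive"]
y_pred = ["positive", "positive", "positive", "neutral", "negative", "negative"]

labels = sorted(set(y_true) | set(y_pred))
pair_counts = Counter(zip(y_true, y_pred))

# Rows are true labels, columns are predicted labels.
matrix = [[pair_counts[(t, p)] for p in labels] for t in labels]

scores = {}
for i, label in enumerate(labels):
    true_positives = matrix[i][i]
    predicted_as_label = sum(row[i] for row in matrix)  # column sum
    actually_label = sum(matrix[i])                     # row sum
    precision = true_positives / predicted_as_label if predicted_as_label else 0.0
    recall = true_positives / actually_label if actually_label else 0.0
    scores[label] = (precision, recall)
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```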
Edit: I got it working. The parameter pos_label='positive' must be passed to the precision_score function as follows:
y_pred = clf.predict(X_test_transformed)
from sklearn.metrics import precision_score
print(precision_score(y_test, y_pred, pos_label='positive'))
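@Daniel R, I tried the edited code above, but I got an error:
"choose another average setting." % y_type)
ValueError: Target is multiclass but average='binary'. Please choose another average setting.
So I replaced the argument pos_label='positive' with average='micro' in both precision_score and recall_score:
print("Precision score:", precision_score(y_test, y_pred, average='micro'))
print("Recall score:", recall_score(y_test, y_pred, average='micro'))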
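A note on what average='micro' reports: it pools true positives, false positives and false negatives across all classes. When every sample has exactly one label, each wrong prediction counts as one false positive (for the predicted class) and one false negative (for the true class), so micro precision and micro recall both equal plain accuracy. The sketch below shows this in plain Python with invented labels. It also computes macro precision, the unweighted mean of the per-class precisions, which is what average='macro' corresponds to and may be more informative when per-class behaviour matters:

```python
# Invented labels for illustration only.
y_true = ["positive", "negative", "positive", "neutral", "negative", "positive"]
y_pred = ["positive", "positive", "positive", "neutral", "negative", "negative"]
labels = sorted(set(y_true) | set(y_pred))

# Per-class counts of true positives, false positives and false negatives.
tp = {c: 0 for c in labels}
fp = {c: 0 for c in labels}
fn = {c: 0 for c in labels}
for t, p in zip(y_true, y_pred):
    if t == p:
        tp[t] += 1
    else:
        fp[p] += 1  # predicted class gains a false positive
        fn[t] += 1  # true class gains a false negative

# Micro averaging: pool the counts over all classes first.
micro_precision = sum(tp.values()) / (sum(tp.values()) + sum(fp.values()))
micro_recall = sum(tp.values()) / (sum(tp.values()) + sum(fn.values()))
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Macro averaging: compute precision per class, then take the plain mean.
macro_precision = sum(
    tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0 for c in labels
) / len(labels)

print("micro precision:", micro_precision)
print("micro recall:", micro_recall)
print("accuracy:", accuracy)
print("macro precision:", macro_precision)
```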