Python: facing AttributeError: 'list' object has no attribute 'lower'


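I have posted my sample training data and test data along with my code. I am trying to train a model with the Naive Bayes algorithm.

However, each review ends up as a list. I think that is why my code raises the following error: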

return lambda x: strip_accents(x.lower())
AttributeError: 'list' object has no attribute 'lower'
Can any of you help me solve this? I am new to Python.

train.txt: test.txt: My code:

You need to iterate over each element in the list:

# my_list is a placeholder for your own list of strings
my_list = [item.lower() for item in my_list]

Note: this only applies when iterating over a list of strings (dtype=str).
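To make the note concrete, here is a minimal sketch of what goes wrong when each review is itself a list of tokens, and two ways around it. The `reviews` data is made up for illustration; the real contents of train.txt are not shown in the question.

```python
# Hypothetical reviews, already split into tokens (a list of lists).
reviews = [["Good", "Movie"], ["BAD", "Plot"]]

# Calling .lower() on a list reproduces the error from the question.
try:
    reviews[0].lower()
except AttributeError as exc:
    error_message = str(exc)

# Option 1: lowercase every token, keeping the nested structure.
lowered = [[token.lower() for token in review] for review in reviews]

# Option 2: join each token list back into one string, so that every
# element of the outer list is a str again.
joined = [" ".join(review) for review in reviews]

print(error_message)
print(lowered)
print(joined)
```

Option 2 is probably the one you want if the reviews are going to a text vectorizer, since that expects one string per document.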

I made a few changes to your code. The version posted below works; I added comments on how to debug the one you posted above.

# These three are not used, so do not import them
# from sklearn.preprocessing import MultiLabelBinarizer 
# from sklearn.model_selection import train_test_split 
# from sklearn.metrics import confusion_matrix

# This performs the classification task that you want with your input data in the format provided
from sklearn.naive_bayes import MultinomialNB 

from sklearn.feature_extraction.text import CountVectorizer

def load_data(filename):
    """ This function works, but you have to modify the second-to-last line from
    reviews.append(line[0].split()) to reviews.append(line[0]).
    CountVectorizer will perform the splits by itself as it sees fit, trust it :)"""
    reviews = list()
    labels = list()
    with open(filename) as file:
        file.readline()
        for line in file:
            line = line.strip().split(',')
            labels.append(line[1])
            reviews.append(line[0])

    return reviews, labels

X_train, y_train = load_data('train.txt')
X_test, y_test = load_data('test.txt')

vec = CountVectorizer() 
# Notice: clf means classifier, not vectorizer. 
# While it is syntactically correct, it's bad practice to give misleading names to your objects. 
# Replace "clf" with "vec" or something similar.

# Important! you called only the fit method, but did not transform the data 
# afterwards. The fit method does not return the transformed data by itself. You 
# either have to call .fit() and then .transform() on your training data, or just fit_transform() once.

X_train_transformed =  vec.fit_transform(X_train) 

X_test_transformed = vec.transform(X_test)

clf = MultinomialNB()
clf.fit(X_train_transformed, y_train)

score = clf.score(X_test_transformed, y_test)
print("score of Naive Bayes algo is :" , score)
The output of this code is:

score of Naive Bayes algo is : 0.5
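In case the fit / transform / fit_transform distinction from the code comments is still unclear, here is a rough pure-Python sketch of the idea. TinyCountVectorizer is a hypothetical, heavily simplified stand-in written only for this illustration; it is not how scikit-learn implements CountVectorizer, and the tokenizing rule (lowercase, tokens of two or more word characters) is an assumption chosen to resemble its defaults.

```python
import re


class TinyCountVectorizer:
    """Hypothetical, simplified bag-of-words vectorizer (illustration only)."""

    def fit(self, documents):
        # Learn the vocabulary from the training documents only.
        tokens = sorted({tok for doc in documents for tok in self._tokenize(doc)})
        self.vocabulary_ = {tok: idx for idx, tok in enumerate(tokens)}
        return self  # returns the vectorizer itself, not the transformed data

    def transform(self, documents):
        # Count occurrences of each known word; unseen words are ignored.
        rows = []
        for doc in documents:
            row = [0] * len(self.vocabulary_)
            for tok in self._tokenize(doc):
                idx = self.vocabulary_.get(tok)
                if idx is not None:
                    row[idx] += 1
            rows.append(row)
        return rows

    def fit_transform(self, documents):
        # Shortcut for fit() followed by transform() on the same data.
        return self.fit(documents).transform(documents)

    @staticmethod
    def _tokenize(doc):
        return re.findall(r"\w\w+", doc.lower())


train = ["Good movie", "Bad movie"]
test = ["Good good plot"]

vec = TinyCountVectorizer()
X_train = vec.fit_transform(train)  # learn the vocabulary and count
X_test = vec.transform(test)        # reuse the training vocabulary

print(vec.vocabulary_)
print(X_train)
print(X_test)
```

Note that the test data is only transformed, never fitted, so both matrices share the same columns. That is why "plot", which never appears in the training data, is simply dropped.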

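lower() is a str method, not an attribute of a list. Converting the list to a numpy array does not add a .lower() method either, but numpy.char.lower() lowercases every string in an array of strings.

Where in the code is return lambda x: strip_accents(x.lower())?

This is a simple piece of code; you can manage it yourself.

Can you edit my code here? I am new to Python. The reviews contain a list of lists.

Hi @Daniel R. How do we calculate precision and recall?

Mhmh, data in this format seems to allow computing the confusion matrix from sklearn.metrics, but not precision or recall. I will look into it, but for now you can print the confusion matrix and compute them from it by hand, by adding: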
y_pred = clf.predict(X_test_transformed)
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, y_pred))
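For the "compute them by hand" part, a minimal sketch could look like this. It assumes the layout used by sklearn.metrics.confusion_matrix, where rows are the true classes and columns are the predicted classes; the helper name and the counts in `matrix` are made up for illustration.

```python
def precision_recall_from_matrix(matrix, class_index):
    """matrix[i][j] = number of samples of true class i predicted as class j."""
    true_positive = matrix[class_index][class_index]
    # Column sum: everything that was predicted as this class.
    predicted_as_class = sum(row[class_index] for row in matrix)
    # Row sum: everything that really belongs to this class.
    actually_class = sum(matrix[class_index])
    precision = true_positive / predicted_as_class if predicted_as_class else 0.0
    recall = true_positive / actually_class if actually_class else 0.0
    return precision, recall


# Hypothetical counts for the sorted labels ['negative', 'positive'].
matrix = [[3, 1],
          [2, 4]]

precision, recall = precision_recall_from_matrix(matrix, class_index=1)
print("precision:", precision)
print("recall:", recall)
```

Here class_index=1 selects 'positive' as the class of interest, because the labels are assumed to be in sorted order.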
Edit: I got it working. The parameter pos_label='positive' must be passed to the precision_score function as follows:
y_pred = clf.predict(X_test_transformed)
from sklearn.metrics import precision_score
print(precision_score(y_test, y_pred, pos_label='positive'))
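@Daniel R, I tried the edited code above, but I got an error:

"choose another average setting." % y_type) ValueError: Target is multiclass but average='binary'. Please choose another average setting.

So I replaced the argument pos_label='positive' with average='micro' in both precision_score and recall_score:

print("Precision score:", precision_score(y_test, y_pred, average='micro'))
print("Recall score:", recall_score(y_test, y_pred, average='micro'))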
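One thing worth knowing about average='micro': when every sample has exactly one label, micro-averaged precision and recall both come out equal to plain accuracy, so they may tell you less than you expect. The sketch below works this through in pure Python; the helper and the labels (including the third label 'neutral') are hypothetical and only stand in for whatever extra label made the target multiclass.

```python
def micro_precision_recall(y_true, y_pred):
    """Pool true/false positives and false negatives over all labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1  # predicted this label, but it was another one
            elif truth == label:
                fn += 1  # missed a sample of this label
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels with three classes.
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "neutral", "neutral", "negative"]

precision, recall = micro_precision_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(precision, recall, accuracy)
```

If per-class numbers are what you are after, average='macro' or average=None in precision_score and recall_score may be closer to what you want. It is also worth checking why the target has more than two labels in the first place.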