TypeError: 'float' object is not iterable (Python, pandas)

python string machine-learning scikit-learn

When iterating over a DataFrame column that holds Unicode string data (dtype object), I get the following error:

in text_pre_processing(text)  
2 # removing punctuation  
3 #text = text1(r'\n',' ', regex=True)  
----> 4 text1 = [char for char in text if char not in string.punctuation]  
5 text1 = ''.join(text1)  


**TypeError: 'float' object is not iterable**

The function used:

import string
from nltk.corpus import stopwords

def text_pre_processing(text):
    # Remove punctuation; str() guards against non-string values such as floats
    #text1 = text1(r'\n',' ', regex=True)
    text1 = [char for char in str(text) if char not in string.punctuation]
    text1 = ''.join(text1)

    # Remove English stop words from the remaining tokens
    #return text.split()
    return [word for word in text1.split() if word not in stopwords.words('english')]

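I tried to check whether the column I am passing to the function contains any float values (rows that hold only a float instead of a sentence), but could not, because pandas reports both alphanumeric and alphabetic values as dtype "object", and explicit type casting does not work.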
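One way to look for such rows is to inspect the Python type of each element instead of the column dtype. The following is a minimal sketch, assuming a hypothetical DataFrame `D1` with a text column named `column`; the sample values are made up for illustration. A missing value (NaN) in an object column is stored as a `float`, which is a likely source of this error:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN stands in for a missing review.
D1 = pd.DataFrame(
    {"column": ["this is a good movie...#", np.nan, "this is a bad movie $...."]},
    dtype=object,
)

# The column dtype is object even though one element is a float (NaN).
print(D1["column"].dtype)

# Count the Python type of every element in the column.
type_counts = D1["column"].map(type).value_counts()
print(type_counts)

# Select the rows whose value is not a str.
non_string_rows = D1[~D1["column"].map(lambda value: isinstance(value, str))]
print(non_string_rows)
```

If `non_string_rows` is not empty, those are the rows that would raise the error when iterated over character by character.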

Does anyone know what is going wrong?

I am using this function as the analyzer for a Naive Bayes algorithm.

Data: column 1 is the index

Column2

this is a good movie...#    

this is a bad movie $....     

this #movie was good ;) but some scenes were exaggerating

Expected output:

[this, good, movie]    
[this, bad, movie ]    
[this, movie, good, some, scenes, were, exaggerating]

You need to convert the float to a string:

>>> str(3.14159)
'3.14159'

You can wrap text in str():

[char for char in str(text) if char not in string.punctuation]
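One caveat: `str()` silences the error, but it turns a missing value (NaN) into the literal text `'nan'`, which then passes through as a token. Below is a hedged sketch of an alternative that returns an empty list for non-string input. The name `text_pre_processing_safe` and the small `STOP_WORDS` set are hypothetical stand-ins for illustration (the function in the question uses `stopwords.words('english')` from NLTK):

```python
import math
import string

# Hypothetical, tiny stand-in for nltk's stopwords.words('english').
STOP_WORDS = {"is", "a", "was", "but"}

def text_pre_processing_safe(text):
    """Strip punctuation and stop words; return [] for non-string input such as NaN."""
    if not isinstance(text, str):
        return []
    no_punct = "".join(char for char in text if char not in string.punctuation)
    return [word for word in no_punct.split() if word not in STOP_WORDS]

# The str() workaround turns NaN into the text 'nan'.
print(str(math.nan))
# The guarded version skips non-string input instead.
print(text_pre_processing_safe(math.nan))
print(text_pre_processing_safe("this is a good movie...#"))
```

Whether to skip such rows or drop them before processing depends on how missing reviews should be treated downstream.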

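Why are you iterating over the column? I smell an XY problem. Please show your data and the expected output. Performance-wise, iterating is the worst thing you can do with a DataFrame. I am 99% sure that pd.Series.str.replace is a better fit for your problem.

@hoefling I tried this approach, but it still does not work... I also tried explicitly casting the column to string: D1['column'] = D1['column'].astype(str)

@cᴏʟᴅsᴘᴇᴇᴅ I have edited the question a bit; I hope the problem is clear now.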
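For reference, the vectorized approach suggested in the comments could look roughly like the sketch below. It assumes a hypothetical Series named `reviews` with made-up sample rows, fills missing values with empty strings before cleaning, and leaves stop word removal out:

```python
import re
import string

import pandas as pd

# Hypothetical sample data modeled on the rows shown in the question.
reviews = pd.Series(
    ["this is a good movie...#", None, "this #movie was good ;)"],
    dtype=object,
)

# Character class that matches any single ASCII punctuation character.
punctuation_pattern = f"[{re.escape(string.punctuation)}]"

cleaned = (
    reviews.fillna("")                                  # missing values become empty strings
    .str.replace(punctuation_pattern, "", regex=True)   # drop punctuation in one vectorized call
    .str.split()                                        # split on whitespace into token lists
)
print(cleaned.tolist())
```

Filling missing values first means no float ever reaches the string operations, so the TypeError from the question should not occur on this path.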