
'float' TypeError in Python, pandas

I get the following error when iterating over a column of a DataFrame that holds unicode string data (dtype object):

in text_pre_processing(text)  
2 # removing punctuation  
3 #text = text1(r'\n',' ', regex=True)  
----> 4 text1 = [char for char in text if char not in string.punctuation]  
5 text1 = ''.join(text1)  


**TypeError: 'float' object is not iterable**

The function I am using:

import string
from nltk.corpus import stopwords

def text_pre_processing(text):
    # Remove punctuation; str() guards against non-string values such as floats
    #text1 = text1(r'\n',' ', regex=True)
    text1 = [char for char in str(text) if char not in string.punctuation]
    text1 = ''.join(text1)

    # Remove all stop words from the corpus
    #return text.split()
    return [word for word in text1.split() if word not in stopwords.words('english')]

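I tried to check whether the column I feed into the function contains any float values (rows that consist of nothing but a float), but could not, because pandas reports alphanumeric and alphabetic values alike as dtype `object`, and explicit type casting did not work.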
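
One way to look past the `object` dtype is to inspect the Python type of each element. The sketch below is only an illustration: the sample frame is made up, and the column name `Column2` is borrowed from the data shown in the question. It relies on the fact that pandas stores a missing value in an object column as the float `NaN`, which is a common source of this error.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "Column2" follows the column name in the question.
df = pd.DataFrame(
    {"Column2": ["this is a good movie...#", np.nan, "this is a bad movie $...."]},
    dtype=object,
)

# The column dtype is object even though one element is a float (NaN).
print(df["Column2"].dtype)

# Count how many elements there are of each Python type.
print(df["Column2"].map(type).value_counts())

# Select the rows whose value is not a str.
non_str_rows = df[~df["Column2"].map(lambda value: isinstance(value, str))]
print(non_str_rows)
```

If `non_str_rows` is not empty, those are the rows that make a character-by-character loop fail.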

Does anyone know what is going wrong?

I am using this function as part of the analyzer for a naive Bayes algorithm.

Data (column 1 is the index):

Column2

this is a good movie...#    

this is a bad movie $....     

this #movie was good ;) but some scenes were exaggerating    

Expected output:

[this, good, movie]    
[this, bad, movie ]    
[this, movie, good, some, scenes, were, exaggerating]    

You need to convert the float to a string:

>>> str(3.14159)
'3.14159'

You can wrap `text` in `str()`:

[char for char in str(text) if char not in string.punctuation]
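
Note that `str(float('nan'))` is the string `'nan'`, so wrapping in `str()` turns a missing value into the token `nan`. An alternative, sketched below, is to return an empty token list for anything that is not a string. This is not the answer's method, just a variation on it; `STOP_WORDS` is a tiny hypothetical stand-in for `stopwords.words('english')` so that the example runs without NLTK data.

```python
import string

# Hypothetical, deliberately tiny stop-word list (stand-in for NLTK's English list).
STOP_WORDS = {"is", "a", "was", "but"}

def text_pre_processing(text):
    # Missing values arrive as float NaN; treat any non-string as empty text.
    if not isinstance(text, str):
        return []
    # Drop punctuation characters, then split on whitespace.
    no_punct = "".join(char for char in text if char not in string.punctuation)
    # Keep only the words that are not stop words.
    return [word for word in no_punct.split() if word not in STOP_WORDS]

print(text_pre_processing("this is a good movie...#"))
print(text_pre_processing(float("nan")))
```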
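
Comments:

Why are you iterating over the column? I smell an XY problem. Please show your data and the expected output. Performance-wise, iterating is the worst thing you can do with a DataFrame. I am 99% sure that `pd.Series.str.replace` is a better fit for your problem.

@hoefling I tried this, but it still does not work. I also tried explicitly casting the column to string: D1['column']=D1['column'].astype(str)

@cᴏʟᴅsᴘᴇᴇᴅ I have made some changes to the question; I hope it is clear now.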
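
The `pd.Series.str.replace` suggestion from the comments could look roughly like the sketch below. The sample rows are made up, the names `D1` and `column` follow the comment above, and replacing missing values with an empty string via `fillna("")` is an assumption about how they should be handled. Stop-word removal is left out to keep the example short.

```python
import re
import string

import pandas as pd

# Hypothetical sample data, including one missing value.
D1 = pd.DataFrame(
    {"column": ["this is a good movie...#", None, "this #movie was good ;)"]},
    dtype=object,
)

# A character class that matches any single ASCII punctuation character.
pattern = "[" + re.escape(string.punctuation) + "]"

# Fill missing values, strip punctuation for the whole column at once, then tokenize.
cleaned = D1["column"].fillna("").str.replace(pattern, "", regex=True)
tokens = cleaned.str.split()
print(tokens.tolist())
```

This avoids the Python-level loop over characters and never hands a float to the cleaning step.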