Python 2.7: invalid literal for float() in k-nearest neighbors


I am having a really hard time figuring out why I get this error. I have searched a lot but could not find any solution.

import numpy as np
import warnings
from collections import Counter
import pandas as pd

def k_nearest_neighbors(data, predict, k=3):
    if len(data) >= k:
        warnings.warn('K is set to a value less than total voting groups!')
    distances = []
    for group in data:
        for features in data[group]:
            euclidean_distance = np.linalg.norm(np.array(features) - np.array(predict))
            distances.append([euclidean_distance, group])
    votes = [i[1] for i in sorted(distances)[:k]]
    vote_result = Counter(votes).most_common(1)[0][0]
    return vote_result

df = pd.read_csv("data.txt")
df.replace('?',-99999, inplace=True)
df.drop(['id'], 1, inplace=True)
full_data = df.astype(float).values.tolist()

print(full_data)
After running it, this is the error:

Traceback (most recent call last):
  File "E:\Jazab\Machine Learning\Lec18(Testing K Neatest Nerighbors Classifier)\Lec18(Testing K Neatest Nerighbors Classifier)\Lec18_Testing_K_Neatest_Nerighbors_Classifier_.py", line 25, in <module>
    full_data = df.astype(float).values.tolist()
  File "C:\Python27\lib\site-packages\pandas\util\_decorators.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 3299, in astype
    **kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 3224, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 3091, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 471, in astype
    **kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 521, in _astype
    values = astype_nansafe(values.ravel(), dtype, copy=True)
  File "C:\Python27\lib\site-packages\pandas\core\dtypes\cast.py", line 636, in astype_nansafe
    return arr.astype(dtype)
ValueError: invalid literal for float(): 3) <-----Reappears in Group 8 as:
Press any key to continue . . .
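
The failure does not depend on the k-nearest-neighbors function; a single non-numeric cell is enough to trigger it. A minimal sketch of a reproduction, assuming a made-up frame (the column names A and B and the flag `raised` are illustrative and not taken from data.txt):

```python
import pandas as pd

# Hypothetical frame: column B holds one string that is not a number.
df = pd.DataFrame({'A': [1, 2, 7], 'B': ['3)', 4, 5]})

try:
    # astype(float) has to convert every cell, so one bad cell fails the call.
    df.astype(float)
    raised = False
except ValueError as exc:
    raised = True
    print('conversion failed: %s' % exc)
```

The exact wording of the message differs between Python and pandas versions, but the exception type is ValueError in each case.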

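It looks like your CSV file contains 3) as an entry, and pandas is complaining because it cannot convert that to float.

The problem is bad data (3)), so you need to_numeric with apply to process all columns.

Non-numeric values are converted to NaNs, which are then replaced by fillna with some scalar, e.g. 0: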

full_data = df.apply(pd.to_numeric, errors='coerce').fillna(0).values.tolist()
Sample:

df = pd.DataFrame({'A':[1,2,7], 'B':['3)',4,5]})
print (df)
   A   B
0  1  3)
1  2   4
2  7   5

full_data = df.apply(pd.to_numeric, errors='coerce').fillna(0).values.tolist()
print (full_data)
[[1.0, 0.0], [2.0, 4.0], [7.0, 5.0]]
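
What the coerce-then-fill step does to each cell can be sketched in plain Python. This is only an approximation of the pandas behavior under simple inputs, not its implementation; the helper name `to_float_or` and the list `rows` are hypothetical:

```python
def to_float_or(value, default=0.0):
    """Return float(value), or `default` when value cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # Unparseable cells such as '3)' fall back to the default.
        return default


# Same values as the sample frame, written row by row.
rows = [[1, '3)'], [2, 4], [7, 5]]
cleaned = [[to_float_or(cell) for cell in row] for row in rows]
print(cleaned)
# prints [[1.0, 0.0], [2.0, 4.0], [7.0, 5.0]]
```

Replacing bad cells with 0 keeps the rows but also feeds an invented value into the distance calculation, so a sentinel or dropping the row may suit some datasets better.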

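My CSV has patient records.

Could you show a sample of the file in your question? Anyway, I think jezrael's answer is what you need if you have no control over the source data.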
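
Before overwriting bad entries, it may help to see which cells fail to parse. A minimal sketch, assuming the same made-up frame as the sample above (`coerced`, `bad_mask` and `bad_cells` are illustrative names):

```python
import pandas as pd

# Hypothetical frame standing in for the real CSV contents.
df = pd.DataFrame({'A': [1, 2, 7], 'B': ['3)', 4, 5]})

# Coerce every column; cells that cannot be parsed become NaN.
coerced = df.apply(pd.to_numeric, errors='coerce')

# A cell is bad if it had a value originally but is NaN after coercion.
bad_mask = coerced.isnull() & df.notnull()

# Collect (row label, column name, original value) for each bad cell.
bad_cells = [(row, col, df.at[row, col])
             for col in df.columns
             for row in df.index[bad_mask[col].values]]
print(bad_cells)
```

For the frame above this reports the single entry in row 0 of column B, which can then be fixed at the source or handled explicitly.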