Python pandas error: "Can only use .str accessor with string values"
I have the following input file:
"Name",97.7,0A,0A,65M,0A,100M,5M,75M,100M,90M,90M,99M,90M,0#,0N#,
I am reading it with the following script:
#!/usr/bin/env python
import pandas as pd
import sys
import numpy as np
filename = sys.argv[1]
df = pd.read_csv(filename,header=None)
for col in df.columns[2:]:
    df[col] = df[col].str.extract(r'(\d+\.*\d*)').astype(np.float)
print df
However, I get this error:
    df[col] = df[col].str.extract(r'(\d+\.*\d*)').astype(np.float)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 2241, in __getattr__
    return object.__getattribute__(self, name)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/base.py", line 188, in __get__
    return self.construct_accessor(instance)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/base.py", line 528, in _make_str_accessor
    raise AttributeError("Can only use .str accessor with string "
AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas
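This worked in pandas 0.14 but does not work in pandas 0.17.0.
This happens because your last column is empty (the row ends with a trailing comma), so it is read as NaN with a float dtype rather than as strings: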
In [417]:
t="""'Name',97.7,0A,0A,65M,0A,100M,5M,75M,100M,90M,90M,99M,90M,0#,0N#,"""
df = pd.read_csv(io.StringIO(t), header=None)
df
Out[417]:
        0     1   2   3    4   5     6   7    8     9   10   11   12   13  14  \
0  'Name'  97.7  0A  0A  65M  0A  100M  5M  75M  100M  90M  90M  99M  90M  0#

    15  16
0  0N# NaN
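As a quick check of the explanation above, the following sketch reads the same row from an in-memory string and inspects the trailing column. It assumes Python 3 with a recent pandas; the variable names (text, last_col, str_accessor_failed) are illustrative and do not come from the original post.

```python
import io

import pandas as pd

# Same single-row input as in the question, including the trailing comma.
text = '"Name",97.7,0A,0A,65M,0A,100M,5M,75M,100M,90M,90M,99M,90M,0#,0N#,\n'
df = pd.read_csv(io.StringIO(text), header=None)

# The trailing comma produces an extra, entirely empty column.
last_col = df[df.columns[-1]]
last_dtype = last_col.dtype  # a float dtype, because the column holds only NaN

# Touching .str on a non-string column raises AttributeError.
try:
    last_col.str
    str_accessor_failed = False
except AttributeError:
    str_accessor_failed = True

print(df.shape, last_dtype, str_accessor_failed)
```

If the last column really is the culprit, the dtype printed here should be a float type and the flag should be True.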
If you slice the column range so that it stops before the last column, it works:
In [421]:
for col in df.columns[2:-1]:
    df[col] = df[col].str.extract(r'(\d+\.*\d*)').astype(np.float)
df
Out[421]:
        0     1  2  3   4  5    6  7   8    9  10  11  12  13  14  15   16
0  'Name'  97.7  0  0  65  0  100  5  75  100  90  90  99  90   0   0  NaN
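Alternatively, you can select only the columns of object dtype and run the code on those, skipping the first such column because it is the "Name" entry.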
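The code for this second approach is not shown on this page, so what follows is only one way it might look. The sketch assumes Python 3 with a recent pandas. It finds string-valued columns with pd.api.types.is_string_dtype (my choice, not necessarily what the answer used), passes expand=False so that extract returns a Series, and casts with the built-in float because newer NumPy releases no longer provide np.float.

```python
import io

import pandas as pd

text = '"Name",97.7,0A,0A,65M,0A,100M,5M,75M,100M,90M,90M,99M,90M,0#,0N#,\n'
df = pd.read_csv(io.StringIO(text), header=None)

# Columns whose values are strings; the numeric column 1 and the
# empty trailing column are left out automatically.
string_cols = [c for c in df.columns if pd.api.types.is_string_dtype(df[c])]

# Skip the first string column, which holds the "Name" entry.
string_cols = string_cols[1:]

for col in string_cols:
    # Pull out the leading number and convert it to float.
    df[col] = df[col].str.extract(r"(\d+\.*\d*)", expand=False).astype(float)

print(df)
```

Filtering by dtype avoids hard-coding column positions, which can help if the file gains or loses trailing empty columns.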
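I ran into this error while working in Eclipse. It turned out that the project interpreter had somehow been reset to Python 2.7 (after an update, I believe). Setting it back to Python 3.6 fixed the problem. The whole episode caused several crashes, restarts, and warnings, but after a few minutes of trouble it now seems to be resolved.
I know this is not a solution to the question asked here, but I thought it might be useful to others, since I landed on this page after searching for this error.
In this case, we have to use the str.replace() method on the Series, but first we have to convert it to str type: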
# df1.Patient holds values such as 's125', 's45', 's588', 's244', 's125', 's123'
df1 = pd.read_csv("C:\\Users\\Gangwar\\Desktop\\competitions\\cancer prediction\\kaggle_to_students.csv")
df1.Patient = df1.Patient.astype(str)
df1['Patient'] = df1['Patient'].str.replace('s','').astype(int)
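For readers without that CSV file, here is a minimal self-contained sketch of the same idea. The DataFrame is a hypothetical stand-in built from the sample values listed above, and regex=False is added so that "s" is treated as a literal character; both are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the Patient column of the CSV used above.
df1 = pd.DataFrame({"Patient": ["s125", "s45", "s588", "s244", "s125", "s123"]})

# Cast to str first so the .str accessor is available, strip the
# literal "s", then convert what remains to integers.
df1["Patient"] = (
    df1["Patient"].astype(str).str.replace("s", "", regex=False).astype(int)
)

print(df1["Patient"].tolist())
```

Note that astype(str) turns missing values into the text "nan", so the final integer conversion would fail if the column contained any NaN.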
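Thanks. Is this new in 0.17? Because it gives a warning:
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:2: FutureWarning: currently extract(expand=None) means expand=False (return Index/Series/DataFrame) but in a future version of pandas this will be changed to expand=True (return DataFrame)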
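The warning concerns the expand parameter of str.extract. Passing it explicitly makes the return type unambiguous, as the short sketch below shows. It assumes a recent pandas; the sample Series and variable names are made up for illustration.

```python
import pandas as pd

s = pd.Series(["65M", "0A", "0N#"])

# expand=False with a single capture group returns a Series.
as_series = s.str.extract(r"(\d+)", expand=False)

# expand=True returns a DataFrame with one column per capture group.
as_frame = s.str.extract(r"(\d+)", expand=True)

print(type(as_series).__name__, type(as_frame).__name__)
```

When the result is assigned back to a single column, as in the loop from the question, expand=False is usually the form you want.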