Python pd.DataFrame problem: index 13 gives an error?
As you can see below, my proteinID data frame has 4292 members. When I try to print them, I get an error at index 13, and I don't understand why. Any idea what is going on?
print proteinID.shape
print X_final.shape
for i,prot in enumerate(X_final):
    print i
    print prot
    print proteinID[i]
This gives me:
(4292L,)
(4292L, 4L)
0
[ 0.01070217 0.86624627 0.30031799 1.0022054 ]
Q9BV57
1
[ 0.14132098 0.5899623 -0.08037944 0.04028686]
Q04446
2
[ 0.14768145 0.37698604 -0.08798323 -0.71181829]
P61604
3
[ 0.23194252 -0.17301326 -0.20914528 0.27447231]
Q15029
4
[ 0.13608163 0.41788998 0.06103427 -0.1557695 ]
Q9NRX4
5
[ 0.11981057 0.62419406 0.085566 0.43029529]
P31946
6
[ 0.14734698 0.53942167 0.1647835 0.20525244]
P62258
7
[ 0.13301821 0.25249911 0.32216093 0.46965642]
Q04917
8
[ 0.30891193 0.35936887 0.14029331 0.22116058]
P61981
9
[ 0.15670011 -0.0317209 0.48168144 0.58226224]
P31947;REV__Q13315
10
[ 0.059664 0.52769527 0.09302036 0.28445371]
P27348
11
[ 0.22201161 0.703846 0.19846719 0.53470435]
P63104
12
[ 0.53312759 0.48972197 -0.15224852 0.16086491]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-54-45a793f9a457> in <module>()
4 print i
5 print prot
----> 6 print proteinID[i]
C:\Anaconda\lib\site-packages\pandas\core\series.pyc in __getitem__(self, key)
507 def __getitem__(self, key):
508 try:
--> 509 result = self.index.get_value(self, key)
510
511 if not np.isscalar(result):
C:\Anaconda\lib\site-packages\pandas\core\index.pyc in get_value(self, series,
key)
1415
1416 try:
-> 1417 return self._engine.get_value(s, k)
1418 except KeyError as e1:
1419 if len(self) > 0 and self.inferred_type in
['integer','boolean']:
pandas\index.pyx in pandas.index.IndexEngine.get_value (pandas\index.c:3109)()
pandas\index.pyx in pandas.index.IndexEngine.get_value (pandas\index.c:2840)()
pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:3700)()
pandas\hashtable.pyx in pandas.hashtable.Int64HashTable.get_item
(pandas\hashtable.c:7229)()
pandas\hashtable.pyx in pandas.hashtable.Int64HashTable.get_item
(pandas\hashtable.c:7167)()
KeyError: 12L
I noticed that after removing the NaN values with the following commands:
#instead of imputing, we remove rows with nan values
valid_mask = [np.all(~np.isnan(x)) for x in data.values]
print data[valid_mask].shape
X_imputed = data[valid_mask].values
proteinID = proteinID[valid_mask]
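the index is preserved, so in this case the missing index label belonged to a row that had NaN values and was dropped.

The error is clear: you don't have a column 12. Please post the raw input data and code that reproduces the error.

Thanks for the response, but we are talking about rows here, not columns. I will update the post with more information!

No, you don't want proteinID[i]; that tries to access a column, and there is only one column, so..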
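The index-preservation behaviour described in the question can be reproduced on a small scale. The following is a minimal sketch, not the asker's actual data: the four-row `data` frame, its column names, and the `filtered` and `renumbered` variables are made up for illustration, and the syntax assumes Python 3 with a reasonably recent pandas. It shows that a boolean mask keeps the original labels, so a label-based lookup such as `filtered[2]` fails, and it shows two possible ways around that: positional access with `.iloc`, or renumbering with `reset_index(drop=True)`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: row 2 contains a NaN and will be dropped.
data = pd.DataFrame({"a": [0.1, 0.2, np.nan, 0.4],
                     "b": [1.0, 2.0, 3.0, 4.0]})
proteinID = pd.Series(["Q9BV57", "Q04446", "P61604", "Q15029"])

# True for every row that has no NaN in any column.
valid_mask = data.notnull().all(axis=1)

X_final = data[valid_mask].values   # plain NumPy array, positions 0..2
filtered = proteinID[valid_mask]    # Series that keeps the labels 0, 1, 3

print(list(filtered.index))         # label 2 is missing after the filter

# Option 1: look up by position instead of by label.
for i, prot in enumerate(X_final):
    print(i, prot, filtered.iloc[i])

# Option 2: renumber the labels so they run 0..n-1 again.
renumbered = filtered.reset_index(drop=True)
print(renumbered[2])
```

Either option keeps the loop counter and the protein IDs aligned. Taking `proteinID[valid_mask].values`, as the question already does for `X_imputed`, would be a third possibility, since a NumPy array is always indexed by position.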