Python: How do I remove rows from a DataFrame depending on whether the values they contain are of type float? (Python, pandas)

python pandas

I am working with a DataFrame. I know you can do something like this:

dataframe[dataframe["column_name"] :  some condition]

But what I want is:

 dataframe[type(dataframe["column_name"]) == float ]

For example, if we have the following dataset:

A    B    C    D
1    2    3    4
5    6         4
7    2    3    2
1    2    3    4

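Then I want to delete the second row, because the value under column C in row 2 is either missing or not a number (which means the value is missing).

But what I tried does not work. I get the following error. Can anyone help?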

Warning (from warnings module):
  File "/Users/oishikachaudhury/Desktop/NYU/Risk Econ/Week 6/Hourly/trial.py", line 1
    import matplotlib.pyplot as plt
DtypeWarning: Columns (9,15,20,27,33,34,35,36,38,39,60) have mixed types.Specify dtype option on import or set low_memory=False.
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: False

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/oishikachaudhury/Desktop/NYU/Risk Econ/Week 6/Hourly/trial.py", line 8, in <module>
    dewpoint = fileObj[type(fileObj["HourlyDewPointTemperature"]) == float]
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 2800, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2648, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: False
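
For what it's worth, the `KeyError: False` above seems to come from `type(...)` being applied to the whole column rather than to each value: the comparison yields a single `False`, which pandas then looks up as a column label. A minimal sketch of that behaviour, assuming a small made-up frame (the column name is taken from the traceback, the values are invented for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the asker's file; the values are made up.
fileObj = pd.DataFrame({"HourlyDewPointTemperature": [10.5, "M", 12.0]})

# type() sees the Series object itself, not the values inside it.
column_type = type(fileObj["HourlyDewPointTemperature"])
print(column_type is pd.Series)

# Comparing that class with float gives one plain bool, not a row mask.
key = column_type == float
print(key)

# fileObj[key] is therefore read as "the column labelled False".
try:
    fileObj[key]
    raised = False
except KeyError:
    raised = True
print(raised)
```

A row filter needs a boolean Series with one entry per row, which is what the answers below build.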

You probably need something like this:

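import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "B": [5, 2, 54, 3, 2],
    "C": [20, 16, np.nan, 3, 8],
    "D": [14, 3, 17, 2, 6]})

# Keep only the rows that contain no missing values
# (for this frame, df1.dropna() selects the same rows)
df1.loc[df1.isna().sum(axis=1) == 0]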

Output:

   B     C   D
0  5  20.0  14
1  2  16.0   3
3  3   3.0   2
4  2   8.0   6
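
The `DtypeWarning` in the question suggests that some columns hold a mix of numbers and strings, in which case a non-numeric entry would not show up as NaN. One possible approach, sketched here with made-up data (the `"M"` marker and the string `"16"` are assumptions for illustration, not values from the asker's file), is to coerce the column with `pd.to_numeric(..., errors="coerce")` and then drop the rows where that column is NaN:

```python
import numpy as np
import pandas as pd

# Made-up data: column "C" mixes numbers, a missing value and strings.
df = pd.DataFrame({
    "B": [5, 2, 54, 3, 2],
    "C": [20, "16", np.nan, "M", 8],
    "D": [14, 3, 17, 2, 6]})

# Values that cannot be parsed as numbers become NaN.
df["C"] = pd.to_numeric(df["C"], errors="coerce")

# Drop only the rows where "C" is missing; other columns are not checked.
cleaned = df.dropna(subset=["C"])
print(cleaned)
```

Passing `subset` restricts the check to the listed columns, so a missing value elsewhere would not cause a row to be dropped.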

Since the OP is trying to drop the rows that contain float values, rather than the columns, here is a solution:

import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c', 'd'],
                   'B': ['e', 'f', 1.2, 'g'],
                   'C': ["asdf", 3.2, "s", "d"]})

# Set up the list of rows to keep
keeprows = []

# Loop through each row in the DataFrame
for idx, row in df.iterrows():
    validcols = 0  # number of values in this row that are not floats
    for val in row:
        if not isinstance(val, float):
            validcols += 1  # count the value if it is not a float
    # Keep the row only if none of its values is a float
    if validcols == len(df.columns):
        keeprows.append(row)

# Each kept row is a Series; build a frame with one row per Series
filtered = pd.DataFrame(keeprows)
print(filtered)

This gives:

    A   B   C
0   a   e   asdf
3   d   g   d

Compared with the original DataFrame:

    A   B   C
0   a   e   asdf
1   b   f   3.2
2   c   1.2 s
3   d   g   d

Unfortunately, this is verbose and slow (because it loops over every row), and it can probably be improved.
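
One possible way to shorten the loop, offered as a sketch rather than a definitive answer, is to build a boolean frame that marks the float cells and keep the rows where no cell is marked. It uses the same example data as above and still calls `isinstance` once per cell, so it is more compact but not necessarily much faster:

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c', 'd'],
                   'B': ['e', 'f', 1.2, 'g'],
                   'C': ["asdf", 3.2, "s", "d"]})

# True wherever a cell holds a float, checked value by value per column.
is_float = df.apply(lambda col: col.map(lambda v: isinstance(v, float)))

# Keep the rows in which no cell is a float.
filtered = df[~is_float.any(axis=1)]
print(filtered)
```

To keep only the rows that do have a float in one particular column, the same idea can be applied to that column alone, for example `df[df['B'].map(lambda v: isinstance(v, float))]`.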

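Is this only about missing values, or could there be things other than numbers, such as letters?

If the value in the cell under a particular column is a float

Hello, I am trying to keep only the rows that have float types under certain columns.

But this is dropping column C. I don't want to drop column C, I want to drop row 3.

@user1234, yes, sorry, I must have misread. Can you check the code now?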