Python 3.x: How do I delete rows in pandas based on a condition?
I have the following DataFrame:
import pandas as pd
df = pd.DataFrame([['1','aa','ccc','rere','thth','my desc 1','','my feature2 1'], ['1','aa','fff','flfl','ipip','my desc 2','',''], ['1','aa','mmm','rprp','','','',''], ['2','aa','ccc','rprp','','','my feature1 1',''], ['2','aa','fff','bubu','thth','my desc 3','',''], ['2','aa','mmm','fafa','rtrt','my desc 4','',''], ['3','aa','ccc','blbl','thth','my desc 5','my feature1 2','my feature2 2'], ['3','aa','fff','arar','amam','my desc 6','',''], ['3','aa','mmm','acac','ryry','my desc 7','',''],['4','bb','coco','rere','','','','my feature2 3'], ['4','bb','inin','mimi','rere','my desc 8','',''], ['4','bb','itit','toto','enen','my desc 9','',''], ['4','bb','spsp','glgl','pepe','my desc 10','',''], ['5','bb','coco','baba','mpmp','my desc 11','my feature1 3',''], ['5','bb','inin','rere','','','',''],['5','bb','itit','toto','hrhr','my desc 12','',''], ['5','bb','spsp','glgl','lolo','my desc 13','','']], columns=['foo', 'bar','name_input','value_input','bulb','desc','feature1', 'feature2'])
Now I need to drop rows so that I get the output below:
df = pd.DataFrame([['1','aa','ccc','rere','thth','my desc 1','','my feature2 1'], ['2','aa','ccc','rprp','','my desc 3','my feature1 1',''], ['3','aa','ccc','blbl','thth','my desc 5','my feature1 2','my feature2 2'], ['4','bb','coco','rere','','my desc 8','','my feature2 3'], ['5','bb','coco','baba','mpmp','my desc 11','my feature1 3','']], columns=['foo', 'bar','name_input','value_input','bulb','desc','feature1', 'feature2'])
I tried the approaches below. None of them seems to work:
df= df.dropna(subset=['feature1', 'feature2'])
df.dropna(thresh=5, axis=0, inplace=True)
df= df[df.feature2.notnull()]
df= df[pd.notnull(df[['feature1', 'feature2']])]
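Any help is much appreciated.
astype(bool)
In a boolean context, an empty string evaluates to False. Use filter to keep only the feature columns (note that regex='feat' matches any column whose name contains "feat"), then apply astype(bool) followed by any(axis=1) to flag the rows where at least one feature is non-empty. To match your expected result, we can back-fill the desc column.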
feat = df.filter(regex='feat').astype(bool).any(axis=1)
desc = df.desc.where(df.desc.astype(bool)).bfill()
df.assign(desc=desc)[feat]
    foo  bar  name_input  value_input  bulb        desc       feature1       feature2
0     1   aa         ccc         rere  thth   my desc 1                 my feature2 1
3     2   aa         ccc         rprp         my desc 3  my feature1 1
6     3   aa         ccc         blbl  thth   my desc 5  my feature1 2  my feature2 2
9     4   bb        coco         rere         my desc 8                 my feature2 3
13    5   bb        coco         baba  mpmp  my desc 11  my feature1 3
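To see why the dropna attempts in the question leave the frame untouched while the boolean mask works, here is a minimal sketch. The three-row frame `small` is a hypothetical miniature of the question's data, and `dtype=object` is an assumption made so that the strings stay plain Python objects regardless of the pandas version.

```python
import pandas as pd

# Hypothetical miniature frame reusing the question's column names.
small = pd.DataFrame(
    [["1", "my desc 1", "", "my feature2 1"],
     ["1", "my desc 2", "", ""],
     ["2", "", "my feature1 1", ""]],
    columns=["foo", "desc", "feature1", "feature2"],
    dtype=object,  # assumption: keep values as plain Python strings
)

# Empty strings are not missing values, so dropna finds nothing to drop.
unchanged = small.dropna(subset=["feature1", "feature2"])

# astype(bool) maps '' to False and any non-empty string to True;
# any(axis=1) is True for rows with at least one non-empty feature.
mask = small.filter(regex="^feature").astype(bool).any(axis=1)
kept = small[mask]
```

Here `unchanged` still has all three rows, while `kept` retains only the rows with a populated feature column. Anchoring the pattern with `^feature` is optional; it simply avoids matching unrelated columns that happen to contain "feat".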
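Another approach is to convert the blank strings to true NaN values, then pass the how parameter to dropna with 'all' as its value, so that a row is dropped only when both feature columns are missing.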
import numpy as np
df.replace('',np.nan).dropna(subset=['feature1','feature2'],how='all').fillna('')
    foo  bar  name_input  value_input  bulb        desc       feature1       feature2
0     1   aa         ccc         rere  thth   my desc 1                 my feature2 1
3     2   aa         ccc         rprp                    my feature1 1
6     3   aa         ccc         blbl  thth   my desc 5  my feature1 2  my feature2 2
9     4   bb        coco         rere                                   my feature2 3
13    5   bb        coco         baba  mpmp  my desc 11  my feature1 3
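The choice of how='all' matters here. The sketch below contrasts it with the default how='any' on a hypothetical two-column frame (`demo` and its values are made up for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 has one feature, row 1 has none, row 2 has both.
demo = pd.DataFrame(
    {"feature1": ["", "", "f1"], "feature2": ["f2", "", "f2b"]},
    dtype=object,  # assumption: keep values as plain Python strings
)
as_nan = demo.replace("", np.nan)

# how='any' (the default) drops a row if either column is NaN.
any_kept = as_nan.dropna(subset=["feature1", "feature2"], how="any")

# how='all' drops a row only when both columns are NaN.
all_kept = as_nan.dropna(subset=["feature1", "feature2"], how="all")
```

With how='any' only row 2 survives; with how='all' rows 0 and 2 survive, which is the behavior the question asks for.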
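Very slick. I didn't know that empty strings evaluate to False; thanks for the explanation.
Thanks. However, the desc column should contain all values, including "my desc 3" and "my desc 8". How do we get those?
You need to explain why "my desc 3" belongs there.
For each individual foo, a desc value is mandatory. feature1 and feature2 depend on bulb: if bulb is empty, one of the features will be empty, but desc should always show a value. I need the first available desc for each foo.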
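One possible way to meet the "first available desc for each foo" requirement is to group by foo instead of back-filling across the whole column. This is only a sketch on a hypothetical five-row frame (`sample`), not the answerer's code. Unlike a plain bfill, it cannot pull a description from a neighbouring foo group.

```python
import numpy as np
import pandas as pd

# Hypothetical subset of the question's data (foo groups 2 and 4).
sample = pd.DataFrame(
    [["2", "", "my feature1 1", ""],
     ["2", "my desc 3", "", ""],
     ["2", "my desc 4", "", ""],
     ["4", "", "", "my feature2 3"],
     ["4", "my desc 8", "", ""]],
    columns=["foo", "desc", "feature1", "feature2"],
    dtype=object,  # assumption: keep values as plain Python strings
)

# Treat '' as missing, then take the first non-missing desc within each foo.
desc = sample["desc"].replace("", np.nan)
first_desc = desc.groupby(sample["foo"]).transform("first")

# Keep only rows where at least one feature column is non-empty.
has_feature = sample[["feature1", "feature2"]].astype(bool).any(axis=1)
result = sample.assign(desc=first_desc)[has_feature]
```

In this sketch `result` keeps rows 0 and 3, with desc filled in as "my desc 3" and "my desc 8" respectively.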