Python groupby and any() | all()

I have the following pd.DataFrame:

In [155]: df1
Out[155]: 
   ORDER_ID    ACQ       DATE UID
2         3  False 2014-01-03   1
3         4   True 2014-01-04   2
4         5  False 2014-01-05   3
6         7   True 2014-01-08   5
7         8  False 2014-01-08   5
9        10  False 2014-01-10   6
0        11  False 2014-01-11   6
where each entry is an order with an ORDER_ID, a DATE, a UID, and an ACQ value (indicating whether this is the first order of the associated UID in the dataset).

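I am trying to filter and keep all orders placed by users whose first order falls within the period covered by the dataset (i.e., at least one of such a user's orders satisfies ACQ == True).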

So the desired output is:

   ORDER_ID    ACQ       DATE UID
3         4   True 2014-01-04   2
6         7   True 2014-01-08   5
7         8  False 2014-01-08   5
I achieved this with:

In [156]: df1.groupby('UID').filter(lambda x: x.ACQ.any() == True)
Out[156]: 
   ORDER_ID    ACQ       DATE UID
3         4   True 2014-01-04   2
6         7   True 2014-01-08   5
7         8  False 2014-01-08   5
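However, when I try to find all orders placed by users whose first order falls outside the period covered by the dataset (i.e., all of their orders should satisfy ACQ == False), I seem to be stuck. I tried this: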

In [159]: df1.groupby('UID').filter(lambda x: x.ACQ.all() == False)
Out[159]: 
   ORDER_ID    ACQ       DATE UID
2         3  False 2014-01-03   1
4         5  False 2014-01-05   3
6         7   True 2014-01-08   5 ## <- This order is an acquisition, therefore all orders with UID == 5 should be filtered out.
7         8  False 2014-01-08   5
9        10  False 2014-01-10   6
0        11  False 2014-01-11   6
You first need to compare against the condition and then call all(). Note that x.ACQ.all() == False is already True as soon as a single value in the group is False, which is why the orders with UID == 5 were kept:

print(df1.groupby('UID').filter(lambda x: (x.ACQ == False).all()))
   ORDER_ID    ACQ        DATE  UID
2         3  False  2014-01-03    1
4         5  False  2014-01-05    3
9        10  False  2014-01-10    6
0        11  False  2014-01-11    6
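
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the output shown in the question, so the construction itself is an assumption about how df1 was created. It contrasts the original attempt with the corrected filter, and also shows a transform-based mask, which is an alternative of mine and not part of the answer above.

```python
import pandas as pd

# Rebuild the example frame from the question (index labels copied as printed).
df1 = pd.DataFrame(
    {
        "ORDER_ID": [3, 4, 5, 7, 8, 10, 11],
        "ACQ": [False, True, False, True, False, False, False],
        "DATE": pd.to_datetime(
            ["2014-01-03", "2014-01-04", "2014-01-05", "2014-01-08",
             "2014-01-08", "2014-01-10", "2014-01-11"]
        ),
        "UID": [1, 2, 3, 5, 5, 6, 6],
    },
    index=[2, 3, 4, 6, 7, 9, 0],
)

# Attempt from the question: all() is False as soon as ONE value is False,
# so the mixed group UID == 5 passes the test and is kept.
attempt = df1.groupby("UID").filter(lambda g: g["ACQ"].all() == False)

# Fix from the answer: test each value first, then reduce with all().
# ~g["ACQ"] is the element-wise equivalent of g["ACQ"] == False.
fixed = df1.groupby("UID").filter(lambda g: (~g["ACQ"]).all())

# Same condition phrased the other way round: no value in the group is True.
no_acq = df1.groupby("UID").filter(lambda g: not g["ACQ"].any())

# Alternative (not from the answer): broadcast the per-group any() back to
# every row with transform and use the negation as a boolean mask.
mask = ~df1.groupby("UID")["ACQ"].transform("any")
via_transform = df1[mask]

print(fixed)
```

All three corrected variants should select the same rows (UIDs 1, 3 and 6). The transform version avoids calling a Python lambda per group, which may matter on large frames.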