Python: conditional DataFrame groupby

python pandas dataframe

I'm trying to find the average revenue of Walmart and Food Lion for each month, but the result below still includes HEB's revenue data.

import pandas as pd

df = pd.DataFrame({'date': ['1960-01-01','1960-01-01','1960-01-01','1960-02-01','1960-02-01','1960-02-01',
                            '1961-01-01','1961-01-01','1961-01-01','1961-02-01','1961-02-01','1961-02-01'],
                   'Company': ['HEB', 'Walmart', 'Food Lion','HEB', 'Walmart', 'Food Lion',
                              'HEB', 'Walmart', 'Food Lion','HEB', 'Walmart', 'Food Lion'],
                   'Revenue': [200, 800, 400, 400, 300, 600, 400, 400, 900, 900, 800, 600]})

print(df)

Output:

          date    Company  Revenue
0   1960-01-01        HEB      200
1   1960-01-01    Walmart      800
2   1960-01-01  Food Lion      400
3   1960-02-01        HEB      400
4   1960-02-01    Walmart      300
5   1960-02-01  Food Lion      600
6   1961-01-01        HEB      400
7   1961-01-01    Walmart      400
8   1961-01-01  Food Lion      900
9   1961-02-01        HEB      900
10  1961-02-01    Walmart      800
11  1961-02-01  Food Lion      600

I want to leave HEB's data out of this groupby. How can I do that?

df.groupby('date')['Revenue'].mean()

date
1960-01-01    466.666667
1960-02-01    433.333333
1961-01-01    566.666667
1961-02-01    766.666667
Name: Revenue, dtype: float64

There are several ways to do this. Perhaps the simplest is to exclude "HEB" from the data before grouping:

df[df.Company != "HEB"].groupby("date")["Revenue"].mean()
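As a self-contained sketch of this filter-then-group approach, the snippet below rebuilds the question's data and prints the per-date averages without HEB. The names `mask` and `monthly_mean` are only illustrative and do not come from the original post.

```python
import pandas as pd

# Same data as in the question: three companies per date
df = pd.DataFrame({
    'date': ['1960-01-01', '1960-01-01', '1960-01-01',
             '1960-02-01', '1960-02-01', '1960-02-01',
             '1961-01-01', '1961-01-01', '1961-01-01',
             '1961-02-01', '1961-02-01', '1961-02-01'],
    'Company': ['HEB', 'Walmart', 'Food Lion'] * 4,
    'Revenue': [200, 800, 400, 400, 300, 600, 400, 400, 900, 900, 800, 600],
})

# Boolean mask: True for every row that does not belong to HEB
mask = df['Company'] != 'HEB'

# Filter first, then average Revenue within each date
monthly_mean = df[mask].groupby('date')['Revenue'].mean()
print(monthly_mean)
# date
# 1960-01-01    600.0
# 1960-02-01    450.0
# 1961-01-01    650.0
# 1961-02-01    700.0
# Name: Revenue, dtype: float64
```

Because the filter returns a new DataFrame, `df` itself still contains the HEB rows afterwards.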

To select a single company, you can use

# Assign to a new name so the original df keeps all companies
walmart = df[df['Company'] == 'Walmart']

print(walmart)
          date  Company  Revenue
1   1960-01-01  Walmart      800
4   1960-02-01  Walmart      300
7   1961-01-01  Walmart      400
10  1961-02-01  Walmart      800

If you want to exclude a company, you can use

df = df[df['Company'] != 'HEB']

print(df)
          date    Company  Revenue
1   1960-01-01    Walmart    800
2   1960-01-01  Food Lion    400
4   1960-02-01    Walmart    300
5   1960-02-01  Food Lion    600
7   1961-01-01    Walmart    400
8   1961-01-01  Food Lion    900
10  1961-02-01    Walmart    800
11  1961-02-01  Food Lion    600
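When the goal is to keep an explicit set of companies rather than drop one, `Series.isin` is a possible alternative to the `!=` filter. The following is a minimal sketch on the question's data; the list name `keep` and the variables `subset` and `result` are assumptions made for illustration.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    'date': ['1960-01-01', '1960-01-01', '1960-01-01',
             '1960-02-01', '1960-02-01', '1960-02-01',
             '1961-01-01', '1961-01-01', '1961-01-01',
             '1961-02-01', '1961-02-01', '1961-02-01'],
    'Company': ['HEB', 'Walmart', 'Food Lion'] * 4,
    'Revenue': [200, 800, 400, 400, 300, 600, 400, 400, 900, 900, 800, 600],
})

# Illustrative list of the companies to keep
keep = ['Walmart', 'Food Lion']

# isin() marks the rows whose Company appears in the list
subset = df[df['Company'].isin(keep)]

# Average Revenue per date over the kept companies only
result = subset.groupby('date')['Revenue'].mean()
print(result)
```

With only three companies in the data this gives the same averages as excluding HEB, but it would keep working unchanged if further companies were added to the DataFrame later.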