Python: flag the entire group if a condition is true for a single row


I have a DataFrame with dates and public holidays:

Date    WeekNum Public_Holiday
1/1/2015    1   1
2/1/2015    1   0
3/1/2015    1   0
4/1/2015    1   0
5/1/2015    1   0
6/1/2015    1   0
7/1/2015    1   0
8/1/2015    2   0
9/1/2015    2   0
10/1/2015   2   0
11/1/2015   2   0
12/1/2015   2   0
13/1/2015   2   0
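I need to create a conditional column named Public_Holiday_Week that returns 1 for every row of a week if that week contains a public holiday.

I would like to see output like this: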

Date    WeekNum Public_Holiday  Public_Holiday_Week
1/1/2015    1   1               1
2/1/2015    1   0               1
3/1/2015    1   0               1
4/1/2015    1   0               1
5/1/2015    1   0               1
6/1/2015    1   0               1
7/1/2015    1   0               1
8/1/2015    2   0               0
9/1/2015    2   0               0
10/1/2015   2   0               0
11/1/2015   2   0               0
12/1/2015   2   0               0
13/1/2015   2   0               0
I tried using np.where:

df['Public_Holiday_Week'] = np.where(df['Public_Holiday']==1,1,0)
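But it assigns 0 to the other days of the week, because they are not public holidays themselves.

Do I have to apply rolling here? Any help is appreciated.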

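For better performance, do not use groupby. Instead, collect every WeekNum that has at least one 1 in Public_Holiday, select the matching rows with isin, and finally cast the boolean mask to int: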

weeks = df.loc[df['Public_Holiday'].eq(1), 'WeekNum']
df['Public_Holiday_Week'] = df['WeekNum'].isin(weeks).astype(int)

print(df)
         Date  WeekNum  Public_Holiday  Public_Holiday_Week
0    1/1/2015        1               1                    1
1    2/1/2015        1               0                    1
2    3/1/2015        1               0                    1
3    4/1/2015        1               0                    1
4    5/1/2015        1               0                    1
5    6/1/2015        1               0                    1
6    7/1/2015        1               0                    1
7    8/1/2015        2               0                    0
8    9/1/2015        2               0                    0
9   10/1/2015        2               0                    0
10  11/1/2015        2               0                    0
11  12/1/2015        2               0                    0
12  13/1/2015        2               0                    0
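As @Mohamed Thasin ah pointed out, you can also group by a week number derived from the date if necessary, but the output will then differ, because those week numbers do not match the WeekNum column.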
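To illustrate how the output changes, here is a minimal sketch that derives the week from the date instead of using WeekNum. It assumes "week" means the ISO calendar week and that the dates are day/month/year; neither assumption is confirmed by the thread, and `iso_year` and `iso_week` are names introduced only for this example.

```python
import pandas as pd

# Sample data from the question (dates are day/month/year).
df = pd.DataFrame({
    "Date": [f"{day}/1/2015" for day in range(1, 14)],
    "WeekNum": [1] * 7 + [2] * 6,
    "Public_Holiday": [1] + [0] * 12,
})

# Assumption: the week is the ISO calendar week of the parsed date.
parsed = pd.to_datetime(df["Date"], format="%d/%m/%Y")
iso = parsed.dt.isocalendar()
iso_year = iso["year"].astype(int)
iso_week = iso["week"].astype(int)

# Group on ISO year and ISO week so weeks from different years never mix.
df["Public_Holiday_Week"] = (
    df.groupby([iso_year, iso_week])["Public_Holiday"].transform("max")
)
print(df)
```

ISO weeks run Monday to Sunday, so only 1/1/2015 through 4/1/2015 share a week with the holiday here, whereas the WeekNum column treats 1/1/2015 through 7/1/2015 as one week.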

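Another option is groupby with max followed by map, or groupby with transform and max.

Thankfully, the transform approach generalizes well when grouping by week together with month and year: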

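# Second key is the "month/year" part of the date string, e.g. '1/2015'.
# n must be passed by keyword in recent pandas versions.
df['Public_Holiday_Week'] = (
     df.groupby(['WeekNum', df.Date.str.split('/', n=1).str[1]])
      .Public_Holiday.transform('max')
)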
print(df)
         Date  WeekNum  Public_Holiday  Public_Holiday_Week
0    1/1/2015        1               1                    1
1    2/1/2015        1               0                    1
2    3/1/2015        1               0                    1
3    4/1/2015        1               0                    1
4    5/1/2015        1               0                    1
5    6/1/2015        1               0                    1
6    7/1/2015        1               0                    1
7    8/1/2015        2               0                    0
8    9/1/2015        2               0                    0
9   10/1/2015        2               0                    0
10  11/1/2015        2               0                    0
11  12/1/2015        2               0                    0
12  13/1/2015        2               0                    0
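The sample data covers a single year, so the benefit of the extra grouping key is not visible in the output above. The sketch below uses hypothetical rows (not from the question) in which WeekNum 1 occurs in both 2015 and 2016, and compares grouping by WeekNum alone with grouping by WeekNum plus the year part of the date string.

```python
import pandas as pd

# Hypothetical data: WeekNum 1 occurs in 2015 and in 2016,
# but only the 2015 week contains a public holiday.
df = pd.DataFrame({
    "Date": ["1/1/2015", "2/1/2015", "4/1/2016", "5/1/2016"],
    "WeekNum": [1, 1, 1, 1],
    "Public_Holiday": [1, 0, 0, 0],
})

# Year taken from the last '/'-separated field of the date string.
year = df["Date"].str.split("/").str[-1]

# Grouping by WeekNum alone merges the two years into one group.
by_week_only = df.groupby("WeekNum")["Public_Holiday"].transform("max")

# Adding the year as a second key keeps the 2016 week separate.
by_week_and_year = (
    df.groupby(["WeekNum", year])["Public_Holiday"].transform("max")
)

print(by_week_only.tolist())
print(by_week_and_year.tolist())
```

With WeekNum as the only key, the 2016 rows are flagged even though that week has no holiday; the second key avoids this.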
Alternatively, use resample and skip the WeekNum column entirely.
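Comments:

Nice answer, but it would be great if you grouped by year and week number.

@MohamedThasinah - Sure, give me a sec.

The first day of the week is 2014-31-12? Why? Is this necessary for your data?

df['weekOfYear'] = df['ActivityDate'].dt.week  # week of the year
df['weekNum'] = df.weekOfYear + (df.year % 2015) * 52
@jezrael This is how I came up with the week of the year and the week number.

That is a bit strange: the weeks from df['weekOfYear'] = df['ActivityDate'].dt.week in the sample data and the weeks in the WeekNum column are different. That is why I edited my solution to show the weeks column versus WeekNum.

@jezrael Yes, sorry for not letting you know earlier :)
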
Using groupby with max, then map:

df['Public_Holiday_Week'] = df.WeekNum.map(df.groupby('WeekNum').Public_Holiday.max())
print(df)
         Date  WeekNum  Public_Holiday  Public_Holiday_Week
0    1/1/2015        1               1                    1
1    2/1/2015        1               0                    1
2    3/1/2015        1               0                    1
3    4/1/2015        1               0                    1
4    5/1/2015        1               0                    1
5    6/1/2015        1               0                    1
6    7/1/2015        1               0                    1
7    8/1/2015        2               0                    0
8    9/1/2015        2               0                    0
9   10/1/2015        2               0                    0
10  11/1/2015        2               0                    0
11  12/1/2015        2               0                    0
12  13/1/2015        2               0                    0

Using groupby with transform:

df['Public_Holiday_Week'] = df.groupby('WeekNum').Public_Holiday.transform('max')
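
As a quick check, the isin, map, and transform variants can be compared on the sample data. This is only a sketch: the names `via_isin`, `via_map`, and `via_transform` are introduced here for illustration, and the max-based variants rely on Public_Holiday holding only 0 and 1.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Date": [f"{day}/1/2015" for day in range(1, 14)],
    "WeekNum": [1] * 7 + [2] * 6,
    "Public_Holiday": [1] + [0] * 12,
})

# isin: weeks that contain at least one holiday, then a boolean mask cast to int.
holiday_weeks = df.loc[df["Public_Holiday"].eq(1), "WeekNum"]
via_isin = df["WeekNum"].isin(holiday_weeks).astype(int)

# map: per-week maximum looked up for every row.
via_map = df["WeekNum"].map(df.groupby("WeekNum")["Public_Holiday"].max())

# transform: per-week maximum broadcast back to the original rows.
via_transform = df.groupby("WeekNum")["Public_Holiday"].transform("max")

print(via_isin.tolist() == via_map.tolist() == via_transform.tolist())
```

All three agree on this data; the isin version avoids the groupby step, which is the performance argument made in the first answer.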
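
Using resample, without the WeekNum column:

# resample needs real datetimes; the sample dates are day/month/year strings.
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df.assign(
    Public_Holiday_Week=
    df.resample('W-WED', on='Date').Public_Holiday.transform('max')
)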

         Date  WeekNum  Public_Holiday  Public_Holiday_Week
0  2015-01-01        1               1                    1
1  2015-01-02        1               0                    1
2  2015-01-03        1               0                    1
3  2015-01-04        1               0                    1
4  2015-01-05        1               0                    1
5  2015-01-06        1               0                    1
6  2015-01-07        1               0                    1
7  2015-01-08        2               0                    0
8  2015-01-09        2               0                    0
9  2015-01-10        2               0                    0
10 2015-01-11        2               0                    0
11 2015-01-12        2               0                    0
12 2015-01-13        2               0                    0
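
The same weekly grouping can also be written with groupby and pd.Grouper instead of resample. This is a sketch under two assumptions that are not stated in the thread: the dates are day/month/year, and weeks end on Wednesday (which is what makes 1/1/2015 through 7/1/2015 a single week in the sample data).

```python
import pandas as pd

# Sample data from the question, without the WeekNum column.
df = pd.DataFrame({
    "Date": [f"{day}/1/2015" for day in range(1, 14)],
    "Public_Holiday": [1] + [0] * 12,
})

# Parse the day/month/year strings into real datetimes.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Bucket rows into weeks ending on Wednesday and broadcast each
# week's maximum back to its rows.
weekly = pd.Grouper(key="Date", freq="W-WED")
df["Public_Holiday_Week"] = (
    df.groupby(weekly)["Public_Holiday"].transform("max")
)
print(df)
```

If your weeks end on a different day, change the anchor in the frequency string (for example `W-SUN`).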