Python pandas: create a new row within each group based on a condition


I have a DataFrame (df).

It looks like this:

       ID From_num  To_num        Date
0   James       78      96  2020-05-12
1   James      420      78  2020-02-02
2   James  Started     420  2019-06-18
3     Max      298      36  2019-06-20
4     Max       36      78  2019-01-30
5     Max      298      36  2018-10-23
6     Max  Started     298  2018-08-29
7    Park  Started     311  2020-05-21
8     Tom       60     150  2019-11-22
9     Tom      520     520  2019-08-26
10    Tom       99      78  2018-12-11
11    Tom  Started      99  2018-10-09
12   Wong  Started      39  2019-02-01
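
For anyone who wants to reproduce the example, the table above can be rebuilt roughly as follows. This is only a sketch: storing From_num as strings (because of the "Started" marker) and To_num as integers is an assumption, since the question does not show the column dtypes.

```python
import pandas as pd

# Sample data copied from the table in the question.
# Assumption: From_num is kept as strings because it mixes numbers with "Started".
df = pd.DataFrame(
    {
        "ID": ["James"] * 3 + ["Max"] * 4 + ["Park"] + ["Tom"] * 4 + ["Wong"],
        "From_num": ["78", "420", "Started", "298", "36", "298", "Started",
                     "Started", "60", "520", "99", "Started", "Started"],
        "To_num": [96, 78, 420, 36, 78, 36, 298, 311, 150, 520, 78, 99, 39],
        "Date": ["2020-05-12", "2020-02-02", "2019-06-18", "2019-06-20",
                 "2019-01-30", "2018-10-23", "2018-08-29", "2020-05-21",
                 "2019-11-22", "2019-08-26", "2018-12-11", "2018-10-09",
                 "2019-02-01"],
    }
)
print(df)
```
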
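For each person (group), I want to insert a new duplicate row above the first row of that group (grouped by "ID"). The new row should have the same "ID", "From_num" and "To_num" values as the old first row, but its "Date" should be the old first row's date plus one day. For James, for example, the new row is: "James" "78" "96" "2020-05-13". The same applies to the other groups, so my expected result is: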

       ID From_num  To_num        Date
0   James       78      96  2020-05-13  # row added, Date + 1
1   James       78      96  2020-05-12
2   James      420      78  2020-02-02
3   James  Started     420  2019-06-18
4     Max      298      36  2019-06-21  # row added, Date + 1
5     Max      298      36  2019-06-20
6     Max       36      78  2019-01-30
7     Max      298      36  2018-10-23
8     Max  Started     298  2018-08-29
9    Park  Started     311  2020-05-22  # Row added, Date + 1
10   Park  Started     311  2020-05-21
11    Tom       60     150  2019-11-23  # Row added, Date + 1
12    Tom       60     150  2019-11-22
13    Tom      520     520  2019-08-26
14    Tom       99      78  2018-12-11
15    Tom  Started      99  2018-10-09
16   Wong  Started      39  2019-02-02  # Row added Date + 1
17   Wong  Started      39  2019-02-01
I want the row order to be exactly the same as in my expected result. If you have any good ideas, please help. Thanks a lot.

Use:

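import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])
# Number the rows within each ID group, starting at 1
df['order'] = df.groupby('ID').cumcount().add(1)

# First row of each group, shifted one day forward and ranked ahead of the group (order=0)
df1 = (
    df.groupby('ID', as_index=False).first()
    .assign(Date=lambda x: x['Date'] + pd.Timedelta(days=1), order=0)
)

# Combine, sort by ID and order, then drop the helper column
df1 = pd.concat([df, df1]).sort_values(['ID', 'order'], ignore_index=True).drop(columns='order')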

Details:

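Convert the Date column to a pandas datetime series with pd.to_datetime, then use groupby on the ID column together with cumcount to impose a total ordering within each group of the DataFrame.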

print(df)
       ID From_num  To_num       Date  order
0   James       78      96 2020-05-12      1
1   James      420      78 2020-02-02      2
2   James  Started     420 2019-06-18      3
3     Max      298      36 2019-06-20      1
4     Max       36      78 2019-01-30      2
5     Max      298      36 2018-10-23      3
6     Max  Started     298 2018-08-29      4
7    Park  Started     311 2020-05-21      1
8     Tom       60     150 2019-11-22      1
9     Tom      520     520 2019-08-26      2
10    Tom       99      78 2018-12-11      3
11    Tom  Started      99 2018-10-09      4
12   Wong  Started      39 2019-02-01      1
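
The per-group numbering produced by groupby(...).cumcount() may be easier to see on a tiny frame. The sketch below uses a made-up DataFrame named toy (not part of the original answer) whose groups are deliberately interleaved:

```python
import pandas as pd

# Hypothetical toy data: groups "a" and "b" are interleaved on purpose.
toy = pd.DataFrame({"ID": ["a", "a", "b", "a", "b"]})

# cumcount() numbers rows 0, 1, 2, ... within each group in order of
# appearance; add(1) shifts the numbering so that it starts at 1.
toy["order"] = toy.groupby("ID").cumcount().add(1)

print(toy["order"].tolist())  # [1, 2, 1, 3, 2]
```

Starting the numbering at 1 leaves 0 free, so a row given order=0 sorts ahead of every existing row in its group.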
Create a new DataFrame df1 by using groupby on the ID column and aggregating with first, then use assign to set order=0 and to increment Date by 1 day.

print(df1)
      ID From_num  To_num       Date  order
0  James       78      96 2020-05-13      0 # Date incremented by 1 day
1    Max      298      36 2019-06-21      0 # and order set to 0
2   Park  Started     311 2020-05-22      0
3    Tom       60     150 2019-11-23      0
4   Wong  Started      39 2019-02-02      0
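
One caveat about the aggregation step: by default, GroupBy.first() returns the first non-null value of each column, not necessarily the literal first row. The sample data has no missing values, so it makes no difference here. On data with gaps, groupby(...).head(1) may be closer to what the question asks for. A small sketch on a made-up frame (the names toy, by_first and by_head are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first row of group "a" has a missing value.
toy = pd.DataFrame({"ID": ["a", "a", "b"], "val": [np.nan, 2.0, 3.0]})

# first() skips the NaN and reports 2.0 for group "a".
by_first = toy.groupby("ID", as_index=False).first()

# head(1) returns the literal first row of each group, NaN included,
# and keeps the original row index (0 and 2 here).
by_head = toy.groupby("ID").head(1)

print(by_first)
print(by_head)
```
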
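Use pd.concat to concatenate the DataFrames df and df1, then use sort_values to sort the combined DataFrame on the ID and order columns. Finally, drop the helper column order.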

print(df1)
       ID From_num  To_num       Date
0   James       78      96 2020-05-13
1   James       78      96 2020-05-12
2   James      420      78 2020-02-02
3   James  Started     420 2019-06-18
4     Max      298      36 2019-06-21
5     Max      298      36 2019-06-20
6     Max       36      78 2019-01-30
7     Max      298      36 2018-10-23
8     Max  Started     298 2018-08-29
9    Park  Started     311 2020-05-22
10   Park  Started     311 2020-05-21
11    Tom       60     150 2019-11-23
12    Tom       60     150 2019-11-22
13    Tom      520     520 2019-08-26
14    Tom       99      78 2018-12-11
15    Tom  Started      99 2018-10-09
16   Wong  Started      39 2019-02-02
17   Wong  Started      39 2019-02-01
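
Sorting on ID also reorders the groups alphabetically. That is harmless here because the sample data is already sorted by ID. If the original group order must be kept as-is, one possible variation (not the answer's method) is to skip the helper column and rely on a stable sort of the row index instead. The function name prepend_next_day_row and its parameters are hypothetical, chosen only for this sketch:

```python
import pandas as pd


def prepend_next_day_row(df, group_col="ID", date_col="Date"):
    """Duplicate each group's first row above it, with the date moved one day forward."""
    # Work on a copy with a fresh 0..n-1 index so the caller's frame is untouched.
    out = df.reset_index(drop=True)
    out[date_col] = pd.to_datetime(out[date_col])

    # head(1) keeps the literal first row of every group and its row index.
    first_rows = out.groupby(group_col).head(1).copy()
    first_rows[date_col] = first_rows[date_col] + pd.Timedelta(days=1)

    # The new rows come first in the concat; a stable sort (mergesort) on the
    # duplicated index labels then puts each new row directly above its source.
    combined = pd.concat([first_rows, out])
    return combined.sort_index(kind="mergesort").reset_index(drop=True)


# Three rows taken from the question's data.
sample = pd.DataFrame(
    {
        "ID": ["James", "James", "Park"],
        "From_num": ["78", "420", "Started"],
        "To_num": [96, 78, 311],
        "Date": ["2020-05-12", "2020-02-02", "2020-05-21"],
    }
)
result = prepend_next_day_row(sample)
print(result)
```

Because the sort key is the original row position rather than the ID value, this sketch leaves the order of the existing rows exactly as it was.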