Python: how to create a repeating dataframe over time and map it to a list of dates?

Tags: python, pandas, date, dataframe

I have created the following dataframe:

df = pd.DataFrame()
df['date'] = pd.date_range(start="2019-12-01", end="2019-12-20", freq='D')
I have the following two lines:

Line Start        End         Amount
A    2019-12-01  2019-12-08   100
B    2019-12-06  2019-12-15   200
I would like to get the following result:

Output:
    date         amount   line
0   2019-12-01   100       A
1   2019-12-02   100       A
2   2019-12-03   100       A
3   2019-12-04   100       A
4   2019-12-05   100       A
5   2019-12-06   300       A,B
6   2019-12-07   300       A,B
7   2019-12-08   300       A,B
8   2019-12-09   200       B
9   2019-12-10   200       B 
10  2019-12-11   200       B
11  2019-12-12   200       B
12  2019-12-13   200       B
13  2019-12-14   200       B
14  2019-12-15   200       B
15  2019-12-16   0
16  2019-12-17   0
17  2019-12-18   0
18  2019-12-19   0
19  2019-12-20   0
What can I do to achieve this? I tried using the map function, but I couldn't get the result.


Sorry guys, if the two lines each have an index, how can I add that column to the result?

Here is my cross merge/join solution:

# cross join the dates with the lines via a dummy key, keep the rows
# whose date falls inside [Start, End], sum the amounts per date,
# and fill the dates that match nothing with 0
(pd.merge(*[d.assign(dummy=1) for d in [df, df1]],
          on='dummy')
   .query('Start <= date <= End')
   .groupby('date')['Amount'].sum()
   .reindex(df['date'], fill_value=0)
   .reset_index()
)
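The cross merge above only returns the summed amounts, not the line column from the updated question. A minimal sketch of extending it to also collect the line labels per date, assuming the lines table is a dataframe named df1 with Line, Start, End and Amount columns as in the question (the named aggregation is my addition, not part of the answer above):

import pandas as pd

df = pd.DataFrame({'date': pd.date_range("2019-12-01", "2019-12-20", freq='D')})
df1 = pd.DataFrame({'Line': ['A', 'B'],
                    'Start': pd.to_datetime(['2019-12-01', '2019-12-06']),
                    'End': pd.to_datetime(['2019-12-08', '2019-12-15']),
                    'Amount': [100, 200]})

out = (pd.merge(df.assign(dummy=1), df1.assign(dummy=1), on='dummy')
         .query('Start <= date <= End')
         .groupby('date')
         # named aggregation: sum the amounts and join the line labels
         .agg(amount=('Amount', 'sum'), line=('Line', ','.join))
         .reindex(df['date'])              # bring back the dates with no match
         .assign(amount=lambda d: d['amount'].fillna(0).astype(int),
                 line=lambda d: d['line'].fillna(''))
         .reset_index())

Named aggregation requires pandas 0.25 or later; on older versions the same result needs two separate groupby aggregations.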

Try this. It assumes the second list is a dataframe:

import pandas as pd
df = pd.DataFrame()
df['date'] = pd.date_range(start="2019-12-01", end="2019-12-20", freq='D')

df2 = pd.DataFrame({"Start":["2019-12-01","2019-12-06"],"End":["2019-12-08","2019-12-15"],"Amount":[100,200]})
df2["Start"] = pd.to_datetime(df2["Start"])
df2["End"] = pd.to_datetime(df2["End"])

def f(x):
    # rows of df2 whose [Start, End] interval contains the date x
    df_ = df2[(df2.Start <= x) & (df2.End >= x)]["Amount"]
    v = df_.values        # matching amounts
    i = df_.index.values  # matching row labels
    return v, i

# for each date, sum the matching amounts and join the matching row labels
s = df.date.apply(lambda x: pd.Series({"amount": sum(f(x)[0]),
                                       "line": ','.join(map(str, f(x)[1]))}))
df = pd.concat([df, s], axis=1)

Output:

         date  amount line
0  2019-12-01     100    0
1  2019-12-02     100    0
2  2019-12-03     100    0
3  2019-12-04     100    0
4  2019-12-05     100    0
5  2019-12-06     300  0,1
6  2019-12-07     300  0,1
7  2019-12-08     300  0,1
8  2019-12-09     200    1
9  2019-12-10     200    1
10 2019-12-11     200    1
11 2019-12-12     200    1
12 2019-12-13     200    1
13 2019-12-14     200    1
14 2019-12-15     200    1
15 2019-12-16       0
16 2019-12-17       0
17 2019-12-18       0
18 2019-12-19       0
19 2019-12-20       0

I believe this will give the result you want, with the data structured as follows:

import pandas as pd

df = pd.DataFrame()
df['date'] = pd.date_range(start="2019-12-01", end="2019-12-20", freq='D')
df2 = pd.DataFrame({"Start":["2019-12-01","2019-12-06"],"End":["2019-12-08","2019-12-15"],"Amount":[100,200]})

df2.End = df2.End.apply(lambda x: pd.Timestamp(x))
df2.Start = df2.Start.apply(lambda x: pd.Timestamp(x))

# one helper column per line: the amount on the dates inside that line's interval, else 0
df['AM1'] = df.apply(lambda x: df2.Amount[0] if (x.date >= df2.Start[0] and x.date <= df2.End[0]) else 0, axis=1)
df['AM2'] = df.apply(lambda x: df2.Amount[1] if (x.date >= df2.Start[1] and x.date <= df2.End[1]) else 0, axis=1)
df['Amount'] = df.iloc[:, 1:3].sum(axis=1)
# derive the line label from whichever helper columns are non-zero
df['line'] = df.groupby(['date']).apply(
    lambda x: '0' if x.AM1.iloc[0] > 0 and x.AM2.iloc[0] == 0
    else '1' if x.AM2.iloc[0] > 0 and x.AM1.iloc[0] == 0
    else '' if x.AM1.iloc[0] == 0 and x.AM2.iloc[0] == 0
    else '0, 1').to_list()
df.drop(columns=['AM1', 'AM2'], inplace=True)

Output:

         date  Amount  line
0  2019-12-01     100     0
1  2019-12-02     100     0
2  2019-12-03     100     0
3  2019-12-04     100     0
4  2019-12-05     100     0
5  2019-12-06     300  0, 1
6  2019-12-07     300  0, 1
7  2019-12-08     300  0, 1
8  2019-12-09     200     1
9  2019-12-10     200     1
10 2019-12-11     200     1
11 2019-12-12     200     1
12 2019-12-13     200     1
13 2019-12-14     200     1
14 2019-12-15     200     1
15 2019-12-16       0
16 2019-12-17       0
17 2019-12-18       0
18 2019-12-19       0
19 2019-12-20       0
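The helper columns AM1 and AM2 above are hardcoded for two lines. A minimal sketch of generalizing the same idea to any number of rows in df2 (the column-per-row loop is my own generalization, not part of the answer):

import pandas as pd

df = pd.DataFrame({'date': pd.date_range("2019-12-01", "2019-12-20", freq='D')})
df2 = pd.DataFrame({"Start": pd.to_datetime(["2019-12-01", "2019-12-06"]),
                    "End": pd.to_datetime(["2019-12-08", "2019-12-15"]),
                    "Amount": [100, 200]})

amounts = pd.DataFrame(index=df.index)
for i, row in df2.iterrows():
    # one helper column per line: the amount on dates inside [Start, End], else 0
    inside = df['date'].between(row['Start'], row['End'])
    amounts[str(i)] = inside * row['Amount']

df['Amount'] = amounts.sum(axis=1)
# join the labels of the helper columns that are non-zero on each date
df['line'] = amounts.apply(lambda r: ','.join(c for c in amounts.columns if r[c] > 0), axis=1)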

Comments:

- What if I have 100 lines? Do I have to create a loop to generate 100 dataframes?
- No, you don't need a loop; the "for d in [df, df1]" only iterates over the two dataframes.
- Sorry, I don't quite understand pd.merge(*[d.assign(dummy=1) for d in [df, df1]], on='dummy'). Could you explain it?
- It is equivalent to pd.merge(df.assign(dummy=1), df1.assign(dummy=1), on='dummy'); I was just showing off :D. Also, df.assign(dummy=1) is essentially df['dummy'] = 1 without actually modifying df.
- Sorry, if the two lines have an index, how can I add that column to the result?
- What do you mean by "the two lines have an index"?
- I have updated the question. Could you take a look?
- Nice, happy coding.
- What if the row index is changed to strings such as A and B? In that case, can index.values still produce a result like this?
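On that last comment: index.values returns whatever labels the index holds, so the second answer should keep working with string labels. A quick sketch, reusing df, df2 and f from that answer:

df2.index = ['A', 'B']   # label the two lines instead of 0 and 1

# f() returns df_.index.values, which now yields the string labels,
# so the joined line column reads A / B / A,B directly
s = df.date.apply(lambda x: pd.Series({"amount": sum(f(x)[0]),
                                       "line": ','.join(map(str, f(x)[1]))}))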