Pandas: match and rearrange values in a DataFrame
I have a DataFrame that looks like this:
A Country price1 A Country price2 B Country price1 B Country price2 C Country price1
0 19-12-04 0.0 19-12-05 1.7 19-12-05 2.6 19-12-06 3.2 19-12-05 0.1
1 19-12-03 1.5 19-12-04 1.7 19-12-04 2.6 19-12-05 3.2 19-12-04 0.1
2 19-12-02 1.5 19-12-03 1.7 19-12-03 2.6 19-12-04 3.1 19-12-03 0.1
3 19-12-01 1.5 19-12-02 1.8 19-12-02 2.7 19-12-03 3.2 19-12-02 0.1
4 19-11-29 1.5 19-12-01 1.7 19-11-29 2.6 19-12-02 3.2 19-12-01 0.1
5 19-11-28 1.6 19-11-29 1.7 19-11-28 2.6 19-11-29 3.1 19-11-29 0.1
6 19-11-27 1.6 19-11-28 1.7 19-11-27 2.6 19-11-28 3.2 19-11-28 0.1
7 19-11-26 1.6 19-11-27 1.7 19-11-26 2.6 19-11-27 3.2 19-11-27 0.2
8 19-11-25 1.5 19-11-26 1.7 19-11-25 2.6 19-11-26 3.2 19-11-26 0.2
9 19-11-24 1.5 19-11-25 1.7 19-11-22 2.6 19-11-25 3.2 19-11-25 0.2
10 19-11-22 1.5 19-11-24 1.7 19-11-21 2.6 19-11-22 3.1 19-11-24 0.2
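Each country column holds a different set of dates in its rows.
Now I want to match and rearrange the values by date, and I want to fill the blanks with a '?' mark. The result I want looks like this: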
A Country price1 A Country price2 B Country price1 B Country price2 C Country price1
0 19-12-06 ? 19-12-06 ? 19-12-06 ? 19-12-06 3.2 19-12-06 ?
1 19-12-05 ? 19-12-05 1.7 19-12-05 2.6 19-12-05 3.2 19-12-05 0.1
2 19-12-04 0.0 19-12-04 1.7 19-12-04 2.6 19-12-04 3.1 19-12-04 0.1
3 19-12-03 1.5 19-12-03 1.7 19-12-03 2.6 19-12-03 3.2 19-12-03 0.1
4 19-12-02 1.5 19-12-02 1.8 19-12-02 2.7 19-12-02 3.2 19-12-02 0.1
5 19-12-01 1.5 19-12-01 1.7 19-12-01 ? 19-12-01 ? 19-12-01 0.1
6 19-11-29 1.5 19-11-29 1.7 19-11-29 2.6 19-11-29 3.1 19-11-29 0.1
7 19-11-28 1.6 19-11-28 1.7 19-11-28 2.6 19-11-28 3.2 19-11-28 0.1
8 19-11-27 1.6 19-11-27 1.7 19-11-27 2.6 19-11-27 3.2 19-11-27 0.2
9 19-11-26 1.6 19-11-26 1.7 19-11-26 2.6 19-11-26 3.2 19-11-26 0.2
10 19-11-25 1.5 19-11-25 1.7 19-11-25 2.6 19-11-25 3.2 19-11-25 0.2
11 19-11-24 1.5 19-11-24 1.7 19-11-24 ? 19-11-24 ? 19-11-24 0.2
12 19-11-23 ? 19-11-23 ? 19-11-23 ? 19-11-23 ? 19-11-23 ?
13 19-11-22 1.5 19-11-22 ? 19-11-22 2.6 19-11-22 3.1 19-11-22 ?
14 19-11-21 ? 19-11-21 ? 19-11-21 2.6 19-11-21 ? 19-11-21 ?
Sorry, I'm a complete beginner at coding. The column names don't matter to me,
so another result I would be happy with is:
A Country price1 price2 price1 price2 price1
0 19-12-06 ? ? ? 3.2 ?
1 19-12-05 ? 1.7 2.6 3.2 0.1
2 19-12-04 0.0 1.7 2.6 3.1 0.1
3 19-12-03 1.5 1.7 2.6 3.2 0.1
4 19-12-02 1.5 1.8 2.7 3.2 0.1
5 19-12-01 1.5 1.7 ? ? 0.1
6 19-11-29 1.5 1.7 2.6 3.1 0.1
7 19-11-28 1.6 1.7 2.6 3.2 0.1
8 19-11-27 1.6 1.7 2.6 3.2 0.2
9 19-11-26 1.6 1.7 2.6 3.2 0.2
10 19-11-25 1.5 1.7 2.6 3.2 0.2
11 19-11-24 1.5 1.7 ? ? 0.2
12 19-11-23 ? ? ? ? ?
13 19-11-22 1.5 ? 2.6 3.1 ?
14 19-11-21 ? ? 2.6 ? ?
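How can I do this?
The idea is to zip the columns at even positions (dates) with the columns at odd positions (prices) into pairs, build one Series per pair in a list comprehension with the pair's first column as the index, join them all with concat, and finally convert the index to a DatetimeIndex: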
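import pandas as pd

# Date columns sit at even positions, price columns at odd positions
a = df.columns[::2]
b = df.columns[1::2]
# One Series per (date, price) pair, indexed by that pair's date column
dfs = [df.loc[:, x].set_index(x[0], drop=False)[x[1]] for x in zip(a, b)]
# Align on the union of all dates; missing combinations become '?'
df = pd.concat(dfs, axis=1, sort=False).fillna('?')
df.index = pd.to_datetime(df.index, format='%y-%m-%d')
df = df.sort_index()
print(df)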
price1 price2 price1.1 price2.1 price1.2
2019-11-21 ? ? 2.6 ? ?
2019-11-22 1.5 ? 2.6 3.1 ?
2019-11-24 1.5 1.7 ? ? 0.2
2019-11-25 1.5 1.7 2.6 3.2 0.2
2019-11-26 1.6 1.7 2.6 3.2 0.2
2019-11-27 1.6 1.7 2.6 3.2 0.2
2019-11-28 1.6 1.7 2.6 3.2 0.1
2019-11-29 1.5 1.7 2.6 3.1 0.1
2019-12-01 1.5 1.7 ? ? 0.1
2019-12-02 1.5 1.8 2.7 3.2 0.1
2019-12-03 1.5 1.7 2.6 3.2 0.1
2019-12-04 0 1.7 2.6 3.1 0.1
2019-12-05 ? 1.7 2.6 3.2 0.1
2019-12-06 ? ? ? 3.2 ?
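To try the approach end to end, here is a minimal self-contained sketch that uses only the first three rows of the question's table. The suffixed column names ("A Country.1", "price1.1", and so on) are an assumption: they mimic how read_csv renames duplicated headers, which is what the output above suggests happened. The names date_cols, price_cols, aligned and result are illustrative only.

```python
import pandas as pd

# First three rows of the question's table. The ".1"/".2" suffixes are an
# assumption that mimics how pandas.read_csv renames duplicated headers.
df = pd.DataFrame({
    "A Country":   ["19-12-04", "19-12-03", "19-12-02"],
    "price1":      [0.0, 1.5, 1.5],
    "A Country.1": ["19-12-05", "19-12-04", "19-12-03"],
    "price2":      [1.7, 1.7, 1.7],
    "B Country":   ["19-12-05", "19-12-04", "19-12-03"],
    "price1.1":    [2.6, 2.6, 2.6],
    "B Country.1": ["19-12-06", "19-12-05", "19-12-04"],
    "price2.1":    [3.2, 3.2, 3.1],
    "C Country":   ["19-12-05", "19-12-04", "19-12-03"],
    "price1.2":    [0.1, 0.1, 0.1],
})

date_cols = df.columns[::2]    # even positions: dates
price_cols = df.columns[1::2]  # odd positions: prices

# One Series per (date, price) pair, indexed by that pair's own dates.
series = [df.set_index(d)[p] for d, p in zip(date_cols, price_cols)]

# Outer-join on the date index; a date missing from a pair becomes NaN.
aligned = pd.concat(series, axis=1)
aligned.index = pd.to_datetime(aligned.index, format="%y-%m-%d")
aligned = aligned.sort_index(ascending=False)  # newest first, as in the question

# Replace NaN with '?' for display only; the columns become object dtype.
result = aligned.astype(object).fillna("?")
print(result)
```

Keeping the NaN-based frame (aligned) separate from the '?' version is a design choice: once '?' is filled in, the price columns are no longer numeric. If a date that appears in no column at all (such as 19-11-23 in the desired output) should also be listed, one option is to call aligned.reindex(pd.date_range(aligned.index.min(), aligned.index.max(), freq="D")) before filling, though that adds every missing calendar day in the range, not only that one.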
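You need to be specific about what you want. Your columns have different dates, and then '?' suddenly appears among your prices. Please ask your question in detail and show us what you have tried.
For example: (a) split the data into separate Series (one per country) with the date as the index, (b) join them on the index, (c) fill the missing values with '?'. That should work.
Why do you have duplicated column names?
Oh, I'm completely new to coding. I really wanted columns without duplicates, but I couldn't manage it.
I more or less understand your logic: you want to resample between the minimum and maximum dates. But your data is wrong: it has errors and duplicated column names. Think the problem through and then edit the question.
Thank you very much. I'll check your answer soon. It's midnight here, so I'll try again tomorrow. Good night, jezrael. Thank you very much. See you tomorrow.
@MonoNeuElogy It means that some datetimes are duplicated, so you can change
dfs = [df.loc[:, x].set_index(x[0], drop=False)[x[1]] for x in zip(a, b)]
to
dfs = [df.loc[:, x].groupby(x[0])[x[1]].sum() for x in zip(a, b)]
to aggregate the duplicated datetimes with sum.
@MonoNeuElogy You can try changing
df.index = pd.to_datetime(df.index, format='%y-%m-%d')
to
df.index = pd.to_datetime(df.index, yearfirst=True)
Oh, I just read your comment and tried your code. It works well! Thank you very much!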
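The groupby variant suggested in the comments can be sketched on a small hypothetical frame in which one date column repeats a date. The data below is invented purely for illustration and is not taken from the question. concat generally cannot align Series whose indexes contain duplicates, so each pair is first collapsed to one row per date.

```python
import pandas as pd

# Hypothetical sample (not from the question): "A Country" lists 19-12-04 twice.
df = pd.DataFrame({
    "A Country": ["19-12-04", "19-12-04", "19-12-03"],
    "price1":    [1.0, 2.0, 1.5],
    "B Country": ["19-12-05", "19-12-04", "19-12-03"],
    "price2":    [2.6, 2.6, 2.7],
})

pairs = list(zip(df.columns[::2], df.columns[1::2]))

# groupby collapses repeated dates to one row each, so every Series has a
# unique index before concat aligns them on the union of dates.
series = [df.groupby(d)[p].sum() for d, p in pairs]

aligned = pd.concat(series, axis=1)
aligned.index = pd.to_datetime(aligned.index, format="%y-%m-%d")
aligned = aligned.sort_index()
print(aligned)
```

Here sum follows the comment's suggestion. For prices, another aggregation such as .mean() or .last() may be more appropriate, depending on what a repeated date means in the data.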