Python: create new columns from rows based on the values in the rows of an existing column

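I am using Python with pandas and have run into the following problem. I have a DataFrame with a large number of rows describing cryptocurrency data for each date. Once the last date is reached, a new time series starts for another cryptocurrency, so all the time series are stacked in the same columns. I am looking for a way to reshape the DataFrame so that, for each token_date, the data for all cryptocurrencies appears in a single row; the total number of rows would then equal the number of token_date values.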

Currently the df looks like this:

token_id    token_caption   token_date  token_price_usd token_marketcap_usd
64          WAN Wanchain    2019-06-24  0.3817          40414601.0
64          WAN Wanchain    2019-07-01  0.3644          38683920.0
64          WAN Wanchain    2019-07-08  0.3557          37759781.0
64          WAN Wanchain    2019-07-15  0.2625          27824362.0
64          WAN Wanchain    2019-07-22  0.2545          27036722.0
...
57          MAID            2017-07-24  0.3775          170824959.0
57          MAID            2017-07-31  0.2917          132012254.0
57          MAID            2017-08-07  0.3589          162410652.0
57          MAID            2017-08-14  0.3763          170283706.0
57          MAID            2017-08-21  0.4615          208873303.0
...
I am looking for code that performs this column split. It would be done about 100 times, resulting in roughly 201 columns.

I would appreciate any help. I am a beginner in Python and have no idea how to implement this.

Thanks, everyone!

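If you set the index to ['token_date', 'token_caption'] and unstack the caption so that it becomes a column level, you get a very clean MultiIndex column structure containing what you are looking for: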

In [144]: df
Out[144]:
   token_id token_caption  token_date  token_price_usd  token_marketcap_usd
0        64  WAN Wanchain  2019-06-24           0.3817           40414601.0
1        64  WAN Wanchain  2019-07-01           0.3644           38683920.0
2        64  WAN Wanchain  2019-07-08           0.3557           37759781.0
3        64  WAN Wanchain  2019-07-15           0.2625           27824362.0
4        64  WAN Wanchain  2019-07-22           0.2545           27036722.0
5        57          MAID  2019-06-24           0.3775          170824959.0
6        57          MAID  2019-07-01           0.2917          132012254.0
7        57          MAID  2019-07-08           0.3589          162410652.0
8        57          MAID  2019-07-15           0.3763          170283706.0
9        57          MAID  2019-07-22           0.4615          208873303.0

In [145]: df.set_index(["token_date", "token_caption"])[["token_price_usd", "token_marketcap_usd"]].unstack().swaplevel(axis=1)
Out[145]:
token_caption            MAID    WAN Wanchain                MAID        WAN Wanchain
              token_price_usd token_price_usd token_marketcap_usd token_marketcap_usd
token_date
2019-06-24             0.3775          0.3817         170824959.0          40414601.0
2019-07-01             0.2917          0.3644         132012254.0          38683920.0
2019-07-08             0.3589          0.3557         162410652.0          37759781.0
2019-07-15             0.3763          0.2625         170283706.0          27824362.0
2019-07-22             0.4615          0.2545         208873303.0          27036722.0
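
As a sketch of the same idea on a small copy of the sample data, the following also sorts the column index so that each token's price and market cap sit next to each other. The sort_index call and the variable name wide are additions for illustration, not part of the answer above.

```python
import pandas as pd

# Small copy of the sample data (two tokens sharing the same dates).
df = pd.DataFrame({
    "token_id": [64, 64, 57, 57],
    "token_caption": ["WAN Wanchain", "WAN Wanchain", "MAID", "MAID"],
    "token_date": ["2019-06-24", "2019-07-01", "2019-06-24", "2019-07-01"],
    "token_price_usd": [0.3817, 0.3644, 0.3775, 0.2917],
    "token_marketcap_usd": [40414601.0, 38683920.0, 170824959.0, 132012254.0],
})

wide = (
    df.set_index(["token_date", "token_caption"])[["token_price_usd", "token_marketcap_usd"]]
      .unstack()           # token_caption moves from the row index into the columns
      .swaplevel(axis=1)   # make token_caption the top column level
      .sort_index(axis=1)  # assumption: group each token's columns together
)

# Selecting one token returns its price and market cap per date.
maid = wide["MAID"]
print(wide)
```

With token_caption on the top level, wide["MAID"] returns just that token's columns, which may be convenient when there are around 100 tokens.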
Why not use pivot?

Given the data:

token_id    token_caption   token_date  token_price_usd token_marketcap_usd
64          WAN_Wanchain    2019-06-24  0.3817          40414601.0
64          WAN_Wanchain    2019-07-01  0.3644          38683920.0
64          WAN_Wanchain    2019-07-08  0.3557          37759781.0
64          WAN_Wanchain    2019-07-15  0.2625          27824362.0
64          WAN_Wanchain    2019-07-22  0.2545          27036722.0
57          MAID            2019-06-24  0.3775          170824959.0
57          MAID            2019-07-01  0.2917          132012254.0
57          MAID            2019-07-08  0.3589          162410652.0
57          MAID            2019-07-15  0.3763          170283706.0
57          MAID            2019-07-22  0.4615          208873303.0
Note: I repeated the dates so that there are some matches.

df.pivot(index="token_date", columns="token_caption", values=["token_price_usd", "token_marketcap_usd"])
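This gives one row per token_date, with the value names on the top column level and token_caption on the level below.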
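
One point worth checking before relying on pivot is whether every (token_date, token_caption) pair occurs only once. The sketch below, on a small copy of the sample data, shows pivot with keyword arguments and what happens when a row is duplicated. The duplicated row and the names dup, raised, and agg are made up for illustration.

```python
import pandas as pd

# Small copy of the sample data (two tokens sharing the same dates).
df = pd.DataFrame({
    "token_caption": ["WAN Wanchain", "WAN Wanchain", "MAID", "MAID"],
    "token_date": ["2019-06-24", "2019-07-01", "2019-06-24", "2019-07-01"],
    "token_price_usd": [0.3817, 0.3644, 0.3775, 0.2917],
    "token_marketcap_usd": [40414601.0, 38683920.0, 170824959.0, 132012254.0],
})
values = ["token_price_usd", "token_marketcap_usd"]

# Keyword arguments work across pandas versions.
wide = df.pivot(index="token_date", columns="token_caption", values=values)

# Hypothetical duplicate: repeat the first row so one (date, caption) pair occurs twice.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)

try:
    dup.pivot(index="token_date", columns="token_caption", values=values)
    raised = False
except ValueError:
    # pivot cannot reshape when an (index, columns) pair is not unique
    raised = True

# pivot_table accepts duplicates and aggregates them (mean by default).
agg = dup.pivot_table(index="token_date", columns="token_caption", values=values)
```

If duplicates are possible in the real data, pivot_table (or dropping duplicates first) is the safer choice; otherwise pivot has the advantage of failing loudly.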


I use pivot_table and construct new column names:

df=df.pivot_table(index="token_date",columns="token_caption",values=["token_price_usd","token_marketcap_usd"])

token_marketcap_usd              token_price_usd             
token_caption                MAID WAN Wanchain            MAID WAN Wanchain
token_date                                                                 
2017-07-24            170824959.0          NaN          0.3775          NaN
2017-07-31            132012254.0          NaN          0.2917          NaN
2017-08-07            162410652.0          NaN          0.3589          NaN
2017-08-14            170283706.0          NaN          0.3763          NaN
2017-08-21            208873303.0          NaN          0.4615          NaN
2019-06-24                    NaN   40414601.0             NaN       0.3817
2019-07-01                    NaN   38683920.0             NaN       0.3644
2019-07-08                    NaN   37759781.0             NaN       0.3557
2019-07-15                    NaN   27824362.0             NaN       0.2625
2019-07-22                    NaN   27036722.0             NaN       0.2545

df.columns=[ lev2+" - "+lev1.split("_")[1].title() for lev1,lev2 in df.columns]
df.reindex(sorted(df.columns.values,reverse=True) ,axis=1)

            WAN Wanchain - Price  WAN Wanchain - Marketcap  MAID - Price  MAID - Marketcap
token_date                                                                                
2017-07-24                   NaN                       NaN        0.3775       170824959.0
2017-07-31                   NaN                       NaN        0.2917       132012254.0
2017-08-07                   NaN                       NaN        0.3589       162410652.0
2017-08-14                   NaN                       NaN        0.3763       170283706.0
2017-08-21                   NaN                       NaN        0.4615       208873303.0
2019-06-24                0.3817                40414601.0           NaN               NaN
2019-07-01                0.3644                38683920.0           NaN               NaN
2019-07-08                0.3557                37759781.0           NaN               NaN
2019-07-15                0.2625                27824362.0           NaN               NaN
2019-07-22                0.2545                27036722.0           NaN               NaN
Finally, you can apply reset_index.
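
Put together, the pivot_table steps and the final reset_index might look like the sketch below. It uses a small copy of the question's data with non-overlapping dates; the names wide and flat are illustrative. Note that reindex returns a new DataFrame, so the result is assigned here.

```python
import pandas as pd

# Small copy of the question's data: the two tokens cover different dates.
df = pd.DataFrame({
    "token_caption": ["WAN Wanchain", "WAN Wanchain", "MAID", "MAID"],
    "token_date": ["2019-06-24", "2019-07-01", "2017-07-24", "2017-07-31"],
    "token_price_usd": [0.3817, 0.3644, 0.3775, 0.2917],
    "token_marketcap_usd": [40414601.0, 38683920.0, 170824959.0, 132012254.0],
})

wide = df.pivot_table(
    index="token_date",
    columns="token_caption",
    values=["token_price_usd", "token_marketcap_usd"],
)

# Flatten the two column levels into names such as "MAID - Price".
wide.columns = [
    f"{caption} - {value.split('_')[1].title()}" for value, caption in wide.columns
]

# reindex returns a new frame, so keep the result.
wide = wide.reindex(sorted(wide.columns, reverse=True), axis=1)

# Turn token_date back into an ordinary column.
flat = wide.reset_index()
print(flat)
```

Dates that exist for only one token produce NaN in the other token's columns, as in the output above.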

Output of the df.pivot call shown earlier:

              token_price_usd              token_marketcap_usd
token_caption            MAID WAN_Wanchain                MAID WAN_Wanchain
token_date                                                                 
2019-06-24             0.3775       0.3817         170824959.0   40414601.0
2019-07-01             0.2917       0.3644         132012254.0   38683920.0
2019-07-08             0.3589       0.3557         162410652.0   37759781.0
2019-07-15             0.3763       0.2625         170283706.0   27824362.0
2019-07-22             0.4615       0.2545         208873303.0   27036722.0