Python: copying columns within a DataFrame

python python-3.x python-2.7 pandas dataframe


I want to slice and copy columns in a pandas DataFrame. My DataFrame looks like this:

     1928  1928.1  1929  1929.1  1930  1930.1
 0    0     0       0     0       0     0
 1    1     3       3     2       2     2
 2    4     1       3     0       1     2

I want to turn it into this form:

     1928  1928.1  1929  1929.1  1930  1930.1
 0   0     0
 1   1     3
 2   4     1
 3   0     0
 4   3     2
 5   3     0
 6   0     0
 7   2     2
 8   1     2

In other words, I want to move the values in columns '1929', '1929.1', '1930', and '1930.1' underneath columns '1928' and '1928.1'.

To do this, I wrote the following code:

   [In]x=2
       y=2
       b=3
       c=x-1
       for a in range(0,2):
            df.iloc[b:(b+3),0:x]=df.iloc[0:3,x:(x+y)]
            x=x+2
            b=b+3
   [In] df
   [Out] 
     1928  1928.1  1929  1929.1  1930  1930.1
 0    0     0       0     0       0     0
 1    1     3       3     2       2     2
 2    4     1       3     0       1     2

Nothing is copied into the columns. How should I modify my code?
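One likely reason the loop has no effect: the frame has only three rows, so `df.iloc[3:6, ...]` selects an empty block and there is nothing to assign into. The sketch below keeps the in-place idea but first extends the index so the target rows exist, and assigns plain NumPy arrays so no label alignment gets in the way. The names `width`, `n_blocks`, and `out` are illustrative, and the sample frame assumes string column labels as in the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question (string column labels assumed)
df = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0], [1, 3, 3, 2, 2, 2], [4, 1, 3, 0, 1, 2]],
    columns=["1928", "1928.1", "1929", "1929.1", "1930", "1930.1"],
)

width = 2                         # columns per block (assumption)
n_rows = len(df)                  # rows per block
n_blocks = df.shape[1] // width   # number of blocks to stack

# Extend the index first so rows 3..8 actually exist (filled with NaN)
out = df.reindex(range(n_rows * n_blocks))

for k in range(1, n_blocks):
    # .to_numpy() drops the labels, so values are copied by position
    block = df.iloc[:, k * width:(k + 1) * width].to_numpy()
    out.iloc[k * n_rows:(k + 1) * n_rows, 0:width] = block

# Blank out the source columns, as in the desired table
out.iloc[:, width:] = np.nan

print(out)
```

Because the reindex introduces NaN, the columns become floating point; cast the first two back with `astype(int)` if integer output matters.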

If you are fine with creating a new DataFrame, simply concatenate the columns:

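import pandas as pd

df1 = df[['1928','1928.1']]
# Copy each remaining pair and relabel it so concat stacks it under the same two columns
df2 = df[['1929','1929.1']].copy()
df2.columns = ['1928','1928.1']
df3 = df[['1930','1930.1']].copy()
df3.columns = ['1928','1928.1']

# ignore_index=True renumbers the rows 0..8 instead of repeating 0..2
df = pd.concat([df1,df2,df3], ignore_index=True)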

I think this is the easiest approach to read. You can overwrite the original DataFrame and discard the intermediate ones.
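The same idea can be written as a loop so that it does not hard-code the column names. This is a minimal sketch, assuming the columns always come in consecutive blocks of equal width; `width`, `target`, `blocks`, and `stacked` are illustrative names, not part of the answer above.

```python
import pandas as pd

# Sample data from the question (string column labels assumed)
df = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0], [1, 3, 3, 2, 2, 2], [4, 1, 3, 0, 1, 2]],
    columns=["1928", "1928.1", "1929", "1929.1", "1930", "1930.1"],
)

width = 2                          # columns per block (assumption)
target = list(df.columns[:width])  # labels every block is stacked under

# Rebuild each block from its raw values with the target labels
blocks = [
    pd.DataFrame(df.iloc[:, start:start + width].to_numpy(), columns=target)
    for start in range(0, df.shape[1], width)
]

# Stack the blocks vertically and renumber the rows 0..8
stacked = pd.concat(blocks, ignore_index=True)
print(stacked)
```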

One approach is to use itertools.chain:

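from itertools import chain

import pandas as pd

cols = df.columns

# Flatten the even-positioned columns into cols[0] and the odd-positioned ones
# into cols[1], then join an empty frame to keep the remaining column labels.
res = pd.DataFrame({cols[0]: list(chain.from_iterable(df.iloc[:, ::2].T.values)),
                    cols[1]: list(chain.from_iterable(df.iloc[:, 1::2].T.values))})\
        .join(pd.DataFrame(columns=cols[2:]))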

print(res)

   1928  1928.1 1929 1929.1 1930 1930.1
0     0       0  NaN    NaN  NaN    NaN
1     1       3  NaN    NaN  NaN    NaN
2     4       1  NaN    NaN  NaN    NaN
3     0       0  NaN    NaN  NaN    NaN
4     3       2  NaN    NaN  NaN    NaN
5     3       0  NaN    NaN  NaN    NaN
6     0       0  NaN    NaN  NaN    NaN
7     2       2  NaN    NaN  NaN    NaN
8     1       2  NaN    NaN  NaN    NaN
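For comparison, the same result can probably be reached with a NumPy reshape instead of `itertools.chain`. This is a swapped-in technique, not the answer's own method, and it assumes every block has the same width; `width` and `stacked` are illustrative names.

```python
import pandas as pd

# Sample data from the question (string column labels assumed)
df = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0], [1, 3, 3, 2, 2, 2], [4, 1, 3, 0, 1, 2]],
    columns=["1928", "1928.1", "1929", "1929.1", "1930", "1930.1"],
)

width = 2  # columns per block (assumption)
n_rows, n_cols = df.shape

stacked = (
    df.to_numpy()
      .reshape(n_rows, n_cols // width, width)  # axes: (row, block, column in block)
      .transpose(1, 0, 2)                       # axes: (block, row, column in block)
      .reshape(-1, width)                       # blocks stacked on top of each other
)

# reindex adds the remaining labels back as empty (NaN) columns
res = pd.DataFrame(stacked, columns=df.columns[:width]).reindex(columns=df.columns)
print(res)
```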

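Setup (the definition of vals is missing from this copy; the line below is an assumption that reproduces the output shown):

vals = df.iloc[:, 2:].values.reshape(-1, 2)

Using numpy.vstack with append:

import numpy as np
import pandas as pd

# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
pd.concat([
    df[['1928', '1928.1']],
    pd.DataFrame(
        np.vstack([vals[::2], vals[1::2]]), columns = ['1928', '1928.1']
    )
])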

   1928  1928.1
0     0       0
1     1       3
2     4       1
0     0       0
1     3       2
2     3       0
3     0       0
4     2       2
5     1       2

Group the columns by the first four characters of their names:

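# Alternative: group with a key function
#def key(s):
#    return s[:4]
#gb = df.groupby(key, axis=1)
# Note: groupby(axis=1) is deprecated since pandas 2.1
gb = df.groupby(by=df.columns.str[:4], axis=1)

# Number of columns per group, and the labels of the first group
n_cols = len(df.columns) // len(gb)
col_names = df.iloc[:,:n_cols].columns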

For each group's DataFrame, rename the columns and concatenate. This produces a new DataFrame with only two columns:

# Map each group's labels onto the first group's labels, then stack the groups
dz = pd.concat(d.rename(columns=dict(zip(d.columns, col_names))) for g,d in gb)
dz.index = range(len(dz))

Works with more than six columns.
Relies on all groups having the same number of columns.
Relies on the columns being sorted by label.
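Since `groupby(axis=1)` is deprecated in recent pandas releases, the grouping step can also be sketched in plain Python. The version below is an assumption-laden alternative rather than the answer's method: it groups the column labels by their first four characters with a dictionary, and `groups`, `col_names`, and `frames` are illustrative names. It still relies on every group having the same number of columns.

```python
from collections import defaultdict

import pandas as pd

# Sample data from the question (string column labels assumed)
df = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0], [1, 3, 3, 2, 2, 2], [4, 1, 3, 0, 1, 2]],
    columns=["1928", "1928.1", "1929", "1929.1", "1930", "1930.1"],
)

# Collect the column labels that share the same four-character prefix;
# the dict keeps the groups in order of first appearance
groups = defaultdict(list)
for col in df.columns:
    groups[col[:4]].append(col)

# Labels of the first group become the labels of the result
col_names = next(iter(groups.values()))

# Rename each group's columns to the first group's labels, then stack
frames = [
    df[cols].rename(columns=dict(zip(cols, col_names)))
    for cols in groups.values()
]
dz = pd.concat(frames, ignore_index=True)
print(dz)
```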

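Have you tried any of the methods in …? I assume everything has to stay within the same DataFrame, which would probably rule out the concat, merge, or join options.
Does the order matter to you?
Yes, sir!!
Do you want to keep the empty columns?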


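# The same concatenation written as an explicit loop
frames = []
for g,d in gb:
    # Relabel the group's columns to match the first group
    d.columns = col_names
    frames.append(d)
dy = pd.concat(frames)
dy.index = range(len(dy))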