Python: copying columns within a DataFrame
I want to slice and copy columns in a pandas DataFrame. My DataFrame looks like this:
1928 1928.1 1929 1929.1 1930 1930.1
0 0 0 0 0 0 0
1 1 3 3 2 2 2
2 4 1 3 0 1 2
I want to turn it into this table:
1928 1928.1 1929 1929.1 1930 1930.1
0 0 0
1 1 3
2 4 1
3 0 0
4 3 2
5 3 0
6 0 0
7 2 2
8 1 2
In other words, I want to move the values in columns '1929', '1929.1', '1930' and '1930.1' underneath columns '1928' and '1928.1'.
To do that, I wrote the following code:
[In]
x = 2
y = 2
b = 3
c = x - 1
for a in range(0, 2):
    df.iloc[b:(b+3), 0:x] = df.iloc[0:3, x:(x+y)]
    x = x + 2
    b = b + 3
[In] df
[Out]
1928 1928.1 1929 1929.1 1930 1930.1
0 0 0 0 0 0 0
1 1 3 3 2 2 2
2 4 1 3 0 1 2
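The values are not copied into the columns. How can I modify my code?

If you are fine with using a new DataFrame, you can simply concatenate the columns: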
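import pandas as pd

# Take each pair of columns; .copy() avoids relabelling a view of df
df1 = df[['1928', '1928.1']]
df2 = df[['1929', '1929.1']].copy()
df2.columns = ['1928', '1928.1']
df3 = df[['1930', '1930.1']].copy()
df3.columns = ['1928', '1928.1']
# ignore_index=True renumbers the rows 0..8, as in the desired output
df = pd.concat([df1, df2, df3], ignore_index=True)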
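The concatenation idea can be sketched end to end on the sample data from the question. This is only an illustration, assuming the columns always come in adjacent pairs; the names target_cols, pairs, stacked and empty_target are made up for the example. It also shows one reason the loop in the question cannot work as written: iloc does not enlarge a frame, so rows 3 to 5 select nothing on a three-row DataFrame.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0],
     [1, 3, 3, 2, 2, 2],
     [4, 1, 3, 0, 1, 2]],
    columns=["1928", "1928.1", "1929", "1929.1", "1930", "1930.1"],
)

# Rows 3..5 do not exist yet, so this positional slice is empty
empty_target = df.iloc[3:6, 0:2]
print(empty_target.shape)

# Assumption: columns come in adjacent pairs that should all land under the first pair
target_cols = ["1928", "1928.1"]
pairs = [
    df.iloc[:, i:i + 2].set_axis(target_cols, axis=1)  # relabel each pair
    for i in range(0, df.shape[1], 2)
]

# Stack the pairs vertically and renumber the rows 0..8
stacked = pd.concat(pairs, ignore_index=True)
print(stacked)
```

Building the list of pairs in a comprehension avoids naming df1, df2, df3 by hand, which may help when there are more than three pairs.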
I think this is the most readable approach. You can overwrite the original DataFrame and discard the others.

One way is to use itertools.chain:
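from itertools import chain
import pandas as pd

cols = df.columns
# Chain the even-positioned columns into the first column and the odd-positioned ones into the second
res = pd.DataFrame({cols[0]: list(chain.from_iterable(df.iloc[:, ::2].T.values)),
                    cols[1]: list(chain.from_iterable(df.iloc[:, 1::2].T.values))})\
        .join(pd.DataFrame(columns=cols[2:]))
print(res)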
1928 1928.1 1929 1929.1 1930 1930.1
0 0 0 NaN NaN NaN NaN
1 1 3 NaN NaN NaN NaN
2 4 1 NaN NaN NaN NaN
3 0 0 NaN NaN NaN NaN
4 3 2 NaN NaN NaN NaN
5 3 0 NaN NaN NaN NaN
6 0 0 NaN NaN NaN NaN
7 2 2 NaN NaN NaN NaN
8 1 2 NaN NaN NaN NaN
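For comparison, the same layout can probably be reached without itertools by flattening the NumPy arrays in column-major order. This is a swapped-in technique, not the answer's own code; it assumes the sample data from the question, and the names left, right and res are illustrative. reindex is used here to bring back the remaining columns as empty (NaN) columns.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0],
     [1, 3, 3, 2, 2, 2],
     [4, 1, 3, 0, 1, 2]],
    columns=["1928", "1928.1", "1929", "1929.1", "1930", "1930.1"],
)
cols = df.columns

# order="F" reads the array column by column, i.e. one source column after another
left = df.iloc[:, ::2].to_numpy().ravel(order="F")
right = df.iloc[:, 1::2].to_numpy().ravel(order="F")

# Build the two filled columns, then add the other labels back as all-NaN columns
res = pd.DataFrame({cols[0]: left, cols[1]: right}).reindex(columns=cols)
print(res)
```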
Setup (vals is assumed to be defined beforehand), then numpy.vstack with append:
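import numpy as np
import pandas as pd

# Note: DataFrame.append was removed in pandas 2.0; pd.concat is the replacement there
df[['1928', '1928.1']].append(
    pd.DataFrame(
        np.vstack([vals[::2], vals[1::2]]), columns=['1928', '1928.1']
    )
)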
1928 1928.1
0 0 0
1 1 3
2 4 1
0 0 0
1 3 2
2 3 0
3 0 0
4 2 2
5 1 2
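A runnable sketch of the numpy.vstack idea for current pandas follows. Two things here are assumptions rather than part of the answer: the definition of vals (the remaining four columns reshaped into rows of two values, which reproduces the output shown above) and the use of pd.concat in place of DataFrame.append. The names extra and out are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0],
     [1, 3, 3, 2, 2, 2],
     [4, 1, 3, 0, 1, 2]],
    columns=["1928", "1928.1", "1929", "1929.1", "1930", "1930.1"],
)

# Assumption: vals holds the remaining columns as rows of two values each
vals = df.iloc[:, 2:].to_numpy().reshape(-1, 2)

# Even rows of vals come from the 1929 pair, odd rows from the 1930 pair
extra = pd.DataFrame(
    np.vstack([vals[::2], vals[1::2]]), columns=["1928", "1928.1"]
)

# pd.concat stands in for DataFrame.append here
out = pd.concat([df[["1928", "1928.1"]], extra])
print(out)
```

As in the output above, the index restarts at 0 for the appended rows; passing ignore_index=True to pd.concat would renumber them 0 to 8.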
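Group by the first four characters of the column names:
#def key(s):
#    return s[:4]
#gb = df.groupby(key, axis=1)
# Note: groupby(..., axis=1) is deprecated in recent pandas releases
gb = df.groupby(by=df.columns.str[:4], axis=1)
n_cols = len(df.columns) // len(gb)      # number of columns per group
col_names = df.iloc[:, :n_cols].columns  # labels of the first group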
For each group's DataFrame, rename the columns and concatenate. This produces a new DataFrame with only two columns:
dz = pd.concat(d.rename(columns=dict(zip(d.columns, col_names))) for g, d in gb)
dz.index = range(len(dz))
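This will work with more than six columns.
It relies on all groups having the same number of columns.
It relies on the columns being sorted by label.

Have you tried any of the methods in …? I guess everything lives in the same DataFrame, so the concat, merge, or join options may be ruled out.
Does the order matter to you?
Yes, sir!
Do you want to keep the empty columns?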
# The same result with an explicit loop instead of a generator expression
frames = []
for g, d in gb:
    d.columns = col_names
    frames.append(d)
dy = pd.concat(frames)
dy.index = range(len(dy))
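Because the groupby code above passes axis=1, which newer pandas releases deprecate, here is a hedged sketch of the same grouping done on the column labels in plain Python. It assumes string column labels and the sample data from the question; groups, frames and dz are illustrative names. Unlike the groupby version, it keeps the columns in their existing order instead of relying on sorted labels.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0],
     [1, 3, 3, 2, 2, 2],
     [4, 1, 3, 0, 1, 2]],
    columns=["1928", "1928.1", "1929", "1929.1", "1930", "1930.1"],
)

# Bucket the column labels by their first four characters, keeping their order
groups = {}
for col in df.columns:
    groups.setdefault(col[:4], []).append(col)

# Every group is relabelled with the labels of the first group
col_names = next(iter(groups.values()))
frames = [df[cols].set_axis(col_names, axis=1) for cols in groups.values()]

# Stack the groups and renumber the rows
dz = pd.concat(frames, ignore_index=True)
print(dz)
```

Like the original, this still depends on every group having the same number of columns.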