Python: multiply (repeat) DataFrame rows depending on a value in the row
I have a DataFrame like this:
import pandas as pd

df = pd.DataFrame({'col1': [69, 77, 88],
                   'col2': ['bar34', 'barf30', 'barfoo29'],
                   'col3': [4, 2, 5]})
print(df, '\n')
I need to multiply (repeat) each row according to the value in 'col3'. Expected output:
    col1      col2  col3
0     69     bar34     4
1     69     bar34     4
2     69     bar34     4
3     69     bar34     4
4     77    barf30     2
5     77    barf30     2
6     88  barfoo29     5
7     88  barfoo29     5
8     88  barfoo29     5
9     88  barfoo29     5
10    88  barfoo29     5
I have only one solution, but I am sure it is not efficient at all:
import pandas as pd

# Get the list of column names
cols = df.columns.to_list()
# Collect one mini DataFrame per original row
frames = []
for index, row in df.iterrows():
    # Build one copy of the row for each repetition requested in col3
    full_array = []
    for new_row in range(row['col3']):
        row_lst = [row[col_name] for col_name in cols]
        full_array.append(row_lst)
    # Create the mini DataFrame (cols, not the undefined name `columns`)
    mini_df = pd.DataFrame(full_array, columns=cols)
    frames.append(mini_df)
# Concatenate once at the end instead of dropping rows while iterating
df = pd.concat(frames, ignore_index=True)
print(df)
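So, here is a solution.
Actually, you do not need the set_index step of that solution:
df.loc[df.index.repeat(df['col3'])].reset_index(drop=True)
also works, because Index.repeat repeats the existing index labels.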
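To make the note above concrete, here is a minimal sketch that rebuilds the example frame from the question and shows what Index.repeat returns before the labels are passed to df.loc. The variable names repeated_labels and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [69, 77, 88],
                   'col2': ['bar34', 'barf30', 'barfoo29'],
                   'col3': [4, 2, 5]})

# Index.repeat repeats each existing label by the matching count in col3
repeated_labels = df.index.repeat(df['col3'])
print(list(repeated_labels))  # [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2]

# Selecting by the repeated labels duplicates the rows;
# reset_index(drop=True) renumbers them from 0 to 10
result = df.loc[repeated_labels].reset_index(drop=True)
print(result)
```

This relies on the frame still having its default, unique index, as it does in the question.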
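# Use col3 as the index (reindex needs these index values to be unique)
df = df.set_index(df.col3)
print(
    df.reindex(df.index.repeat(df.col3))
      .reset_index(drop=True)
)
# Suggested by @anky: df.loc with the repeated index selects the same rows
df.loc[df.index.repeat(df.col3)]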
    col1      col2  col3
0     69     bar34     4
1     69     bar34     4
2     69     bar34     4
3     69     bar34     4
4     77    barf30     2
5     77    barf30     2
6     88  barfoo29     5
7     88  barfoo29     5
8     88  barfoo29     5
9     88  barfoo29     5
10    88  barfoo29     5
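Both label-based variants depend on the index labels being unique. If that cannot be guaranteed, a positional variant may be safer. The following is only a sketch of that alternative, not part of the original answer: it uses np.repeat with iloc, and the duplicated labels 'a', 'b', 'a' are a made-up example to show the case where label-based selection would pick up extra rows.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-unique index (labels invented for illustration)
df = pd.DataFrame({'col1': [69, 77, 88],
                   'col2': ['bar34', 'barf30', 'barfoo29'],
                   'col3': [4, 2, 5]},
                  index=['a', 'b', 'a'])

# Repeat each row position (0, 1, 2) by the count stored in col3
positions = np.repeat(np.arange(len(df)), df['col3'].to_numpy())

# iloc selects by position, so duplicated labels do not matter
result = df.iloc[positions].reset_index(drop=True)
print(result)
```

Because the selection is positional, each row is repeated exactly col3 times regardless of what the index looks like.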