Python: how to conditionally transform a DataFrame column

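I have two columns I want to loop over, 'Volume_hedge' and 'Unit_hedge'. For each row, if the value in 'Unit_hedge' is 'Thousands of Barrels per Day', I want to divide the number in 'Volume_hedge' on that same row by 1000.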

I tried looping over the enumerated columns with an if statement. It works for the first two rows, but not for the rest.

df2 = DataFrame(x)
columns_to_select = ['Volume_hedge', 'Unit_hedge']
for i, row in enumerate(columns_to_select):
    if df2['Unit_hedge'].loc[i] == 'Thousands of Barrels per Day':
        new_row = df2['Volume_hedge'].loc[i] / 1000
    else:
        none
    df2['Volume_hedge'].loc[i] = new_row
print(df2[columns_to_select].loc[0:8])
Expected result:

  Volume_hedge                    Unit_hedge
0         0.03  Thousands of Barrels per Day
1        0.024  Thousands of Barrels per Day
2        0.024  Thousands of Barrels per Day
3        0.024  Thousands of Barrels per Day
4        0.024  Thousands of Barrels per Day
5        0.024  Thousands of Barrels per Day
6        0.024  Thousands of Barrels per Day
7     32850000                   (MMBtu/Bbl)
8      4404000                   (MMBtu/Bbl)
Actual result:

 Volume_hedge                    Unit_hedge
0         0.03  Thousands of Barrels per Day
1        0.024  Thousands of Barrels per Day
2           24  Thousands of Barrels per Day
3           24  Thousands of Barrels per Day
4           24  Thousands of Barrels per Day
5           24  Thousands of Barrels per Day
6           24  Thousands of Barrels per Day
7     32850000                   (MMBtu/Bbl)
8      4404000                   (MMBtu/Bbl)
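You should use a vectorized conditional update here rather than a loop.

It divides Volume_hedge by 1000 in every row where Unit_hedge equals 'Thousands of Barrels per Day' and leaves the other rows unchanged.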
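
A minimal sketch of such a vectorized update, assuming `numpy.where` is an acceptable tool for it. The small sample frame is invented for illustration and is not the asker's data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df2 comes from the asker's own source.
df2 = pd.DataFrame({
    'Volume_hedge': [30.0, 24.0, 32850000.0],
    'Unit_hedge': ['Thousands of Barrels per Day',
                   'Thousands of Barrels per Day',
                   '(MMBtu/Bbl)'],
})

# True for rows reported in thousands of barrels per day
mask = df2['Unit_hedge'] == 'Thousands of Barrels per Day'

# Where the mask is True take Volume_hedge / 1000, otherwise keep the value
df2['Volume_hedge'] = np.where(mask,
                               df2['Volume_hedge'] / 1000,
                               df2['Volume_hedge'])
```

Because the condition is evaluated for the whole column at once, no row index is needed and every matching row is rescaled.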


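This also has the advantage of not requiring an explicit loop, and it is faster because it uses pandas and numpy.

columns_to_select is a list of two elements. When you enumerate it, i goes from 0 to 1, so the function is applied only to the first two rows.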
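
To see why only the first two rows change, it may help to look at what enumerating the list from the question actually yields. This short check uses only the list itself, no DataFrame:

```python
columns_to_select = ['Volume_hedge', 'Unit_hedge']

# enumerate() walks the two-element list, not the DataFrame rows,
# so the loop body runs only for i = 0 and i = 1.
seen = [(i, name) for i, name in enumerate(columns_to_select)]
print(seen)  # [(0, 'Volume_hedge'), (1, 'Unit_hedge')]
```

The loop variable named `row` in the question is therefore a column name, and `i` never reaches row 2 or beyond.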

If you want to iterate over the rows, you should use the iterrows function instead, like this:

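for i, row in df2.iterrows():
    # Only rescale rows reported in thousands of barrels per day
    if row['Unit_hedge'] == 'Thousands of Barrels per Day':
        # Assign through .loc with the index label to avoid chained assignment
        df2.loc[i, 'Volume_hedge'] = row['Volume_hedge'] / 1000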

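However, using apply instead of looping over each row is the better choice, because iterating is very slow. Setting column values while iterating over a DataFrame is also discouraged.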
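
A rough sketch of the apply-based variant mentioned above. The helper name `rescale` and the sample data are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical sample data; the real df2 comes from the asker's own source.
df2 = pd.DataFrame({
    'Volume_hedge': [30.0, 24.0, 4404000.0],
    'Unit_hedge': ['Thousands of Barrels per Day',
                   'Thousands of Barrels per Day',
                   '(MMBtu/Bbl)'],
})

def rescale(row):
    """Return Volume_hedge divided by 1000 for the matching unit, else unchanged."""
    if row['Unit_hedge'] == 'Thousands of Barrels per Day':
        return row['Volume_hedge'] / 1000
    return row['Volume_hedge']

# axis=1 passes each row to rescale; the result replaces the whole column at once
df2['Volume_hedge'] = df2.apply(rescale, axis=1)
```

The column is assigned once, after all rows have been processed, so nothing is modified while the frame is being traversed.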

# Select the rows whose unit is thousands of barrels per day
mask = df2['Unit_hedge'] == 'Thousands of Barrels per Day'
# Divide Volume_hedge by 1000 on those rows only
df2.loc[mask, 'Volume_hedge'] = df2.loc[mask, 'Volume_hedge'] / 1000