Python: subtracting one column from another to create a new DataFrame column
My current DataFrame:
            Adj Close   High  high_shift  high_>_high
Date
2017-01-03  14.676315  15.65       14.70         True
2017-01-04  14.676315  15.68       15.65         True
2017-01-05  14.913031  15.91       15.68         True
2017-01-06  14.827814  15.92       15.91         True
2017-01-09  14.515349  15.60       15.92        False
2017-01-10  14.657379  15.68       15.60         True
2017-01-11  14.827814  15.68       15.68        False
2017-01-12  15.055059  16.25       15.68         True
2017-01-13  14.846750  15.95       16.25        False
2017-01-16  14.913031  15.75       15.95        False
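If the value in the High column is greater than the value in the high_shift column, I want to create a new column by subtracting the high_shift value from the Adj Close value in the same row and multiplying the result by 100.
For example:
if (df.High > df.high_shift):
    df['new_column'] = (df['Adj Close'] - df['high_shift'])*100
If the value in the High column is not greater than the value in the high_shift column, I want the new column to hold 0 in that row.
I am trying the following code, but it raises an error and I cannot even print the result: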
for i in df['high_>_high'], df['Close'], df['high_shift']:
    if df['high_>_high'][i]:
        (df['Close'][i] - df['high_shift'][i])*100
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
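I was able to create a column (high_>_high) that shows where High > high_shift, but I cannot use it as the condition for creating a new column by subtracting the other columns.
When working with numeric data in Pandas, it is best to avoid Python loops (for/while) and use Pandas' vectorized functions instead.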
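As a minimal sketch of why the loop fails and what "vectorized" means here, the snippet below uses three rows copied from the question's data. The variable names `mask`, `diff` and `raised` are illustrative only. A comparison between two columns yields a whole boolean Series, so `if` cannot reduce it to a single True or False, while arithmetic on columns is already applied row by row.

```python
import pandas as pd

# Three rows taken from the question's DataFrame
df = pd.DataFrame({
    "High": [15.65, 15.60, 15.68],
    "high_shift": [14.70, 15.92, 15.68],
})

# Comparing two columns is element-wise and returns a boolean Series
mask = df["High"] > df["high_shift"]
print(mask.tolist())  # [True, False, False]

# `if` needs one truth value, but the Series holds one per row
raised = False
try:
    if mask:
        pass
except ValueError:
    raised = True
print(raised)  # True

# Arithmetic on columns is element-wise as well, so no loop is needed
diff = (df["High"] - df["high_shift"]) * 100
```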
In this case you can use Series.clip, which moves any value that falls outside the given bounds onto the nearest bound.
df['new_column'] = ((df['Adj Close'] - df['high_shift']) * 100).clip(0)
# (.clip(0) could also go after the inner parentheses)
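To make the behaviour of clip concrete, here is a small sketch on a made-up Series (the values are illustrative and not taken from the question). Note that clip decides purely on the value itself: with a lower bound of 0, every negative result becomes 0 regardless of any other column.

```python
import pandas as pd

# Illustrative values only
s = pd.Series([-2.5, 0.0, 3.0, 7.5])

# Lower bound only: negatives are raised to 0, everything else is unchanged
lower_only = s.clip(lower=0)
print(lower_only.tolist())  # [0.0, 0.0, 3.0, 7.5]

# Lower and upper bound: values above 5 are pulled down to 5
both = s.clip(lower=0, upper=5)
print(both.tolist())  # [0.0, 0.0, 3.0, 5.0]
```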
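Alternatively, you can build the column first and clip it afterwards. Assigning the result back is safer than calling clip(0, inplace=True) on a selected column, which may not update df in recent Pandas versions:
df['new_column'] = (df['Adj Close'] - df['high_shift']) * 100
df['new_column'] = df['new_column'].clip(lower=0)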
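For cases more general than truncating values to a range, you can use boolean indexing on a Series (or DataFrame). (If you want to learn more, the Pandas indexing documentation covers the many kinds of indexing it provides.)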
df['new'] = (df['Adj Close'] - df['high_shift']) * 100
# Set every value below 0 in column 'new' to 0
df.loc[df['new'] < 0, 'new'] = 0
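Boolean indexing can also use the question's own condition (High > high_shift) rather than the sign of the result. The sketch below assumes three rows copied from the question's data; the name `cond` is illustrative. It fills the column with 0 first and then overwrites only the rows where the condition holds.

```python
import pandas as pd

# Three rows taken from the question's DataFrame
df = pd.DataFrame({
    "Adj Close": [14.676315, 14.515349, 14.827814],
    "High": [15.65, 15.60, 15.68],
    "high_shift": [14.70, 15.92, 15.68],
})

# Rows where High exceeds high_shift
cond = df["High"] > df["high_shift"]

# Default of 0 everywhere, then overwrite the selected rows only;
# the right-hand side is aligned to df by index
df["new"] = 0.0
df.loc[cond, "new"] = (df["Adj Close"] - df["high_shift"]) * 100
```

In the question's data Adj Close is below high_shift in every row, so the selected rows come out negative here, whereas clipping the same difference at 0 would give 0 everywhere. Which behaviour is wanted depends on the intent behind the column.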
Use numpy.where:
import numpy as np
df['new_column'] = np.where(df.High > df.high_shift, (df.High - df.high_shift) * 100, 0)
print(df)
Output:
         Date  Adj Close   High  high_shift  high_>_high  new_column
0  2017-01-03  14.676315  15.65       14.70         True        95.0
1  2017-01-04  14.676315  15.68       15.65         True         3.0
2  2017-01-05  14.913031  15.91       15.68         True        23.0
3  2017-01-06  14.827814  15.92       15.91         True         1.0
4  2017-01-09  14.515349  15.60       15.92        False         0.0
5  2017-01-10  14.657379  15.68       15.60         True         8.0
6  2017-01-11  14.827814  15.68       15.68        False         0.0
7  2017-01-12  15.055059  16.25       15.68         True        57.0
8  2017-01-13  14.846750  15.95       16.25        False         0.0
9  2017-01-16  14.913031  15.75       15.95        False         0.0
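A Pandas-only variant of the same conditional is Series.where, which keeps the values where the condition is True and substitutes the second argument elsewhere. The sketch below, on three rows copied from the question's data, compares the two; the names `cond`, `scaled`, `via_numpy` and `via_pandas` are illustrative.

```python
import numpy as np
import pandas as pd

# Three rows taken from the question's DataFrame
df = pd.DataFrame({
    "High": [15.65, 15.60, 15.68],
    "high_shift": [14.70, 15.92, 15.68],
})

cond = df["High"] > df["high_shift"]
scaled = (df["High"] - df["high_shift"]) * 100

# np.where returns a plain NumPy array
via_numpy = np.where(cond, scaled, 0)

# Series.where returns a Series that keeps the original index
via_pandas = scaled.where(cond, 0)

# Both give the same numbers
print(np.allclose(via_numpy, via_pandas.to_numpy()))  # True
```

Either result can be assigned to df['new_column']; the Series version may be preferable when the index matters in later steps.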