Python alternative to R's mutate
I want to convert some R code to Python. The R code is:
df %>% mutate(N = if_else(Interval != lead(Interval) | row_number() == n(), criteria/Count, NA_real_))
I wrote the following in Python:
import pandas as pd
import numpy as np
df = pd.read_table('Fd.csv', sep=',')
for i in range(1,len(df.Interval)-1):
    x = df.Interval[i]
    n = df.Interval[i+1]
    if x != n | x==df.Interval.tail().all():
        df['new']=(df.criteria/df.Count)
    else:
        df['new']='NaN'
df.to_csv (r'dataframe.csv', index = False, header=True)
However, the output contains only NaN.
Here is what the data looks like:
Interval  Count  criteria
0         0      0
0         1      0
0         2      0
0         3      0
1         4      1
1         5      2
1         6      3
1         7      4
2         8      1
2         9      2
3         10     3
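This is what I want to get (I also need to take the last row into account).
I would appreciate it if someone could help me find the error.

1. Indexing starts at 0
The first thing to note is that Python indexes from 0 (unlike R, which starts at 1). So you need to change the index range of your for loop.
2. Specify the row index
When you call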
df['new']=(df.criteria/df.Count)
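or
df['new']='NaN'
you are setting every value in the 'new' column at once. However, you only want to set the value in certain rows, so you need to specify the row.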
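A minimal sketch of the difference between assigning to the whole column and assigning to a single row. The three-row frame below is made up for illustration and is not the asker's data.

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
df = pd.DataFrame({"criteria": [1.0, 2.0, 3.0], "Count": [2, 4, 6]})

# Assigning to df["new"] overwrites the entire column on every call.
df["new"] = float("nan")

# Assigning through .loc with a row label touches a single cell.
df.loc[1, "new"] = df.loc[1, "criteria"] / df.loc[1, "Count"]

print(df)
```

Only row 1 receives a value here; the other rows keep the NaN set by the whole-column assignment.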
3. Working example
import pandas as pd
df = pd.DataFrame()
df["Interval"] = [0,0,0,0,1,1,1,1,2,2,3]
df["Count"] = [0,1,2,3,4,5,6,7,8,9,10]
df["criteria"] = [0,0,0,0,1,2,3,4,1,2,3]
# Start with a real missing value in every row (float NaN, not the string "NaN")
df["new"] = float("nan")
last_row = len(df.Interval) - 1
for row in range(0, len(df.Interval)):
    current_value = df.Interval[row]
    # Clamp the look-ahead index so the last row compares with itself
    next_value = df.Interval[min(row + 1, last_row)]
    # Fill the row when the interval changes or when it is the final row
    if (current_value != next_value) or (row == last_row):
        result = df.loc[row, 'criteria'] / df.loc[row, 'Count']
        df.loc[row, 'new'] = result
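As a possible alternative to the loop, the R expression can also be written without iterating, assuming pandas' shift(-1) is an acceptable stand-in for lead(). This is a sketch and not the answerer's method; the column name N follows the R code, and is_last_of_interval is a name chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Interval": [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3],
    "Count":    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "criteria": [0, 0, 0, 0, 1, 2, 3, 4, 1, 2, 3],
})

# shift(-1) plays the role of R's lead(): each row sees the next row's value.
# The last row has no successor, so shift(-1) yields NaN there, and
# "value != NaN" is True, which also covers the row_number() == n() case.
is_last_of_interval = df["Interval"] != df["Interval"].shift(-1)

# Keep the ratio where the mask is True and NaN elsewhere,
# similar to if_else(..., criteria/Count, NA_real_).
df["N"] = (df["criteria"] / df["Count"]).where(is_last_of_interval)

print(df)
```

If the last-row condition is spelled out explicitly, each comparison needs its own parentheses, for example (a != b) | (c == d), because | binds more tightly than != and == in Python.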
Thank you very much for your help. It worked!