Python replacement for R's mutate

python r

I want to convert some R code to Python. The R code is

df %>% mutate(N = if_else(Interval != lead(Interval) | row_number() == n(), criteria/Count, NA_real_))

I wrote the following in Python:

import pandas as pd
import numpy as np
df = pd.read_table('Fd.csv', sep=',')

for i in range(1,len(df.Interval)-1):
    x = df.Interval[i]
    n = df.Interval[i+1]
    if x != n | x==df.Interval.tail().all():
        df['new']=(df.criteria/df.Count)
    else:
        df['new']='NaN'
df.to_csv (r'dataframe.csv', index = False, header=True)

However, the output contains only NaN.

Here is what the data looks like

Interval  Count  criteria
0         0      0
0         1      0
0         2      0
0         3      0
1         4      1
1         5      2
1         6      3
1         7      4
2         8      1
2         9      2
3         10     3

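This is what I want to get (I also need to account for the last row)

I would appreciate it if someone could help me find the error.

1. Indexing starts at 0

The first thing to note is that Python indexes from 0 (unlike R, which starts at 1). Your loop, range(1, len(df.Interval)-1), therefore skips the first row and never reaches the last one, so you need to change the index range of the for loop.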
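As a small illustration of the point above, the sketch below lists which row positions each loop range visits. The row count of 11 is taken from the sample data in the question; the variable names are only illustrative.

```python
n_rows = 11  # number of rows in the sample data from the question

# Range used in the question: starts at 1 and stops before the last row.
original_rows = list(range(1, n_rows - 1))

# Zero-based range that visits every row, including the last one.
all_rows = list(range(n_rows))

print(original_rows)  # positions 1 through 9
print(all_rows)       # positions 0 through 10
```

The look-ahead to the next row (i + 1) still has to be guarded on the final row, which is why the working example further down clamps the index.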

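2. Specify the row index

When you call

df['new']=(df.criteria/df.Count)

or

df['new']='NaN'

you are setting every value in the 'new' column at once. However, you only want to set the value in certain rows, so you need to specify the row.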
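The difference between whole-column and single-cell assignment can be seen on a tiny frame. This is a minimal sketch with made-up three-row data (not the data from the question); it also suggests why the original loop ends up all NaN: every iteration overwrites the entire column, so only the branch taken in the final iteration survives.

```python
import pandas as pd

# Hypothetical three-row frame, used only to show the assignment behaviour.
df = pd.DataFrame({"criteria": [1, 2, 3], "Count": [4, 5, 6]})

# Assigning to df["new"] replaces the whole column in one go.
df["new"] = df["criteria"] / df["Count"]
whole_column = df["new"].tolist()  # one ratio per row

# Assigning a scalar also replaces every row.
df["new"] = float("nan")

# df.loc[row, column] changes a single cell and leaves the others alone.
df.loc[1, "new"] = df.loc[1, "criteria"] / df.loc[1, "Count"]

print(whole_column)
print(df)
```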

3. Working example

import pandas as pd

df = pd.DataFrame()
df["Interval"] = [0,0,0,0,1,1,1,1,2,2,3]
df["Count"] = [0,1,2,3,4,5,6,7,8,9,10]
df["criteria"] = [0,0,0,0,1,2,3,4,1,2,3]
# Start from a real missing value (float NaN) rather than the string "NaN"
df["new"] = float("nan")

last_row = len(df) - 1
for row in range(len(df)):
    current_value = df.loc[row, "Interval"]
    # Clamp the look-ahead so the last row is compared with itself
    next_value = df.loc[min(row + 1, last_row), "Interval"]
    # Fill in the ratio where the interval changes, and on the last row
    if (current_value != next_value) or (row == last_row):
        df.loc[row, 'new'] = df.loc[row, 'criteria'] / df.loc[row, 'Count']
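For comparison, the same result can probably be obtained without an explicit loop, which is closer in spirit to the R mutate call. The sketch below is one possible vectorized version, not the method from the answer: it assumes shift(-1) as a stand-in for R's lead() and names the output column N after the R code.

```python
import pandas as pd

df = pd.DataFrame({
    "Interval": [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3],
    "Count": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "criteria": [0, 0, 0, 0, 1, 2, 3, 4, 1, 2, 3],
})

# shift(-1) moves each value up one row, so every row sees the next row's
# Interval; the last row sees NaN.
next_interval = df["Interval"].shift(-1)

# Explicit flag for the last row, mirroring row_number() == n() in R.
is_last = df.index == df.index[-1]

# True where the interval changes on the next row, or on the last row.
is_boundary = (df["Interval"] != next_interval) | is_last

# Compute the ratio for all rows, then keep it only on boundary rows;
# where() fills the remaining rows with NaN.
df["N"] = (df["criteria"] / df["Count"]).where(is_boundary)

print(df)
```

With the sample data, only rows 3, 7, 9 and 10 receive a value; every other row stays NaN.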

Thank you very much for your help. It worked!
