How can I compute the difference between values on two different rows using Python or PySpark?


Hi all,

I have a list object in Python:

[[10, 542.5354710621561],
 [200, 11.802396794545745],
 [700, 1.561175174358397],
 [2000, 0.20926429043267342],
 [10, 1107.0197845783787],
 [200, 24.2886201681616],
 [700, 3.1771001799962972],
 [2000, 0.4405625905369205]]
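What I need to do is loop over the list group by group and compute the differences 200 - 10 and 11.802 - 542.535. Then compute 700 - 200 and 1.561 - 11.802. Finally, compute 2000 - 700 and 0.209 - 1.561. I have to do the same for the second group.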

My expected output is:

{ID_10000_10_200: float_value_here, ID_10000_200_700: float_value_here, ID_10000_700_2000: float_value_here, ID_20000_10_200: float_value_here, ID_20000_200_700: float_value_here, ID_20000_700_2000: float_value_here}
Can you suggest the best way to achieve this? Thanks.

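We can use `range()` to iterate over the list and compute each value from the row at the previous index. Since you didn't specify which value you want to store, I computed both differences and stored them as a tuple in the final result.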

data = [[10000, 10, 549.7374891412558], [10000, 200, 11.606709354357797], [10000, 700, 1.6392354197665262], [10000, 2000, 0.2042362064342665], [20000, 10, 1361.9743632614627], [20000, 200, 22.664201537351765], [20000, 700, 3.0681569263316266], [20000, 2000, 0.5177459808387871]]

result = {}

for i in range(1, len(data)):
    # Skip pairs that straddle two groups (e.g. 10000/2000 followed by 20000/10)
    if data[i][0] != data[i - 1][0]:
        continue
    d_id = f'ID_{data[i][0]}_{data[i - 1][1]}_{data[i][1]}'
    # (difference of the second column, difference of the third column)
    calculations = (data[i][1] - data[i - 1][1], data[i][2] - data[i - 1][2])
    result[d_id] = calculations
Output:

{'ID_10000_10_200': (190, -538.130779786898), 'ID_10000_200_700': (500, -9.96747393459127), 'ID_10000_700_2000': (1300, -1.4349992133322598), 'ID_20000_10_200': (190, -1339.310161724111), 'ID_20000_200_700': (500, -19.59604461102014), 'ID_20000_700_2000': (1300, -2.5504109454928394)}
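If only the value difference is wanted (one float per key, as in the expected output in the question), the same idea can be written with `itertools.groupby`. This is a minimal sketch, not the answer's original code: the helper name `pairwise_value_diffs` is made up for illustration, and it assumes the rows are already ordered so that each group's rows are adjacent.

```python
from itertools import groupby
from operator import itemgetter

# Same rows as above: [group_id, x, value]
data = [
    [10000, 10, 549.7374891412558],
    [10000, 200, 11.606709354357797],
    [10000, 700, 1.6392354197665262],
    [10000, 2000, 0.2042362064342665],
    [20000, 10, 1361.9743632614627],
    [20000, 200, 22.664201537351765],
    [20000, 700, 3.0681569263316266],
    [20000, 2000, 0.5177459808387871],
]


def pairwise_value_diffs(rows):
    """Map 'ID_<group>_<previous x>_<x>' to (value - previous value) within each group."""
    result = {}
    # groupby only merges adjacent rows, so rows must already be ordered by group ID
    for group_id, group_rows in groupby(rows, key=itemgetter(0)):
        group_rows = list(group_rows)
        # zip pairs every row with the row that follows it in the same group
        for (_, x_prev, v_prev), (_, x_cur, v_cur) in zip(group_rows, group_rows[1:]):
            result[f"ID_{group_id}_{x_prev}_{x_cur}"] = v_cur - v_prev
    return result


diffs = pairwise_value_diffs(data)
print(diffs)
```

Because pairs are only formed inside a group, no key can span the boundary between group 10000 and group 20000.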
You can try the following approach:

nums_list = [[10, 542.5354710621561],
             [200, 11.802396794545745],
             [700, 1.561175174358397],
             [2000, 0.20926429043267342],
             [10, 1107.0197845783787],
             [200, 24.2886201681616],
             [700, 3.1771001799962972],
             [2000, 0.4405625905369205]]

nums_list = [nums_list[i:i + 4] for i in range(0, len(nums_list), 4)]
final_diffs_object = {}
for y in range(len(nums_list)):
    for x in range(1, len(nums_list[y])):
        diff = nums_list[y][x][0] - nums_list[y][x - 1][0]
        diff2 = nums_list[y][x][1] - nums_list[y][x - 1][1]
        final_diff = diff - diff2
        print(final_diff)
        final_diffs_object["ID" + "_" + str((y+1) * 10000) + "_" + str(nums_list[y][x - 1][0]) + "_" + str(nums_list[y][x][0])] = final_diff
print(final_diffs_object)
Sample output:

{'ID_10000_10_200': 720.7330742676104, 'ID_10000_200_700': 510.24122162018733, 'ID_10000_700_2000': 1301.3519108839257, 'ID_20000_10_200': 1272.7311644102172, 'ID_20000_200_700': 521.1115199881654, 'ID_20000_700_2000': 1302.7365375894594}
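The code above assumes every group has exactly four rows. If that is not guaranteed, one option is to start a new group whenever the first column stops increasing. The sketch below does that; `split_on_reset` is a hypothetical helper name, and unlike the code above it stores only the difference of the second column (current minus previous), which is an assumption about what the question is after.

```python
def split_on_reset(rows):
    """Split rows into groups, starting a new group whenever the first column stops increasing."""
    groups = []
    for row in rows:
        if not groups or row[0] <= groups[-1][-1][0]:
            groups.append([])
        groups[-1].append(row)
    return groups


nums_list = [[10, 542.5354710621561],
             [200, 11.802396794545745],
             [700, 1.561175174358397],
             [2000, 0.20926429043267342],
             [10, 1107.0197845783787],
             [200, 24.2886201681616],
             [700, 3.1771001799962972],
             [2000, 0.4405625905369205]]

groups = split_on_reset(nums_list)

value_diffs = {}
# Group IDs are derived from position (10000, 20000, ...), as in the code above
for g, rows in enumerate(groups, start=1):
    for prev, cur in zip(rows, rows[1:]):
        value_diffs[f"ID_{g * 10000}_{prev[0]}_{cur[0]}"] = cur[1] - prev[1]

print(value_diffs)
```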

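Is ID_10000_10_200 a placeholder for 190 (200 - 10)? How are the ID_10000 and ID_20000 parts defined, and do these ID numbers matter?

Yes, the numbers 10000 and 20000 matter, but I can add them to the list: `[[10000, 10, 549.7374891412558], [10000, 200, 11.606709354357797], [10000, 700, 1.6392354197665262], [10000, 2000, 0.2042362064342665], [20000, 10, 1361.9743632614627], [20000, 200, 22.664201537351765], [20000, 700, 3.0681569263316266], [20000, 2000, 0.51774598083871]]`. And no, ID_10000_10_200 would be a placeholder for the result of the formula `d = 2 * ((log(200) - log(10)) / (log(542.535) - log(11.802)))`.

So you have already identified the steps to perform; where have you tried to code them?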
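For reference, the formula from the comment can be applied to each pair of adjacent rows in the same way. This is only a sketch: the function name `log_ratio` is made up here, the rows are the first group from the question, and the natural logarithm is used because the comment does not name a base (the ratio of two log differences is the same in any base).

```python
import math


def log_ratio(x_prev, v_prev, x_cur, v_cur):
    """d = 2 * (log(x_cur) - log(x_prev)) / (log(v_prev) - log(v_cur))."""
    # Raises ZeroDivisionError if v_prev == v_cur; values must be positive
    return 2 * (math.log(x_cur) - math.log(x_prev)) / (math.log(v_prev) - math.log(v_cur))


# The worked example from the comment
d = log_ratio(10, 542.535, 200, 11.802)
print(d)

# Applied to every adjacent pair of one group (first group from the question)
rows = [[10, 542.5354710621561],
        [200, 11.802396794545745],
        [700, 1.561175174358397],
        [2000, 0.20926429043267342]]

d_values = {
    f"ID_10000_{prev[0]}_{cur[0]}": log_ratio(prev[0], prev[1], cur[0], cur[1])
    for prev, cur in zip(rows, rows[1:])
}
print(d_values)
```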