Calculating the difference between values on two different rows using Python or PySpark
Hi all, I have a list-type object in Python:
[[10, 542.5354710621561],
[200, 11.802396794545745],
[700, 1.561175174358397],
[2000, 0.20926429043267342],
[10, 1107.0197845783787],
[200, 24.2886201681616],
[700, 3.1771001799962972],
[2000, 0.4405625905369205]]
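What I need to do is loop over the list group by group and compute the differences 200 - 10 and 11.802 - 542.535, then 700 - 200 and 1.561 - 11.802, and finally 2000 - 700 and 0.209 - 1.561. I have to do the same for the second group.
My expected output is: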
{ID_10000_10_200: float_value_here, ID_10000_200_700: float_value_here, ID_10000_700_2000: float_value_here, ID_20000_10_200: float_value_here, ID_20000_200_700: float_value_here, ID_20000_700_2000: float_value_here}
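Can you suggest the best way to achieve this? Thanks.

We can use range() to iterate over the list and compute each value relative to the row at the previous index. Since you did not specify which value should be stored, I computed both differences and stored them as a tuple in the final result.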
data = [[10000, 10, 549.7374891412558], [10000, 200, 11.606709354357797], [10000, 700, 1.6392354197665262], [10000, 2000, 0.2042362064342665], [20000, 10, 1361.9743632614627], [20000, 200, 22.664201537351765], [20000, 700, 3.0681569263316266], [20000, 2000, 0.5177459808387871]]
result = {}
for i in range(1, len(data)):
    # Skip pairs that straddle two groups (the ID in column 0 differs)
    if data[i][0] != data[i - 1][0]:
        continue
    d_id = f'ID_{data[i][0]}_{data[i - 1][1]}_{data[i][1]}'
    # (difference of the second column, difference of the third column)
    calculations = (data[i][1] - data[i - 1][1], data[i][2] - data[i - 1][2])
    result[d_id] = calculations
print(result)
Output:
{'ID_10000_10_200': (190, -538.130779786898), 'ID_10000_200_700': (500, -9.96747393459127), 'ID_10000_700_2000': (1300, -1.4349992133322598), 'ID_20000_10_200': (190, -1339.310161724111), 'ID_20000_200_700': (500, -19.59604461102014), 'ID_20000_700_2000': (1300, -2.5504109454928394)}
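As a possible variant of the loop above, the rows can be grouped by their ID first so that pairs are only ever formed inside one group. This is a minimal sketch, assuming the rows of each group are already adjacent and in order (which `itertools.groupby` requires); the tuple layout of the values follows the answer above.

```python
from itertools import groupby
from operator import itemgetter

data = [[10000, 10, 549.7374891412558], [10000, 200, 11.606709354357797],
        [10000, 700, 1.6392354197665262], [10000, 2000, 0.2042362064342665],
        [20000, 10, 1361.9743632614627], [20000, 200, 22.664201537351765],
        [20000, 700, 3.0681569263316266], [20000, 2000, 0.5177459808387871]]

result = {}
# groupby yields one run of consecutive rows per ID (column 0)
for group_id, rows in groupby(data, key=itemgetter(0)):
    rows = list(rows)
    # zip(rows, rows[1:]) pairs each row with the row that follows it
    for prev, curr in zip(rows, rows[1:]):
        key = f"ID_{group_id}_{prev[1]}_{curr[1]}"
        result[key] = (curr[1] - prev[1], curr[2] - prev[2])

print(result)
```

Because pairs are built per group, no key can combine the last row of one group with the first row of the next.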
You can try the following approach:
nums_list = [[10, 542.5354710621561],
             [200, 11.802396794545745],
             [700, 1.561175174358397],
             [2000, 0.20926429043267342],
             [10, 1107.0197845783787],
             [200, 24.2886201681616],
             [700, 3.1771001799962972],
             [2000, 0.4405625905369205]]
# Split the flat list into groups of four rows each
nums_list = [nums_list[i:i + 4] for i in range(0, len(nums_list), 4)]
final_diffs_object = {}
for y in range(len(nums_list)):
    for x in range(1, len(nums_list[y])):
        # Difference of the first column and of the second column
        diff = nums_list[y][x][0] - nums_list[y][x - 1][0]
        diff2 = nums_list[y][x][1] - nums_list[y][x - 1][1]
        final_diff = diff - diff2
        print(final_diff)
        # Group IDs are assumed to be 10000, 20000, ... in order
        key = f"ID_{(y + 1) * 10000}_{nums_list[y][x - 1][0]}_{nums_list[y][x][0]}"
        final_diffs_object[key] = final_diff
print(final_diffs_object)
Sample output:
{'ID_10000_10_200': 720.7330742676104, 'ID_10000_200_700': 510.24122162018733, 'ID_10000_700_2000': 1301.3519108839257, 'ID_20000_10_200': 1272.7311644102172, 'ID_20000_200_700': 521.1115199881654, 'ID_20000_700_2000': 1302.7365375894594}
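The approach above assumes every group has exactly four rows. If group sizes can vary, one option is to start a new group whenever the first column stops increasing. The sketch below illustrates that idea; the helper name `split_on_reset` and the "reset" rule are assumptions made for illustration, not part of the original answer.

```python
def split_on_reset(rows):
    """Split rows into groups, starting a new group whenever the first
    column does not increase relative to the previous row (assumed rule)."""
    groups = []
    for row in rows:
        # Open a new group at the start or when the first column resets
        if not groups or row[0] <= groups[-1][-1][0]:
            groups.append([])
        groups[-1].append(row)
    return groups


nums_list = [[10, 542.5354710621561],
             [200, 11.802396794545745],
             [700, 1.561175174358397],
             [2000, 0.20926429043267342],
             [10, 1107.0197845783787],
             [200, 24.2886201681616],
             [700, 3.1771001799962972],
             [2000, 0.4405625905369205]]

groups = split_on_reset(nums_list)
print(len(groups), [len(g) for g in groups])
```

The resulting `groups` list has the same shape as the chunked `nums_list` in the answer, so the same nested loop can run over it unchanged.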
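Is ID_10000_10_200 a placeholder for 190 (200 - 10)? How are the ID_10000 and ID_20000 parts defined, and do these ID numbers matter?
Yes, the numbers 10000 and 20000 matter, but I can add them to the list: `[[10000, 10, 549.7374891412558], [10000, 200, 11.606709354357797], [10000, 700, 1.6392354197665262], [10000, 2000, 0.2042362064342665], [20000, 10, 1361.9743632614627], [20000, 200, 22.664201537351765], [20000, 700, 3.0681569263316266], [20000, 2000, 0.5177459808387871]]`. And no, ID_10000_10_200 would be a placeholder for the result of a formula, for example `d = 2 * ((log(200) - log(10)) / (log(542.535) - log(11.802)))`.
So you have already identified the steps to perform; where have you tried coding them?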
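The formula mentioned in the comments could be applied per pair of consecutive rows along the following lines. This is a sketch only: it assumes the three-column list with the ID in the first column, assumes `log` means the natural logarithm (the ratio is the same for any base), and uses a hypothetical helper name `log_ratio`.

```python
import math
from itertools import groupby
from operator import itemgetter

data = [[10000, 10, 549.7374891412558], [10000, 200, 11.606709354357797],
        [10000, 700, 1.6392354197665262], [10000, 2000, 0.2042362064342665],
        [20000, 10, 1361.9743632614627], [20000, 200, 22.664201537351765],
        [20000, 700, 3.0681569263316266], [20000, 2000, 0.5177459808387871]]


def log_ratio(x1, y1, x2, y2):
    """d = 2 * (log(x2) - log(x1)) / (log(y1) - log(y2)), as in the comment."""
    return 2 * (math.log(x2) - math.log(x1)) / (math.log(y1) - math.log(y2))


result = {}
for group_id, rows in groupby(data, key=itemgetter(0)):
    rows = list(rows)
    # Pair each row with its successor inside the same group
    for (_, x1, y1), (_, x2, y2) in zip(rows, rows[1:]):
        result[f"ID_{group_id}_{x1}_{x2}"] = log_ratio(x1, y1, x2, y2)

print(result)
```

Note that the helper raises `ZeroDivisionError` if two consecutive rows share the same third-column value, so real data may need a guard for that case.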