Python: creating a new DataFrame after applying a complex function

python pandas numpy dataframe

I have the following trajectory data in a DataFrame df:

   vid        points
0   0        [[2,4], [5,6], [8,9]]
1   1        [[10,11], [12,13], [14,15]]
2   2        [[1,2], [3,4], [8,1]]
3   3        [[21,10], [8,8], [4,3]]
4   4        [[15,2], [16,1], [17,3]]

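Each trajectory is a list of points and is identified by vid.

I have a function that computes the distance between two trajectories. Call it method_dist(x, y), where x and y are two trajectories.

This is how the function is called: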

x = df.iloc[0]["points"].tolist()
y = df.iloc[3]["points"].tolist()

method_dist(x, y)

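Here method_dist computes the distance between the trajectories at index 0 and index 3 (row index, not vid).

Since my df has 100 rows, I would like to automate this process if possible.

If I give a list of indices [0, 1, 3], I want a function or loop that computes the distance between the trajectories at index 0 and index 1, then 0 and 3, then 1 and 3, until the distance for every pair has been computed. I want to store the distances in df2 as shown below.

Note that we never compute distances between individual points. Each cell under "points" is one complete trajectory, and method_dist computes the distance between whole trajectories.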

     traj1_idx       traj2_idx        distance
  0    0             1                some_val
  1    0             3                some_val
  2    1             3                some_val

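Alternatively, even if I have to compute the distance for one pair at a time by hand, I would like to build a new df to which the computed distance and the trajectory pair are appended each time I process two trajectories: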

     traj1_idx       traj2_idx        distance
  0    0             1                some_val
  1    0             3                some_val
  2    1             3                some_val

Please let me know how to get the expected result, or whether I need to change anything.

Thanks

Make a custom class in which you define subtraction as method_dist

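import numpy as np

def method_dist(x, y):
    # placeholder distance (element-wise absolute difference);
    # substitute the real trajectory distance here
    return abs(x - y)

class Trajectory(object):
    def __init__(self, a):
        # store the list of points as an (n_points, 2) array
        self.data = np.asarray(a)

    def __sub__(self, other):
        # `t1 - t2` now means method_dist(t1, t2)
        return method_dist(self.data, other.data)

    def __repr__(self):
        return '☺ {}'.format(self.data.shape)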

Then create a Series of these objects

s = df.points.apply(Trajectory)
s

0    ☺ (3, 2)
1    ☺ (3, 2)
2    ☺ (3, 2)
3    ☺ (3, 2)
4    ☺ (3, 2)
Name: points, dtype: object

Define a convenience function that handles the different pairwise combinations automatically

def get_combo_diffs(a, idx):
    """`a` is an array of Trajectory objects.  The return
    statement shows a slice of `a` minus another slice of `a`.
    numpy will execute the underlying objects __sub__ method
    for each pair and return an array of the results."""

    # this bit just finds all combinations of 2 at a time from `idx`
    idx = np.asarray(idx)
    n = idx.size
    i, j = np.triu_indices(n, 1)

    return a[idx[i]] - a[idx[j]]
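
To see what the np.triu_indices step contributes, here is a minimal sketch that runs only the pairing logic on the index list [0, 1, 3] from the question. The variable name pairs is introduced here for illustration and is not part of the answer's code.

```python
import numpy as np

idx = np.asarray([0, 1, 3])

# Row and column positions of the strict upper triangle of an n x n grid
# (k=1 skips the diagonal), i.e. every position pair (i, j) with i < j.
i, j = np.triu_indices(idx.size, 1)

# Map the positions back to the requested row indices.
pairs = list(zip(idx[i].tolist(), idx[j].tolist()))
print(pairs)  # [(0, 1), (0, 3), (1, 3)]
```

Each pair appears exactly once and no trajectory is paired with itself, which matches the order requested in the question.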

Then use it

get_combo_diffs(s.values, [0, 1, 3])

array([array([[8, 7],
       [7, 7],
       [6, 6]]),
       array([[19,  6],
       [ 3,  2],
       [ 4,  6]]),
       array([[11,  1],
       [ 4,  5],
       [10, 12]])], dtype=object)
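
To get from this array to the df2 layout asked for in the question, the index pairs can be returned alongside the results. The sketch below is one possible way to do that, under two assumptions: the function name combo_frame is made up here, and method_dist is replaced by a hypothetical scalar distance (the mean point-wise Euclidean distance) so that each pair yields a single number. Swap in the real method_dist.

```python
import numpy as np
import pandas as pd

def method_dist(x, y):
    # Hypothetical scalar distance (mean point-wise Euclidean distance),
    # used only so the example runs; replace with the real method_dist.
    return float(np.linalg.norm(x - y, axis=1).mean())

class Trajectory:
    def __init__(self, a):
        self.data = np.asarray(a)

    def __sub__(self, other):
        return method_dist(self.data, other.data)

df = pd.DataFrame({
    "vid": [0, 1, 2, 3, 4],
    "points": [
        [[2, 4], [5, 6], [8, 9]],
        [[10, 11], [12, 13], [14, 15]],
        [[1, 2], [3, 4], [8, 1]],
        [[21, 10], [8, 8], [4, 3]],
        [[15, 2], [16, 1], [17, 3]],
    ],
})

def combo_frame(a, idx):
    """Return one row per pair from `idx` with the pair's distance."""
    idx = np.asarray(idx)
    i, j = np.triu_indices(idx.size, 1)
    # Object-array subtraction calls Trajectory.__sub__ for each pair.
    dist = a[idx[i]] - a[idx[j]]
    return pd.DataFrame({
        "traj1_idx": idx[i],
        "traj2_idx": idx[j],
        "distance": dist.astype(float),
    })

s = df["points"].apply(Trajectory)
df2 = combo_frame(s.values, [0, 1, 3])
print(df2)
```

With the placeholder method_dist from the answer, which returns an array per pair, the final astype(float) would not apply; the column would have to hold the arrays as objects instead.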

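The first element of get_combo_diffs(s.values, [0, 1, 3]) is the result of applying method_dist to the trajectories at index 0 and index 1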

first = np.array([[2, 4], [5, 6], [8, 9]])
second = np.array([[10, 11], [12, 13], [14, 15]])

method_dist(first, second)

array([[8, 7],
       [7, 7],
       [6, 6]])

Or equivalently

x, y = s.loc[0], s.loc[1]
x - y

array([[8, 7],
       [7, 7],
       [6, 6]])
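
If the custom class feels like too much machinery, a plain loop gives the same pairs. The sketch below is an alternative to the approach above, not part of it: it uses itertools.combinations, collects one row per pair in a list, and builds df2 once at the end, which tends to be cheaper than growing a DataFrame row by row. The helper name pairwise_distances and the scalar method_dist (mean point-wise Euclidean distance) are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np
import pandas as pd

def method_dist(x, y):
    # Hypothetical stand-in for the real trajectory distance.
    x, y = np.asarray(x), np.asarray(y)
    return float(np.linalg.norm(x - y, axis=1).mean())

df = pd.DataFrame({
    "vid": [0, 1, 2, 3, 4],
    "points": [
        [[2, 4], [5, 6], [8, 9]],
        [[10, 11], [12, 13], [14, 15]],
        [[1, 2], [3, 4], [8, 1]],
        [[21, 10], [8, 8], [4, 3]],
        [[15, 2], [16, 1], [17, 3]],
    ],
})

def pairwise_distances(df, idx, dist=method_dist):
    """Compute dist for every pair of positional row indices in `idx`."""
    rows = []
    for i, j in combinations(idx, 2):
        # iloc selects by position, matching "index, not vid".
        d = dist(df["points"].iloc[i], df["points"].iloc[j])
        rows.append({"traj1_idx": i, "traj2_idx": j, "distance": d})
    return pd.DataFrame(rows, columns=["traj1_idx", "traj2_idx", "distance"])

df2 = pairwise_distances(df, [0, 1, 3])
print(df2)
```

Passing range(len(df)) as idx would cover all pairs of rows in the frame.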

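I was making up something fun.

Thanks, 唐恩, I wish I had your brain. I don't understand what the code is doing. Are you computing the distance between each pair of points within a row?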