Computing the 3D gradient of unevenly spaced points in Python


Currently I have a volume made up of millions of unevenly spaced particles, each of which carries a property (a potential, for the curious), and I want to calculate the local force (acceleration) from it.

np.gradient only works with evenly spaced data; I looked here: interpolation seems to be necessary, but I could not find a 3D spline implementation in NumPy.

Some code that produces representative data:

import numpy as np    
from scipy.spatial import cKDTree

x = np.random.uniform(-10, 10, 10000)
y = np.random.uniform(-10, 10, 10000)
z = np.random.uniform(-10, 10, 10000)
phi = np.random.uniform(-10**9, 0, 10000)

kdtree = cKDTree(np.c_[x,y,z])
_, index = kdtree.query([0,0,0], 32) #find 32 nearest particles to the origin
#find the gradient at (0,0,0) by considering the 32 nearest particles?  
(My question is very similar to this one, but that one never seemed to get a solution, so I thought I would ask again.)


Any help would be greatly appreciated.

Intuitively, to take the derivative with respect to one data point, I would do the following:

  • Slice out the data surrounding the point:
    data = phi[x_id-1:x_id+2, y_id-1:y_id+2, z_id-1:z_id+2]
    The kdtree-based approach looks very good; of course you can also use it to select a subset of the data.
  • Fit a 3D polynomial. Define the point in the middle of the slice as the center, compute the offsets to the other points, and pass those offsets as coordinates to the polynomial fit.
  • Differentiate the polynomial at your position.
This would be a simple way to solve your problem. However, it will probably be very slow.
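
A minimal sketch of this idea (assuming the x, y, z, phi and kdtree variables from the question; local_gradient is a hypothetical helper, and the local quadratic model is one reasonable choice, not the only one):

    import numpy as np
    from scipy.spatial import cKDTree

    def local_gradient(points, values, kdtree, p, k=32):
        """Estimate grad(phi) at p by least-squares fitting a local quadratic
        phi ~ c0 + g.dx + dx.H.dx/2 to the k nearest neighbors."""
        _, idx = kdtree.query(p, k)
        dx = points[idx] - p  # offsets from the query point
        # design matrix: [1, x, y, z, x^2, y^2, z^2, xy, xz, yz]
        A = np.column_stack([
            np.ones(k), dx[:, 0], dx[:, 1], dx[:, 2],
            dx[:, 0]**2, dx[:, 1]**2, dx[:, 2]**2,
            dx[:, 0]*dx[:, 1], dx[:, 0]*dx[:, 2], dx[:, 1]*dx[:, 2],
        ])
        coeffs, *_ = np.linalg.lstsq(A, values[idx], rcond=None)
        return coeffs[1:4]  # the linear terms are the gradient at p

    points = np.c_[x, y, z]
    grad0 = local_gradient(points, phi, kdtree, np.array([0.0, 0.0, 0.0]))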

EDIT:

In fact, this seems to be the usual approach:


The accepted answer there is about differentiating an interpolating polynomial. However, apparently that polynomial is supposed to cover all of the data (Vandermonde matrix). For you that is impossible: far too much data. Taking a local subset of the data seems very reasonable.

A lot depends on the signal-to-noise ratio of your potential data. Your example is pure noise, so "fitting" anything to it will always be "over-fitting". The degree of noise will determine how much you want to polyfit (as in lhk's answer) and how much you want to krige (with pyKriging or otherwise).

  • I would suggest using query(x, distance_upper_bound) rather than query(x, k), since this may prevent some of the instabilities caused by clustering (see the sketch after this list).

  • I am not a mathematician, but I would expect that fitting polynomials to a distance-dependent subset of the data is spatially unstable, especially as the polynomial order increases. That would make the resulting gradient field discontinuous.
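
As a sketch of the first suggestion (the radius of 1.0 is an arbitrary assumption; with scipy's cKDTree, neighbors missing inside the bound are reported with infinite distance):

    import numpy as np
    from scipy.spatial import cKDTree

    kdtree = cKDTree(np.c_[x, y, z])

    # Ask for up to 32 neighbors but discard any beyond a fixed radius;
    # missing neighbors come back with infinite distance and index == n.
    dists, idx = kdtree.query([0, 0, 0], k=32, distance_upper_bound=1.0)
    idx = idx[np.isfinite(dists)]

    # Alternatively, take *all* points within the radius:
    idx_ball = kdtree.query_ball_point([0, 0, 0], r=1.0)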


Here is a Julia implementation that does what you ask for:

    using NearestNeighbors
    using LinearAlgebra, Statistics, Printf

    n = 3
    k = 32  # for stability use k > n*(n+3)/2

    # Take a point near the center of the cube
    point = 0.5 .+ rand(n) * 1e-3
    data = rand(n, 10^4)
    kdtree = KDTree(data)
    idxs, dists = knn(kdtree, point, k, true)

    # Coords of the k nearest neighbors
    X = data[:, idxs]

    # Least-squares recipe for the coefficients
     C = point * ones(1, k)   # central node
    dX = X - C                # diffs from central node
     G = dX' * dX
     F = G .* G
     v = diag(G)
     N = pinv(G) * G
     N = I - N                # projector onto the nullspace of G
     a = N * pinv(F * N) * v  # ...these are the coeffs

    # Use a temperature distribution of  T = 25.4 * r^2
    # whose analytical gradient is   gradT = 25.4 * 2*x
    X2 = X .* X
    C2 = C .* C
    T  = 25.4 * n * vec(mean(X2, dims=1))
    Tc = 25.4 * n * vec(mean(C2, dims=1))  # central node
    dT = T - Tc                            # diffs from central node

    y = dX * (a .* dT)    # Estimated gradient
    g = 2 * 25.4 * point  # Analytical

    # print results
    @printf "Estimated  Grad  = %s\n" string(y')
    @printf "Analytical Grad  = %s\n" string(g')
    @printf "Relative Error   = %.8f\n" norm(g - y) / norm(g)
    

The relative error of the method is around 1%; here are the results of a few runs:

    Estimated  Grad  = [25.51670916224472 25.421038632006926 25.6711949674633]
    Analytical Grad  = [25.41499027802736 25.44913042322385  25.448202594123806]
    Relative Error   = 0.00559934
    
    Estimated  Grad  = [25.310574056859014 25.549736360607493 25.368056350800604]
    Analytical Grad  = [25.43200914200516  25.43243178887198  25.45061497749628]
    Relative Error   = 0.00426558
    

UPDATE:
I don't know Python very well, but here is a translation that seems to work:

    import numpy as np
    from scipy.spatial import KDTree

    n = 3
    k = 32

    # fill the cube with random points
    data = np.random.rand(10000, n)
    kdtree = KDTree(data)

    # pick a point (at the center of the cube)
    point = 0.5 * np.ones((1, n))

    # Coords of the k nearest neighbors
    dists, idxs = kdtree.query(point, k)
    idxs = idxs[0]
    X = data[idxs, :]

    # Calculate coefficients
    C = (np.dot(point.T, np.ones((1, k)))).T  # central node
    dX = X - C                                # diffs from central node
    G = np.dot(dX, dX.T)
    F = np.multiply(G, G)
    v = np.diag(G)
    N = np.dot(np.linalg.pinv(G), G)
    N = np.eye(k) - N
    a = np.dot(np.dot(N, np.linalg.pinv(np.dot(F, N))), v)  # these are the coeffs

    # Temperature distribution is  T = 25.4 * r^2
    X2 = np.multiply(X, X)
    C2 = np.multiply(C, C)
    T  = 25.4 * n * np.mean(X2, 1).T
    Tc = 25.4 * n * np.mean(C2, 1).T  # central node
    dT = T - Tc                       # diffs from central node

    # Analytical gradient ==>  gradT = 2*25.4* x
    g = 2 * 25.4 * point
    print("g[]: %s" % (g))

    # Estimated gradient
    y = np.dot(dX.T, np.multiply(a, dT))
    print("y[]: %s,   Relative Error = %.8f" % (y, np.linalg.norm(g - y) / np.linalg.norm(g)))
    

UPDATE #2:
I figured I could write something intelligible using formatted ASCII instead of LaTeX:

    Given a set of M vectors in n-dimensions (call them b_k), find a set of
    coeffs (call them a_k) which yields the best estimate of the identity
    matrix and the zero vector

                                         M
        (1) min ||E - I||,  where  E  = sum  a_k b_k b_k
            a_k                         k=1

                                         M
        (2) min ||z - 0||,  where  z  = sum  a_k b_k
            a_k                         k=1

    Note that the basis vectors {b_k} are not required
    to be normalized, orthogonal, or even linearly independent.

    First, define the following quantities:

        B           ==> matrix whose columns are the b_k
        G = B'.B    ==> transpose of B times B
        F = G @ G   ==> @ represents the Hadamard product
        v = diag(G) ==> vector composed of diag elements of G

    The above minimizations are equivalent to this linearly constrained problem

        Solve  F.a = v
        s.t.   G.a = 0

    Let {X} denote the Moore-Penrose inverse of X.
    Then the solution of the linear problem can be written:

        N = I - {G}.G    ==> projector into nullspace of G
        a = N . {F.N} . v

    The utility of these coeffs is that they allow you to write
    very simple expressions for the derivatives of a tensor field.

    Let D be the del (or nabla) operator
    and d be the difference operator wrt the central (aka 0th) node,
    so that, for any scalar/vector/tensor quantity Y, we have:

        dY = Y - Y_0

    Let x_k be the position vector of the kth node.
    And for our basis vectors, take

        b_k = dx_k = x_k - x_0

    Assume that each node has a field value associated with it
    (e.g. temperature), and assume a quadratic model [about x = x_0]
    for the field [g = gradient, H = hessian, ":" is the double-dot product]

        Y   = Y_0 + (x - x_0).g + (x - x_0)(x - x_0):H/2
        dY  = dx.g + dx dx:H/2
        D2Y = I:H    ==> Laplacian of Y

    Evaluate the model at the kth node

        dY_k = dx_k.g + dx_k dx_k:H/2

    Multiply by a_k and sum

         M               M                M
        sum a_k dY_k  = sum a_k dx_k.g + sum a_k dx_k dx_k:H/2
        k=1             k=1              k=1

                      = 0.g + I:H/2
                      = D2Y / 2

    Thus, we have a second order estimate of the Laplacian

                    M
        Lap(Y_0) = sum 2 a_k dY_k
                   k=1

    Now play the same game with a linear model

        dY_k = dx_k.g

    But this time multiply by (a_k dx_k) and sum

         M                   M
        sum a_k dx_k dY_k = sum a_k dx_k dx_k.g
        k=1                 k=1

                          = I.g
                          = g

    In general, the derivatives at the central node can be estimated as

               M
        D#Y = sum a_k dx_k # dY_k
              k=1

               M
        D2Y = sum 2 a_k dY_k
              k=1

    where
        #   stands for the {dot, cross, or tensor} product,
            yielding the {div, curl, or grad} of Y
    and
        D2Y stands for the Laplacian of Y:  D2Y = D.DY = Lap(Y)
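
To tie the derivation back to the Python script above: the same coefficients a also give a Laplacian estimate in one extra line. A sketch (reusing a, dT and n from the script; for T = 25.4*r^2 in n = 3 dimensions the analytical Laplacian is 2*25.4*n = 152.4):

    # Estimated Laplacian:  Lap(Y_0) ~= sum_k 2 * a_k * dY_k
    lap_est  = 2.0 * np.sum(a * dT)
    lap_true = 2.0 * 25.4 * n  # Lap(25.4 * r^2) = 25.4 * 2 * n
    print("Estimated Laplacian = %.4f (analytical: %.1f)" % (lap_est, lap_true))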
I'll add my two cents, late as they are. In the case of a large, spatially uniform distribution, one usually extracts only the local information around each particle.

You may have noticed that there are several ways to extract the local information:

  • N nearest neighbors, using a KD-tree
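
A sketch of this extraction step with scipy's cKDTree (assuming the x, y, z arrays from the question; the radius of 1.0 is an arbitrary assumption):

    import numpy as np
    from scipy.spatial import cKDTree

    points = np.c_[x, y, z]
    kdtree = cKDTree(points)

    # k nearest neighbors of every particle at once (k = 32, as in the question)
    dists, idx = kdtree.query(points, k=32)

    # ...or all neighbors within a fixed radius of every particle
    neighbors = kdtree.query_ball_point(points, r=1.0)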