Per-point bin number and bin height for XY data (using Python)
In a 2D histogram of an XY distribution, how can I find the bin number and the bin height that correspond to each point?
Tags: python, numpy, matplotlib, seaborn
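How can I visualize the result properly (preferably with seaborn)? In other words, I want to create a plot in which my x, y data points are overlaid on the histogram computed with numpy.histogram2d: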
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(9)
x = np.round(10*np.random.rand(12), 1)
y = np.round(10*np.random.rand(12), 1)
binrange = ([x.min(), x.max()+1], [y.min(), y.max()+1])
h, ex, ey = np.histogram2d(x, y, bins=5, range=binrange, density=False)
nx = np.digitize(x, bins=ex)
ny = np.digitize(y, bins=ey)
print('Why do my points fall into empty bins??')
print('Values:', '\n', x, '\n', y, '\n')
print('Bins', '\n', ex, '\n', ey, '\n')
print('Bin numbers:\n', nx, '\n', ny, '\n')
sns.histplot(x=x, y=y, bins=5, binrange=binrange, cbar=True)
sns.scatterplot(x=x, y=y, s=15, color='k')
plt.suptitle('What I expect to see')
Output:
Values:
[0.1 5. 5. 1.3 1.4 2.2 4.2 2.5 0.8 3.5 1.7 8.8]
[9.5 0.4 7. 5.7 9. 6.7 5.5 7. 3.9 6.9 8.2 4.7]
Bins
[0.1 2.04 3.98 5.92 7.86 9.8 ]
[ 0.4 2.42 4.44 6.46 8.48 10.5 ]
Bin numbers:
[1 3 3 1 1 2 3 2 1 2 1 5]
[5 1 4 3 5 4 3 4 2 4 4 3]
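The bin numbers printed above are 1-based, and the array returned by np.histogram2d is indexed as h[x_bin, y_bin], which is probably why the points appear to land in empty bins when h is read like an image. A minimal sketch of the per-point lookup, assuming the same data and bin range as in the question (the names ix, iy and heights are mine, not from the original post):

```python
import numpy as np

# Same data as in the question.
np.random.seed(9)
x = np.round(10 * np.random.rand(12), 1)
y = np.round(10 * np.random.rand(12), 1)
binrange = ([x.min(), x.max() + 1], [y.min(), y.max() + 1])
h, ex, ey = np.histogram2d(x, y, bins=5, range=binrange)

# np.digitize returns 1-based bin numbers here, so subtract 1 to get array indices.
# np.clip guards the case where a point sits exactly on the last edge.
ix = np.clip(np.digitize(x, ex) - 1, 0, len(ex) - 2)
iy = np.clip(np.digitize(y, ey) - 1, 0, len(ey) - 2)

# h is indexed as h[x_bin, y_bin]: the first axis follows x, the second follows y.
heights = h[ix, iy]

for xi, yi, i, j, c in zip(x, y, ix, iy, heights):
    print(f"point ({xi}, {yi}) -> bin ({i}, {j}), height {int(c)}")
```

Every point should report a height of at least 1, since each point is itself counted in its own bin.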
A small trick here is to use np.rot90 to rotate the computed histogram into the right orientation:
plt.imshow(np.rot90(h, 1),
extent=[x.min(), x.max()+1, y.min(), y.max()+1], origin='upper', cmap='Blues')
plt.colorbar()
plt.scatter(x=x, y=y, s=10, color='k')
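With this, the problem is almost solved. However, drawing the last plot with sns.heatmap takes somewhat more work. The main issue here is how to set the axis ranges. Alternatively, we can rescale the original data to the limits (0, number of cells). For example: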
def transform(distrA, limitsA, limitsB):
    '''Transforms distribution of unevenly distributed points in a space A to space B
    Input:
        distrA - numpy 2D array [[arrdim1 ...], [arrdim2 ...], [arrdim3 ...], [arrdim4 ...]] -
            Distribution to be transformed.
        limitsA and limitsB - (array of pairs) -
            Limits of space A and B, correspondingly, in the form (lower, higher)
    Output:
        distrB - transformed distribution'''
    shape = distrA.shape
    distrB = np.empty(shape=shape)
    for i in range(shape[0]):
        spanA = limitsA[i][1] - limitsA[i][0]
        spanB = limitsB[i][1] - limitsB[i][0]
        for j in range(shape[1]):
            distrB[i, j] = spanB * (distrA[i, j] - limitsA[i][0]) / spanA + limitsB[i][0]
    return distrB

hm = sns.heatmap(np.rot90(h, 1), cmap='Blues', annot=True)
h_trans = transform(np.asarray([x, y]),
                    [[x.min(), x.max()+1], [y.min(), y.max()+1]],
                    ((0, 5), (5, 0)))
sns.scatterplot(x=h_trans[0], y=h_trans[1], s=20, color='k')
plt.title('Desired seaborn heatmap')
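The nested loops in transform can likely be replaced by NumPy broadcasting. The following is only a sketch of that idea; transform_vectorized is a hypothetical name, and the sample points and limits are taken from the values printed in the question:

```python
import numpy as np

def transform_vectorized(distrA, limitsA, limitsB):
    """Linearly map each row of distrA from limitsA[i] to limitsB[i]."""
    distrA = np.asarray(distrA, dtype=float)
    limitsA = np.asarray(limitsA, dtype=float)
    limitsB = np.asarray(limitsB, dtype=float)
    # Column vectors of shape (ndim, 1) broadcast over the points in each row.
    loA, hiA = limitsA[:, :1], limitsA[:, 1:]
    loB, hiB = limitsB[:, :1], limitsB[:, 1:]
    return (hiB - loB) * (distrA - loA) / (hiA - loA) + loB

pts = np.array([[0.1, 5.0, 8.8],    # x values
                [9.5, 0.4, 4.7]])   # y values
# The y limits in space B are reversed, (5, 0), because the heatmap y axis runs top to bottom.
out = transform_vectorized(pts, [[0.1, 9.8], [0.4, 10.5]], ((0, 5), (5, 0)))
print(out)
```

The arithmetic is the same per-element formula as in the loop version, so the two should agree up to floating-point rounding.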
Note that some of the solutions and descriptions here may not be ideally Pythonic. Any suggestions for optimizing the code are welcome! :) Of course, in the last plot, proper ticks and labels can be set with plt.xticks() and plt.yticks().
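As a rough sketch of the tick idea: seaborn's heatmap draws cell borders at integer positions 0, 1, ..., n, so the bin edges can be used as labels at those positions. The helper edge_ticks below is hypothetical, and the edges are rebuilt with np.linspace from the values printed in the question rather than taken from np.histogram2d:

```python
import numpy as np

def edge_ticks(edges, reverse=False, decimals=2):
    """Return (positions, labels) that place bin edges on heatmap cell borders."""
    edges = np.asarray(edges, dtype=float)
    positions = np.arange(len(edges))          # cell borders sit at 0, 1, ..., n
    values = edges[::-1] if reverse else edges
    labels = [f"{v:.{decimals}f}" for v in values]
    return positions, labels

ex = np.linspace(0.1, 9.8, 6)    # x edges printed in the question
ey = np.linspace(0.4, 10.5, 6)   # y edges printed in the question
xpos, xlab = edge_ticks(ex)
ypos, ylab = edge_ticks(ey, reverse=True)  # heatmap rows run top to bottom
print(xlab)
print(ylab)
# With the heatmap drawn on the current axes, one would then call:
# plt.xticks(xpos, xlab); plt.yticks(ypos, ylab)
```

The y labels are reversed because np.rot90(h, 1) puts the highest y bin in the first (top) row of the heatmap.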