Python: heatmap of a very large matrix, including NaNs

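I'm trying to see whether the NaNs are concentrated in some region, or whether there is any pattern in how they are distributed.

The idea is to use Python to plot a heatmap of the matrix (200K rows by 1K columns), with a special color for the NaN values (all other values can share a single color, it doesn't matter).

An example of a possible display:

Thanks in advance, everyone.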

# Learn about API authentication here: https://plot.ly/python/getting-started
# Find your api_key here: https://plot.ly/settings/api

import plotly.plotly as py
import plotly.graph_objs as go

data = [
    go.Heatmap(
        z=[[1, 20, 30],
        [20, 1, 60],
        [30, 60, 1]]
    )
]
plot_url = py.plot(data, filename='basic-heatm

You can use a scatter plot:

import matplotlib.pyplot as plt
import numpy as np
# create a matrix with random numbers
A = np.random.rand(2000,10)
# make some NaNs in it:
for _ in range(1000):
    i = np.random.randint(0,2000)
    j = np.random.randint(0,10)
    A[i,j] = np.nan
# get a matrix to plot with only the NaNs:
B = np.isnan(A)
# if NaN plot a point. 
for i in range(2000):
    for j in range(10):
        if B[i,j]: plt.scatter(i,j)
plt.show()

With Python 2.6 or 2.7, consider using xrange instead of range to speed things up.

Note: doing it this way is probably faster:

C = np.where(B)
plt.scatter(C[0],C[1])

A 1:200 aspect ratio is very awkward, and since you may run into memory problems, you should probably split the matrix into several Nx1k pieces.
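
One way to act on the "split it into Nx1k pieces" suggestion is to reduce each block of rows to the fraction of NaNs it contains per column, and plot that much smaller matrix instead. The following is only a minimal sketch under that assumption; the helper names `nan_fraction_blocks` and `plot_nan_density`, the block size and the demo data are all hypothetical and not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt


def nan_fraction_blocks(a, block_rows):
    """Fraction of NaNs per column within each block of `block_rows` rows.

    Hypothetical helper: the result has one row per block, so a
    200K x 1K matrix with block_rows=200 shrinks to 1000 x 1000.
    """
    n_rows, n_cols = a.shape
    n_blocks = -(-n_rows // block_rows)  # ceiling division
    frac = np.empty((n_blocks, n_cols))
    for k in range(n_blocks):
        # Only one block is turned into a boolean mask at a time,
        # which keeps the extra memory use small.
        chunk = a[k * block_rows:(k + 1) * block_rows]
        frac[k] = np.isnan(chunk).mean(axis=0)
    return frac


def plot_nan_density(frac):
    """Show the block-wise NaN fraction as an image (darker = more NaNs)."""
    fig, ax = plt.subplots()
    im = ax.imshow(frac, aspect="auto", cmap="binary",
                   vmin=0, vmax=1, interpolation="nearest")
    fig.colorbar(im, ax=ax, label="fraction of NaNs per block")
    ax.set_xlabel("column")
    ax.set_ylabel("row block")
    return fig


if __name__ == "__main__":
    # Small stand-in for the real matrix (assumed data, for illustration).
    rng = np.random.default_rng(0)
    a = rng.random((20000, 100))
    a[5000:6000, 10:20] = np.nan  # an artificial cluster of NaNs
    plot_nan_density(nan_fraction_blocks(a, block_rows=200))
    plt.show()
```

Because every block is summarized rather than subsampled, an isolated NaN still shows up as a faint non-zero cell instead of being dropped by the renderer.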

That said, here is my solution (inspired by your example image):

Here is what it looks like:


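Thanks, but I already knew this simple solution. I want to find a way to plot a very large object (with NaNs), as I explained in my post.

I think it might be worth a look. Not sure whether it will meet your needs, but…

Gosh, I only just realized this question is already half a year old... Oh well, I hope the answer is still useful to someone :D
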
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.axes_grid1 import AxesGrid

# generate random matrix
xDim = 2000
yDim = 4000
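# number of nans (must be an integer to be usable as an array size)
nNans = int(xDim*yDim*.1)
rands = np.random.rand(yDim, xDim)

# create a skewed distribution for the nans
# x holds row indices, y holds column indices
# (np.int was removed from NumPy; the builtin int is equivalent here)
x = np.clip(np.random.gamma(2, yDim*.125, size=nNans).astype(int), 0, yDim-1)
y = np.random.randint(0, xDim, size=nNans)
rands[x,y] = np.nan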

# find the nans:
isNan = np.isnan(rands)

fig = plt.figure()

# make axesgrid so we can put a histogram-like plot next to the data
grid = AxesGrid(fig, 111, nrows_ncols=(1, 2), axes_pad=0.05)

# plot the data using binary colormap
grid[0].imshow(isNan, cmap=cm.binary)

# plot the histogram
grid[1].plot(np.sum(isNan,axis=1), range(isNan.shape[0]))

# set ticks and limits, so the figure looks nice
grid[0].set_xticks([0,250,500,750,1000,1250,1500,1750])
grid[1].set_xticks([0,250,500,750])
grid[1].set_xlim([0,750])
grid.axes_llc.set_ylim([0, yDim])
plt.show()
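
The question also asks for a special color for NaN values while every other value shares one color. A possible sketch of that, assuming matplotlib's `ListedColormap` and its `set_bad` method are an acceptable way to do it: `imshow` masks invalid entries, so NaNs are drawn in the "bad" color. The helper names `make_nan_cmap` and `plot_with_nan_color`, the colors and the demo data are illustrative choices, not part of the original answers.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap


def make_nan_cmap(valid_color="lightgray", nan_color="red"):
    """Single-color colormap whose 'bad' (masked/NaN) color stands out."""
    cmap = ListedColormap([valid_color])
    cmap.set_bad(nan_color)
    return cmap


def plot_with_nan_color(a, cmap):
    """Draw the matrix; NaN cells are rendered in the colormap's bad color."""
    fig, ax = plt.subplots()
    # aspect="auto" stretches the image to fill the axes, which helps
    # with very tall matrices; "nearest" avoids blending NaN cells.
    ax.imshow(a, cmap=cmap, aspect="auto", interpolation="nearest")
    ax.set_xlabel("column")
    ax.set_ylabel("row")
    return fig


if __name__ == "__main__":
    # Small stand-in for the real matrix (assumed data, for illustration).
    rng = np.random.default_rng(1)
    a = rng.random((2000, 100))
    a[rng.integers(0, 2000, 500), rng.integers(0, 100, 500)] = np.nan
    plot_with_nan_color(a, make_nan_cmap())
    plt.show()
```

With 200K rows there are far more rows than screen pixels, so individual NaNs may not be visible at full size; plotting the matrix in row chunks, or summarizing it first, is probably still needed.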