Binning data in Python (scatter plot)?


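I have a scatter plot of volume (x-axis) against price (dMidP, y-axis). I want to split the x-axis into 30 evenly spaced bins, average the values in each bin, and then plot those averages.

Here is my data:

The code below does not produce the desired plot: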

V_norm = Average_Buy['Volume_norm']
df = pd.DataFrame({'X' : np.log(Average_Buy['Volume_norm']), 'Y' : Average_Buy['dMidP']})  #we build a dataframe from the data
total_bins = 30
bins = np.geomspace(V_norm.min(), V_norm.max(), total_bins)
data_cut = pd.cut(df.X,bins)         
grp = df.groupby(by = data_cut)        #we group the data by the cut
ret = grp.aggregate(np.mean)         #we produce an aggregate representation (mean) of each bin
plt.loglog(np.log(Average_Buy['Volume_norm']),Average_Buy['dMidP'],'o')
plt.loglog(ret.X,ret.Y,'r-')

plt.show()
Here is what I get:

My bins come back as follows (they look correct):

However, my data_cut returns:

Time  Time
11    0                  NaN
      1                  NaN
      2                  NaN
      3                  NaN
      4                  NaN
      5                  NaN
      6                  NaN
      7                  NaN
      8                  NaN
      9                  NaN
      10      (0.991, 1.081]
      11                 NaN
      12                 NaN
      13                 NaN
      14                 NaN
      15                 NaN
      16                 NaN
      17                 NaN
      18                 NaN
      19                 NaN
      20                 NaN
      21                 NaN
      22                 NaN
      23                 NaN
      24                 NaN
      25                 NaN
      26                 NaN
      27                 NaN
      28                 NaN
      29                 NaN
                   ...      
14    30                 NaN
      31                 NaN
      32                 NaN
      33                 NaN
      34                 NaN
      35                 NaN
      36                 NaN
      37                 NaN
      38                 NaN
      39                 NaN
      40                 NaN
      41                 NaN
      42                 NaN
      43                 NaN
      44                 NaN
      45                 NaN
      46                 NaN
      47                 NaN
      48                 NaN
      49                 NaN
      50                 NaN
      51                 NaN
      52                 NaN
      53                 NaN
      54                 NaN
      55                 NaN
      56                 NaN
      57                 NaN
      58                 NaN
      59                 NaN

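Your bins variable is not what you want. pd.cut only assigns a bin to a value that falls between two edges, so the edges and the values being cut have to be in the same space. You can either transform bins back from log space into linear space, or get bins in linear space with logarithmic spacing from the get-go: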

bins = np.geomspace(Volume.min(), Volume.max(), total_bins)
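
To see this suggestion end to end, here is a minimal sketch on synthetic data. The real Average_Buy frame is not shown in the question, so the volume and dmidp arrays below are made-up stand-ins. Using total_bins + 1 edges (to get 30 bins), include_lowest=True and observed=True are assumptions of this sketch, not part of the original answer. The point it illustrates is that X stays in linear space, so it matches the geometrically spaced edges.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for Average_Buy (assumption: the real data is not available).
rng = np.random.default_rng(0)
volume = np.exp(rng.uniform(np.log(0.005), np.log(1000.0), size=2000))
dmidp = 0.1 * np.sqrt(volume) * np.exp(rng.normal(0.0, 0.3, size=volume.size))

# X is kept in linear space; the log scaling is left to the plot axes.
df = pd.DataFrame({"X": volume, "Y": dmidp})

total_bins = 30
# Geometrically spaced edges, expressed in the same linear units as X.
# N bins need N + 1 edges.
edges = np.geomspace(df["X"].min(), df["X"].max(), total_bins + 1)

# include_lowest=True keeps the minimum value inside the first bin.
df["bin"] = pd.cut(df["X"], edges, include_lowest=True)

# Mean of X and Y per bin; observed=True drops bins that received no points.
ret = df.groupby("bin", observed=True)[["X", "Y"]].mean()
```

The averaged line can then be drawn over the raw points with the same calls as in the question, for example plt.loglog(df.X, df.Y, 'o') followed by plt.loglog(ret.X, ret.Y, 'r-').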

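Edit: changed np.logspace to np.geomspace.

Possible duplicate of another question.

I am not trying to plot a linear line of best fit. I want to average the scatter plot and then connect the points to build a line.

But when I include this code with total_bins = 100, I get an error saying "Bin edges must be unique".

I changed my answer from np.logspace to np.geomspace (start and stop in np.logspace are not what I thought they were; np.geomspace does the intuitive thing). If the problem persists, please post the values of bins (and the min/max volume).

The plot changes, but it still does not look right. bins: array([4.50996122e-03, 1.79450189e-02, 7.14027653e-02, 2.84109754e-01, 1.13046535e+00, 4.49809235e+00, 1.78977929e+01, 7.12148546e+01, 2.83362062e+02, 1.12749030e+03]); Volume min = 0.004509961215828188; Volume max = 1127 (so the range is correct). Please see the update to the question, which includes the problem with this code.

Hi Paul, np.geomspace does not work if start is negative: geomspace(np.log(Volume.min()), np.log(Volume.max()), total_bins).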
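
On the last comment: if the column being cut is already log-transformed (as df.X is in the question), one option is to build the edges in log space with np.linspace rather than passing log values to np.geomspace. The sketch below is only an illustration. The v_min and v_max values are approximations taken from the numbers quoted above, and the variable names are hypothetical.

```python
import numpy as np

# Approximate range quoted in the comments (assumption: not the exact data).
v_min, v_max = 0.004509961215828188, 1127.0
total_bins = 10

# Evenly spaced edges in log space; a negative log(v_min) is not a problem here.
log_edges = np.linspace(np.log(v_min), np.log(v_max), total_bins)

# Mapping the edges back with exp gives geometric spacing in linear space,
# i.e. the same edges np.geomspace produces from the untransformed range.
linear_edges = np.exp(log_edges)
same = np.allclose(linear_edges, np.geomspace(v_min, v_max, total_bins))
```

Either set of edges works with pd.cut as long as it is paired with values in the matching space: log_edges with np.log(Volume), linear_edges with Volume itself.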