Binning data in Python (scatter plot)?
I have a scatter plot of volume (x-axis) against price (dMidP, y-axis). I want to split the x-axis into 30 evenly spaced bins, average the values in each bin, and then plot those averages.
Here is my data:
The code below does not produce the plot I want:
V_norm = Average_Buy['Volume_norm']
df = pd.DataFrame({'X' : np.log(Average_Buy['Volume_norm']), 'Y' : Average_Buy['dMidP']}) #we build a dataframe from the data
total_bins = 30
bins = np.geomspace(V_norm.min(), V_norm.max(), total_bins)
data_cut = pd.cut(df.X,bins)
grp = df.groupby(by = data_cut) #we group the data by the cut
ret = grp.aggregate(np.mean) #we produce an aggregate representation (mean) of each bin
plt.loglog(np.log(Average_Buy['Volume_norm']),Average_Buy['dMidP'],'o')
plt.loglog(ret.X,ret.Y,'r-')
plt.show()
Here is what I get:
My bins come out as follows (they look correct):
However, my data_cut returns:
Time Time
11 0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
10 (0.991, 1.081]
11 NaN
12 NaN
13 NaN
14 NaN
15 NaN
16 NaN
17 NaN
18 NaN
19 NaN
20 NaN
21 NaN
22 NaN
23 NaN
24 NaN
25 NaN
26 NaN
27 NaN
28 NaN
29 NaN
...
14 30 NaN
31 NaN
32 NaN
33 NaN
34 NaN
35 NaN
36 NaN
37 NaN
38 NaN
39 NaN
40 NaN
41 NaN
42 NaN
43 NaN
44 NaN
45 NaN
46 NaN
47 NaN
48 NaN
49 NaN
50 NaN
51 NaN
52 NaN
53 NaN
54 NaN
55 NaN
56 NaN
57 NaN
58 NaN
59 NaN
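Your bins variable is not what you want. You can either transform the bins back from log space to linear space, or build bins in linear space with logarithmic spacing from the get-go: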
bins = np.geomspace(Volume.min(), Volume.max(), total_bins)
Edit: changed np.logspace to np.geomspace.
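As a minimal sketch of the second option, the snippet below builds log-spaced bin edges directly from the raw volumes and cuts the raw volumes with them, so the edges and the values being cut stay in the same space. The helper name binned_means and the synthetic volume and price arrays are assumptions for illustration; they stand in for Average_Buy['Volume_norm'] and Average_Buy['dMidP'], which are not reproduced here.

```python
import numpy as np
import pandas as pd


def binned_means(volume, price, total_bins=30):
    """Average volume and price inside log-spaced bins defined on the raw volume."""
    df = pd.DataFrame({"X": np.asarray(volume, dtype=float),
                       "Y": np.asarray(price, dtype=float)})
    # total_bins bins need total_bins + 1 edges. The edges are in linear space,
    # the same space as df.X, but evenly spaced on a log scale.
    edges = np.geomspace(df.X.min(), df.X.max(), total_bins + 1)
    # include_lowest=True keeps the smallest volume, which sits on the first edge.
    data_cut = pd.cut(df.X, edges, include_lowest=True)
    # observed=True drops bins that contain no points.
    return df.groupby(data_cut, observed=True).mean()


# Synthetic stand-in for Average_Buy (hypothetical data, not the asker's).
rng = np.random.default_rng(0)
volume = np.exp(rng.uniform(np.log(0.0045), np.log(1127.0), size=2000))
price = 0.01 * np.sqrt(volume) * np.exp(rng.normal(0.0, 0.3, size=2000))

ret = binned_means(volume, price, total_bins=30)
```

To draw the result, pass the raw values to plt.loglog, which already applies the log scaling to both axes: plt.loglog(volume, price, 'o') followed by plt.loglog(ret.X, ret.Y, 'r-'). Taking np.log of the volume first and then calling plt.loglog would apply the logarithm twice.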
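Possible duplicate of [linked question not preserved]
I am not trying to plot a linear line of best fit. I want to average the scatter plot and then connect the averaged points to build a line.
But when I include this code with total_bins=100, I get an error saying "Bin edges must be unique".
I changed my answer from np.logspace to np.geomspace (start and stop in np.logspace are not what I thought they were; np.geomspace does the intuitive thing). If the problem persists, please post the values of bins (and the min/max volume).
The graph changes, but it still does not look right. bins: array([4.50996122e-03, 1.79450189e-02, 7.14027653e-02, 2.84109754e-01, 1.13046535e+00, 4.49809235e+00, 1.78977929e+01, 7.12148546e+01, 2.83362062e+02, 1.12749030e+03]); Volume min = 0.004509961215828188; Volume max = 1127 (so the range is correct). Please see the update to the question, which includes the problem with this code.
Hi Paul, np.geomspace does not work if start is negative: geomspace(np.log(Volume.min()), np.log(Volume.max()), total_bins)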
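The difference between the two functions discussed in these comments can be checked directly. The sketch below assumes the minimum posted above and the last posted bin edge as the maximum. np.logspace treats its first two arguments as exponents of the base, while np.geomspace takes the endpoints themselves. The overflow to inf shown here is one plausible source of repeated edges, and so of the "Bin edges must be unique" error, although the thread does not confirm that.

```python
import numpy as np

vmin, vmax, n = 0.004509961215828188, 1127.49030, 10

# np.logspace takes exponents: these edges run from 10**vmin towards 10**vmax,
# which overflows to inf long before the end of the range.
with np.errstate(over="ignore"):
    wrong = np.logspace(vmin, vmax, n)
has_duplicates = len(np.unique(wrong)) < len(wrong)

# Either pass the exponents to logspace explicitly ...
via_logspace = np.logspace(np.log10(vmin), np.log10(vmax), n)
# ... or pass the raw endpoints to geomspace.
via_geomspace = np.geomspace(vmin, vmax, n)

print(wrong)
print(via_geomspace)
```

On the last comment: the logarithm of any volume below 1 is negative, and a geometric sequence with a positive ratio cannot run from a negative start to a positive stop, so np.geomspace should be given the raw volumes rather than their logs.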