Python: histogram processing in pandas


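I have a large DataFrame in pandas. When plotting a histogram, I want to remove the values that fall in low-frequency ranges (whole ranges, not single values).

For the figures below, suppose I want to remove all values of the DataFrame variable that correspond to a count/frequency below 20. Does anyone have a solution?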

# PR has values between 0 and 1700
data['PR'].hist(bins=160)  # image on the left
data_openforest['PR'].hist(bins=160)  # image on the right

Using pd.cut like this should work:

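import matplotlib.pyplot as plt
import pandas as pd
# Bin PR into 160 equal-width intervals and count the rows in each bin
out = pd.cut(data_openforest['PR'], bins=160)
counts = out.value_counts(sort=False)
# Plot only the bins that hold more than 20 rows
counts[counts > 20].plot.bar()
plt.show()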

If you want to filter the DataFrame itself, you have to do the following:

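# Label every row with the bin its PR value falls into
data_openforest['bin'] = pd.cut(data_openforest['PR'], bins=160)
# Count the rows in each bin
bin_freq = data_openforest.groupby('bin').count()
# Attach the counts to each row; overlapping columns such as PR come back
# as PR_bin (original value) and PR_bin_freq (number of rows in that bin)
data_openforest = data_openforest.merge(bin_freq,
                                        on='bin',
                                        how='left',
                                        suffixes=("_bin",
                                                  "_bin_freq"))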

After that, you can easily filter the DataFrame on the frequency column. You then have to draw a bar plot rather than a histogram.
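The filtering step itself can be sketched as follows. This is a minimal sketch, not the answerer's code: the helper name filter_rare_bins, its parameters, and the synthetic frame standing in for data_openforest are all assumptions made for illustration. It swaps the merge for groupby(...).transform('count'), which attaches the per-bin count to every row in one step.

```python
import numpy as np
import pandas as pd


def filter_rare_bins(df, column, bins, min_count):
    """Return the rows of df whose `column` value falls in a bin
    holding more than `min_count` rows (hypothetical helper)."""
    # Assign every row to one of `bins` equal-width intervals
    bin_labels = pd.cut(df[column], bins=bins)
    # For each row, the number of rows that share its bin
    bin_freq = df[column].groupby(bin_labels, observed=True).transform('count')
    return df[bin_freq > min_count]


# Synthetic stand-in for data_openforest (assumption, not the asker's data)
rng = np.random.default_rng(0)
demo = pd.DataFrame({'PR': np.concatenate([
    rng.uniform(0, 100, 1000),     # densely populated range
    rng.uniform(1000, 1700, 30),   # sparsely populated range
])})

filtered = filter_rare_bins(demo, 'PR', bins=160, min_count=20)
```

filtered keeps every original column, so filtered['PR'].hist(bins=160) can be called on it directly. Note that hist recomputes the bin edges from the filtered data, so the bars will not line up exactly with the bins used for filtering.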

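You can use np.histogram or pd.cut to compute the histogram and then filter on the counts.

Do you have an example of that? I got ranges = [i for i in np.arange(0, 1600, 10)] and count = data_openforest.groupby(pd.cut(data_openforest['count'], ranges)).count(). But how do I now apply this to my original DataFrame?

This might help you:

Yes, my code also has ranges. But I have to look through them manually and do the filtering. Can you show me how to use the code you mentioned to filter the DataFrame directly, without intervention?

Edited. Not sure what you mean by "without intervention".

Yes. Thank you, what you just did is exactly it. I didn't know how to do the last step.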
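For the np.histogram route mentioned in the comments, one possible sketch is shown here. It assumes synthetic data in place of data_openforest, and the variable names are illustrative only. np.histogram returns the counts and the bin edges, and np.digitize maps each value back to its bin so the counts can be used as a row filter on the original DataFrame.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for data_openforest (assumption, not the asker's data)
rng = np.random.default_rng(1)
data_demo = pd.DataFrame({'PR': np.concatenate([
    rng.uniform(0, 100, 1000),     # densely populated range
    rng.uniform(1000, 1700, 30),   # sparsely populated range
])})

values = data_demo['PR'].to_numpy()
counts, edges = np.histogram(values, bins=160)

# Digitizing against the inner edges gives a 0-based bin index per value;
# the maximum lands in the last bin, matching np.histogram's closed last bin
bin_idx = np.digitize(values, edges[1:-1])

# Keep only the rows whose bin holds more than 20 values
keep = counts[bin_idx] > 20
data_filtered = data_demo[keep]
```

The threshold of 20 and the 160 bins follow the question. Both are plain numbers here and could be lifted into parameters if the filter is reused.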