Pandas groupby: create a chart showing average medical charges by age


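I am working with a dataset of anonymized medical information that contains the following fields:

Age (years)
Sex (male/female)
BMI
Number of children
Smoker (yes/no)
Region
Medical charges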

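I plotted a histogram to check the age distribution of the dataset, and a scatter plot to compare age against medical charges for every row. What I want to do now is plot the average charge per person at each age. The question I am trying to answer is "On average, how much is an individual charged at each age?"

I hope this is enough information and context.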

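The code for the first scatter plot works as expected, with no errors; it is the first code listing below.

The second code listing below is what I have been trying; it fails with the error quoted next.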

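When I try this strategy and similar ones, the correlation coefficient calculation raises the following error: ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 1 and the array at index 1 has size 47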

Thank you in advance, kind souls, for your help and insights.

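You are missing the () after unique in the following code. Without the parentheses, df.age.unique is the method object itself rather than the array of distinct ages, so np.corrcoef receives one input of size 1 and one of size 47:

#round correlation coefficient to 2 decimal places
rounded_corr = "{:.2f}".format(np.corrcoef(df.age.unique, charges_by_age.mean())[0,1])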

With that fixed, the code works:

#round correlation coefficient to 2 decimal places
rounded_corr = "{:.2f}".format(np.corrcoef(df.age.unique(), charges_by_age.mean())[0,1])
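To see why the missing parentheses matter, here is a minimal sketch on a small made-up DataFrame. The name `toy` and its values are invented for illustration and are not from the asker's dataset. It only shows the difference between the method object and the result of calling it.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not the asker's dataset.
toy = pd.DataFrame({
    "age": [30, 20, 30, 40, 20, 40],
    "charges": [300.0, 100.0, 500.0, 900.0, 300.0, 700.0],
})

method_obj = toy.age.unique      # bound method: nothing has been computed yet
unique_ages = toy.age.unique()   # NumPy array holding the distinct ages

print(callable(method_obj))                           # the method object is callable
print(type(unique_ages).__name__, unique_ages.shape)  # an array with one entry per distinct age
```

Passing `method_obj` to `np.corrcoef` gives NumPy a single Python object instead of a sequence of ages, which is consistent with the "size 1" versus "size 47" mismatch in the error message.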

# First scatter plot: how are age and medical costs related?
# Assumes df is the DataFrame already loaded from the dataset.
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.set_xlabel("age")
ax.set_ylabel("medical charges")
ax.set_title("medical charges vs age")
ax.scatter(df.age, df.charges, alpha=0.2, color='purple')
plt.show()

# round correlation coefficient to 2 decimal places
rounded_corr = "{:.2f}".format(np.corrcoef(df.age, df.charges)[0, 1])

# print the results
print("The correlation coefficient between age and charges is " + rounded_corr
      + " suggesting positive but not very strong correlation.")

# Second attempt: how are age and average medical costs related?
charges_by_age = df.charges.groupby(df.age)

fig, ax = plt.subplots()
ax.set_xlabel("age group")
ax.set_ylabel("average medical charges")
plt.title("average medical charges vs age")
plt.scatter(df.age.unique(), charges_by_age.mean(), alpha=0.2, color='blue')
plt.show()

#round correlation coefficient to 2 decimal places
rounded_corr = "{:.2f}".format(np.corrcoef(df.age.unique, charges_by_age.mean())[0,1])

# print the results
print("The correlation coefficient between age and average charges is " + str(rounded_corr)
      + " suggesting positive but not very strong correlation.")
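One caveat with pairing `df.age.unique()` against `charges_by_age.mean()`: `unique()` returns ages in order of first appearance, while `groupby` sorts its keys by default, so the x and y values can be mismatched unless the rows happen to be sorted by age. A minimal sketch that sidesteps this by reading the ages from the index of the grouped result, shown on made-up data (the names `toy` and `mean_by_age` and all values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, deliberately not sorted by age.
toy = pd.DataFrame({
    "age": [30, 20, 30, 40, 20, 40],
    "charges": [300.0, 100.0, 500.0, 900.0, 300.0, 700.0],
})

# Mean charge per age; the result is a Series indexed by age in sorted order.
mean_by_age = toy.groupby("age")["charges"].mean()

# Take x and y from the same Series so each age stays paired with its own mean.
ages = mean_by_age.index.to_numpy()
means = mean_by_age.to_numpy()

# Correlation between age and the per-age average charge.
corr = np.corrcoef(ages, means)[0, 1]
print("{:.2f}".format(corr))
```

With the real data, the same two arrays could be handed to `ax.scatter(ages, means)` in place of `df.age.unique()` and `charges_by_age.mean()`.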