Python seaborn point plot visualization

python matplotlib machine-learning

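I am drawing a point plot to show the relationship between "workclass", "sex", "occupation", and "income exceeds 50K or not". However, the result is a mess. The legend entries are stuck together, and both Female and Male are shown in blue in the legend.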

# Correlate categorical features
# Note: `size` was renamed to `height` in seaborn 0.9
grid = sns.FacetGrid(train, row='occupation', height=6, aspect=1.6)
grid.map(sns.pointplot, 'workclass', 'exceeds50K', 'sex', palette='deep', markers=["o", "x"])
grid.add_legend()

Please advise how to fit the size of the plot. Thanks.

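It sounds like "exceeds50K" is a categorical variable. For a point plot, the y variable needs to be continuous (numeric). Assuming this is your dataset: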

import pandas as pd
import seaborn as sns
df = pd.read_csv("https://raw.githubusercontent.com/katreparitosh/Income-Predictor-Model/master/Database/adult.csv")

We simplify some of the categories for plotting, for example:

# Collapse every value other than the dominant one into 'others'
df['native.country'] = [i if i == 'United-States' else 'others' for i in df['native.country']]
df['race'] = [i if i == 'White' else 'others' for i in df['race']]

df.head()

    age workclass   fnlwgt  education   education.num   marital.status  occupation  relationship    race    sex capital.gain    capital.loss    hours.per.week  native.country  income
0   90  ?   77053   HS-grad 9   Widowed ?   Not-in-family   White   Female  0   4356    40  United-States   <=50K
1   82  Private 132870  HS-grad 9   Widowed Exec-managerial Not-in-family   White   Female  0   4356    18  United

If the y variable is continuous, for example age, you can see that it works:

grid = sns.FacetGrid(df, row='race', height=3, aspect=1.6)
grid.map(sns.pointplot, 'native.country', 'age', 'sex', palette='deep', markers = ["o", "x"] )
grid.add_legend()
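
If the goal is still to plot the income variable itself, one possible approach (a sketch, not necessarily what the asker's data needs) is to encode it as a 0/1 indicator, so that the point plot's per-group mean becomes the share of people above 50K. The sketch below also swaps in `sns.catplot(kind="point")` for the manual `FacetGrid.map` call, since `catplot` builds the grid and the hue legend itself. The `toy` frame is hypothetical data standing in for the real dataset, and the `exceeds50K` column is an assumption derived from the `income` labels shown above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop this in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the adult-income dataset.
toy = pd.DataFrame({
    "workclass": ["Private", "Private", "State-gov", "State-gov"] * 4,
    "sex": ["Female", "Male"] * 8,
    "race": ["White"] * 8 + ["others"] * 8,
    "income": ["<=50K", ">50K", "<=50K", "<=50K",
               ">50K", ">50K", "<=50K", ">50K"] * 2,
})

# Encode the categorical target as 0/1; its mean per group is the share above 50K.
toy["exceeds50K"] = (toy["income"] == ">50K").astype(int)

# catplot creates the FacetGrid and assigns one color and marker per hue level.
grid = sns.catplot(
    data=toy, kind="point",
    x="workclass", y="exceeds50K", hue="sex", row="race",
    palette="deep", markers=["o", "x"],
    height=3, aspect=1.6,  # per-facet height in inches; width = height * aspect
)
grid.set_axis_labels("workclass", "share with income > 50K")
```

The `height` and `aspect` arguments control the size of each facet, so lowering `height` is the simplest way to make a grid with many rows fit on screen.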
