Python: How do I perform a t-test from a DataFrame?


I want to run a t-test on the mean hourly wage of male and female staff.

`df1 = df[["gender","hourly_wage"]] #creating a sub-dataframe with only the columns of gender and hourly wage
staff_wages=df1.groupby(['gender']).mean() #grouping the data frame by gender and assigning it to a new variable 'staff_wages'
staff_wages.head()`
The truth is, I think I got confused halfway through. I wanted to run a t-test, so I wrote this code:

`mean_val_salary_female = df1[staff_wages['gender'] == 'female'].mean()
mean_val_salary_female = df1[staff_wages['gender'] == 'male'].mean()

t_val, p_val = stats.ttest_ind(mean_val_salary_female, mean_val_salary_male)

# obtain a one-tail p-value
p_val /= 2

print(f"t-value: {t_val}, p-value: {p_val}")`
It only returns errors.

I went a bit crazy trying different things:

`#married_vs_dependents = df[['married', 'num_dependents', 'years_in_employment']]


#married_vs_dependents = df[['married', 'num_dependents', 'years_in_employment']]
#married_vs_dependents.head()

#my_data = df(married_vs_dependents)
#my_data.groupby('married').mean()

mean_gender = df.groupby("gender")["hourly_wage"].mean()
married_vs_dependents.head()

mean_gender.groupby('gender').mean()

mean_val_salary_female = df[staff_wages['gender'] == 'female'].mean()
mean_val_salary_female = df[staff_wages['gender'] == 'male'].mean()

#cat1 = mean_gender['male']==['cat1']
#cat2 = mean_gender['female']==['cat2']

ttest_ind(cat1['gender'], cat2['hourly_wage'])`
Could anyone guide me through the correct steps?

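You are passing the mean of each group as the `a` and `b` parameters, which is why the error occurs. Instead, you should pass the arrays of values themselves, as stated in the documentation: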


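`from scipy import stats  # needed for stats.ttest_ind; df is the DataFrame from the question

df1 = df[["gender", "hourly_wage"]]
# select the raw hourly wages of each group as NumPy arrays
m = df1.loc[df1["gender"].eq("male")]["hourly_wage"].to_numpy()
f = df1.loc[df1["gender"].eq("female")]["hourly_wage"].to_numpy()
stats.ttest_ind(m, f)`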
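
To make the point about passing arrays concrete, here is a minimal self-contained sketch. The DataFrame is synthetic and exists only for illustration (the group sizes, means, and spreads are assumptions, not the asker's data); the column names `gender` and `hourly_wage` follow the question. It also shows one way the one-tailed p-value from the question could be derived, assuming the alternative hypothesis is that the male mean is larger.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data for illustration only (NOT the asker's data).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "gender": ["male"] * 50 + ["female"] * 50,
    "hourly_wage": np.concatenate([
        rng.normal(22.0, 4.0, 50),  # hypothetical male wages
        rng.normal(20.0, 4.0, 50),  # hypothetical female wages
    ]),
})

# Pass the raw per-group samples to the test, not the group means.
male = df.loc[df["gender"] == "male", "hourly_wage"].to_numpy()
female = df.loc[df["gender"] == "female", "hourly_wage"].to_numpy()

# Independent two-sample t-test (two-sided by default).
t_val, p_two_sided = stats.ttest_ind(male, female)

# One-tailed p-value for the assumed alternative mean(male) > mean(female).
# Halving is only valid when the sign of t matches that direction.
p_one_sided = p_two_sided / 2 if t_val > 0 else 1 - p_two_sided / 2

print(f"t-value: {t_val:.3f}, two-sided p: {p_two_sided:.4f}, one-sided p: {p_one_sided:.4f}")
```

Simply dividing the two-sided p-value by 2, as in the question, gives the right one-tailed value only when the t statistic points in the direction of the alternative hypothesis, which is why the sketch checks the sign of `t_val` first.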

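In this line, where does your `mean_val_salary_male` come from (`t_val, p_val = stats.ttest_ind(mean_val_salary_female, mean_val_salary_male)`)? You have two `mean_val_salary_female` assignments but no `mean_val_salary_male`.

@MehdiHamzeloee Oh, I see. Let me take a look.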