Creating a loop over character variable names for two-sample t-tests in R
I want to run multiple two-sample t-tests in R. I have 50 indicators to test against a factor with two levels. At first I used:
t.test(m~f)
Welch Two Sample t-test
data: m by f
t = 2.5733, df = 174.416, p-value = 0.01091
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.05787966 0.43891600
sample estimates:
mean in group FSS mean in group NON-FSS
0.8344209 0.5860231
Here m is the first indicator I want to test: m = debt-to-equity ratio.
Here is the list of all the indicators I need to test:
print (indicators)
[1] "Debt.to.equity.ratio" "Deposits.to.loans"
[3] "Deposits.to.total.assets" "Gross.loan.portfolio.to.total.assets"
[5] "Number.of.active.borrowers" "Percent.of.women.borrowers"
[7] "Number.of.loans.outstanding" "Gross.loan.portfolio"
[9] "Average.loan.balance.per.borrower" "Average.loan.balance.per.borrower...GNI.per.capita"
[11] "Average.outstanding.balance" "Average.outstanding.balance...GNI.per.capita"
[13] "Number.of.depositors" "Number.of.deposit.accounts"
[15] "Deposits" "Average.deposit.balance.per.depositor"
[17] "Average.deposit.balance.per.depositor...GNI.per.capita" "Average.deposit.account.balance"
[19] "Average.deposit.account.balance...GNI.per.capita" "Return.on.assets"
[21] "Return.on.equity" "Operational.self.sufficiency"
[23] "FSS" "Financial.revenue..assets"
[25] "Profit.margin" "Yield.on.gross.portfolio..nominal."
[27] "Yield.on.gross.portfolio..real." "Total.expense..assets"
[29] "Financial.expense..assets" "Provision.for.loan.impairment..assets"
[31] "Operating.expense..assets" "Personnel.expense..assets"
[33] "Administrative.expense..assets" "Operating.expense..loan.portfolio"
[35] "Personnel.expense..loan.portfolio" "Average.salary..GNI.per.capita"
[37] "Cost.per.borrower" "Cost.per.loan"
[39] "Borrowers.per.staff.member" "Loans.per.staff.member"
[41] "Borrowers.per.loan.officer" "Loans.per.loan.officer"
[43] "Depositors.per.staff.member" "Deposit.accounts.per.staff.member"
[45] "Personnel.allocation.ratio" "Portfolio.at.risk...30.days"
[47] "Portfolio.at.risk...90.days" "Write.off.ratio"
[49] "Loan.loss.rate" "Risk.coverage"
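Rather than changing the indicator name in t.test each time, I would like to create a loop that does this automatically and collects the p-values. I tried to write one, but because the variable names are character strings, I could not get it to work.
I would really appreciate any hints on how to proceed!
Thanks, everyone.
Best,
Morgan

I assume you are testing each indicator against the same f. In that case, you could try the following: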
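p_vals = NULL
for (this_indicator in indicators)
{
  # Build the formula as text, e.g. "Debt.to.equity.ratio~f"
  this_formula = paste(c(this_indicator, "f"), collapse = "~")
  # as.formula() turns the string into a formula; the indicator columns and f
  # must be visible in the workspace (not only inside a data frame) for this call to work
  res = t.test(as.formula(this_formula))
  p_vals = c(p_vals, res$p.value)
}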
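One remark, though: are you adjusting these p-values for multiplicity? Given the large number of tests you are running, you are likely to get a lot of false positives.

Hi Gvrocha, thanks for the quick answer. I have only just started using R. I tried that code, but it still does not seem to work. Yes, I am using a Bonferroni adjustment, but my main problem is this: I have a large data set containing all these different indicators, and one column, here called FSS (which I previously renamed to f), has two levels: NON-FSS and FSS. I want to run a two-sample test on every indicator across the two groups, and I would like a loop that goes through all the different indicators and tests whether the difference between the two levels is significant. Thank you very much!

I tried to test whether each indicator differs significantly between FSS and NON-FSS (variable X), and modified the code as follows:
> p_vals = NULL;
> for(Capital.asset.ratio in data)
+ {
+ this_formula = paste(c(Capita.asset.ratio ~ X));
+ res = t.test(as.formula(formula));
+ p_vals = c(p_vals, res$p.value);
+ }
But I still cannot test every variable; it only tests Capital.asset.ratio...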
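A likely reason the modified attempt does not work is that it loops over the data frame itself rather than over the column names, and the formula inside the loop never changes. Below is a minimal sketch of one way to do it, not a definitive solution. The data frame `dat` and its three indicator columns are made-up stand-ins for the real data set (an assumption for illustration), and the grouping column is assumed to be called `f`. The grouping column is dropped from the indicator list, since it cannot be tested against itself; if the real `indicators` vector still contains "FSS", it would probably need the same treatment.

```r
# Hypothetical toy data standing in for the real data set (all values are simulated)
set.seed(1)
dat <- data.frame(
  f = factor(rep(c("FSS", "NON-FSS"), each = 30)),
  Debt.to.equity.ratio = rnorm(60),
  Deposits.to.loans = rnorm(60),
  Return.on.assets = rnorm(60)
)

# Loop over the column *names* (character strings), excluding the grouping column
indicators <- setdiff(names(dat), "f")

# Named vector to hold one p-value per indicator
p_vals <- setNames(numeric(length(indicators)), indicators)

for (this_indicator in indicators) {
  # Build the formula from text, e.g. Debt.to.equity.ratio ~ f
  this_formula <- as.formula(paste(this_indicator, "~ f"))
  # data = dat tells t.test() to look the variables up inside the data frame
  res <- t.test(this_formula, data = dat)
  p_vals[this_indicator] <- res$p.value
}

# Bonferroni adjustment across all tests, as discussed above
p_adj <- p.adjust(p_vals, method = "bonferroni")
print(cbind(raw = p_vals, bonferroni = p_adj))
```

Passing `data = dat` means the columns do not have to be attached or copied into the workspace, and keeping the p-values in a named vector makes it easy to see which indicator each one belongs to.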