R: How do I apply a Monte Carlo simulation to regression?


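I am working on a data analysis project that requires me to run a Monte Carlo simulation of standard linear regression. In this project, the variables we want to vary are the effect size and the sample size of the simulated data. However, I am new to running Monte Carlo simulations and could use some help troubleshooting my code.

To get a handle on the problem, I first worked through the vignette that ships with the MonteCarlo package to understand how the package works.

As I understand it, the MonteCarlo package runs a function that you define a specified number of times and then returns the user-defined output. I tried to write a function that provides the elements the MonteCarlo() function needs: a way of generating data, a statistical method, a decision, and a result. The model originally got to 23% completion and then crashed. Below is the code I pieced together from various sources on the web: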

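#### Creating a function to run regressions in a Monte Carlo simulation.
### Helper that pulls the overall F-test p-value out of a fitted regression model:
lmp <- function(modelobject) {
  # inherits() is safer than class(x) != "lm", because class() can return several values
  if (!inherits(modelobject, "lm")) stop("Not an object of class 'lm'")
  f <- summary(modelobject)$fstatistic  # c(value, numdf, dendf)
  p <- pf(f[1], f[2], f[3], lower.tail = FALSE)
  attributes(p) <- NULL  # drop the "value" name carried over from f
  return(p)
}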

### Creating function for regression analysis to be passed to the MonteCarlo() function.
Regression <- function(n, corr) {
  # create the initial x variable
  x1 <- rnorm(n, 15, 5)

  # x2 as a standardized one-column matrix; it will be modified to meet the criteria
  x2 <- scale(matrix(rnorm(n), ncol = 1))

  # put both into 1 matrix for simplicity
  x12 <- cbind(scale(x1), x2)

  # find the current covariance matrix of the standardized columns
  c1 <- var(x12)

  # inverse Cholesky factor, used to make the sample columns uncorrelated
  chol1 <- solve(chol(c1))

  newx <- x12 %*% chol1

  # create new correlation structure (zeros can be replaced with other r vals)
  newc <- matrix(
    c(1,    corr,
      corr, 1), ncol = 2)

  # Cholesky factor of the target structure; chol() fails if newc is not positive definite
  chol2 <- chol(newc)

  sample <- newx %*% chol2 * sd(x1) + mean(x1)

  ### Defining test statistic to be calculated on each sample.
  stat <- lm(sample[, 1] ~ sample[, 2])

  # get test decision (a p-value is never negative, so abs() is not needed):
  decision <- lmp(stat) < .05

  # return result:
  return(list("decision" = decision))
}

# define parameter grid for Monte Carlo simulation:
n_grid <- c(50, 100, 250, 500)
corr_grid <- seq(0, .9, 0.05)

# collect parameter grids in list:
param_list <- list("n" = n_grid, "corr" = corr_grid)

####Running A Monte Carlo Simulation
library(MonteCarlo)
MC_result<-MonteCarlo(func=Regression, nrep=1000, param_list=param_list)
summary(MC_result)

Instead, I got the following table:

> ### Creating a LaTeX Table
> MakeTable(output=MC_result, rows="n", cols="corr", digits=3, include_meta=FALSE)
\begin{table}[h]
\centering
\resizebox{ 1 \textwidth}{!}{%
\begin{tabular}{ rrrrrrrrrrrrrrrrrrrrr }
\hline\hline\\\\
n/corr &  & 0 & 0.05 & 0.1 & 0.15 & 0.2 & 0.25 & 0.3 & 0.35 & 0.4 & 0.45 & 0.5 & 0.55 & 0.6 & 0.65 & 0.7 & 0.75 & 0.8 & 0.85 & 0.9 \\ 
 &  &  &  &  &  &  &  &  &  &  &  &  &  &  &  &  &  &  &  &  \\ 
50 &  & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 \\ 
100 &  & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 \\ 
250 &  & 0.000 & 0.000 & 0.000 & 0.000 & 0.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 \\ 
500 &  & 0.000 & 0.000 & 0.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 & 1.000 \\ 
\\
\\
\hline\hline
\end{tabular}%
}
\caption{ decision  }
\end{table}

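The problem with this table is that its contents are completely implausible. It makes no sense for a linear regression to produce only 100% significant p-values or 0% significant p-values. There should be a region of the table where some combinations of correlation and sample size lead to some p-values being significant and others not. In other words, some of the proportions should fall strictly between 0 and 1 rather than being exactly 0 or 1.

If you can see what mistake I made with this function that leads to this result, I would greatly appreciate it.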
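
One thing worth checking, offered as a sketch rather than a confirmed diagnosis of the table above: the Cholesky construction in Regression() rescales every simulated sample so that its sample correlation equals corr exactly, not just on average. When that holds, the p-value depends only on n and corr, every replicate in a cell reaches the same decision, and the rejection proportion can only be 0 or 1. The helper names make_sample, draw_sample and slope_p, the seed, and the setting n = 50, corr = 0.3 are illustrative assumptions and are not part of the original code.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Same construction as in Regression(), but returning the simulated data
make_sample <- function(n, corr) {
  x1 <- rnorm(n, 15, 5)
  x2 <- scale(matrix(rnorm(n), ncol = 1))
  x12 <- cbind(scale(x1), x2)
  newx <- x12 %*% solve(chol(var(x12)))        # sample covariance becomes the identity
  newc <- matrix(c(1, corr, corr, 1), ncol = 2)
  newx %*% chol(newc) * sd(x1) + mean(x1)      # sample correlation becomes exactly corr
}

# Hypothetical alternative: draw from a population whose correlation is corr,
# so the sample correlation varies from one replicate to the next
draw_sample <- function(n, corr) {
  z <- matrix(rnorm(2 * n), ncol = 2)          # independent standard normals
  z %*% chol(matrix(c(1, corr, corr, 1), ncol = 2))
}

# p-value of the slope test (same as the overall F test with a single predictor)
slope_p <- function(s) summary(lm(s[, 1] ~ s[, 2]))$coefficients[2, 4]

r_fixed  <- replicate(200, { s <- make_sample(50, 0.3); cor(s[, 1], s[, 2]) })
r_random <- replicate(200, { s <- draw_sample(50, 0.3); cor(s[, 1], s[, 2]) })
range(r_fixed)    # both ends equal 0.3 up to floating-point error
range(r_random)   # spread around 0.3

p_fixed  <- replicate(200, slope_p(make_sample(50, 0.3)))
p_random <- replicate(200, slope_p(draw_sample(50, 0.3)))
mean(p_fixed < 0.05)    # all replicates agree, so this is 0 or 1
mean(p_random < 0.05)   # a proportion strictly between 0 and 1
```

Drawing the data from a population with correlation corr, as draw_sample does, lets the sample correlation vary between replicates, which is what allows intermediate rejection proportions to appear for intermediate combinations of n and corr.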

You never defined the object corr in Regression(); there may be something else in your environment that makes it impossible to decompose newc.

I tried clearing every object from the environment with rm(list=ls(all.names=TRUE)), and nothing changed. As for defining corr, I thought I had defined it by putting it in the new correlation matrix: # create new correlation structure (zeros can be replaced with other r vals) newc

Oh yes, sorry, I missed that. The error message you received tells you one thing: the matrix passed to chol is not positive definite, a.k.a. it cannot be factored. That can happen if corr ends up being 1, for example if you accidentally correlate a variable with itself? I don't know what you are passing to the argument. Though I must say, honestly, you are just asking for trouble by trying to reinvent the wheel. If you want a permutation test, just replicate an lm model 1000 times using a resampling paradigm. Try ?boot

Oh, shoot. Because of the way the sequence was set up, corr_grid contained the value 1. So when the function tried to factor that value, the matrix became 1,1,1,1, which cannot be computed. Perfect. What I need to understand now is why the resulting table looks wrong. I am updating the base post with the new results.

Correct, the result of matrix(c(1,1,1,1), ncol=2) is not invertible.
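
The failure discussed in these comments can be reproduced in isolation. This is a minimal sketch: the grid seq(0, 1, 0.05) is an assumed stand-in for the earlier corr_grid, which is not shown in the post, and the object names are illustrative.

```r
# Hypothetical earlier grid that, unlike seq(0, .9, 0.05), ends at 1
corr_grid_old <- seq(0, 1, 0.05)
max(corr_grid_old)

newc_ok  <- matrix(c(1, 0.9, 0.9, 1), ncol = 2)  # positive definite
newc_bad <- matrix(c(1, 1, 1, 1), ncol = 2)      # what newc becomes when corr is 1

# chol() succeeds for the valid correlation matrix ...
chol_ok <- chol(newc_ok)

# ... and raises an error for the singular one; capture the message instead of stopping
chol_err <- tryCatch(chol(newc_bad), error = function(e) conditionMessage(e))
chol_err

# The determinant is zero, so the matrix is not invertible either
det(newc_bad)
```

Capping the grid below 1, as the posted corr_grid does, keeps every newc positive definite.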
