loop: selecting variables for a correlation function in R

Tags: r, variables, loops, dataframe

Here is what I intend to do (for quite a large number of variables and datasets):

Second dataset: contains the actual data

set.seed(1234)
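dataf <- data.frame (yvar = rnorm (10, 10,3), 
    A = sample(c(1,0), 10, T), B = sample(c(1,0), 10, T), 
    c1 = sample (c(1,0), 10, T), D2 = sample (c(1,0), 10, T), 
    # NOTE: F is sampled without the size argument of 10, so it appears to be
    # a single value recycled over all rows (possibly a typo for sample(c(1,0), 10, T))
    E= sample (c(1,0), 10, T),F = sample (c(1,0), T), 
    g1 = sample (c(1,0), 10, T))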

# manual workout:
xtemp <- dataf$A * dataf$B * dataf$c1  # group 1: all members of the group
# Correction to my previous version: the operator is * not +
# (xtemp is the product of all members of a group), i.e.
xtemp <- dataf$D2                      # group 2
xtemp <- dataf$E * dataf$F             # group 3
xtemp <- dataf$g1                      # group 4
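
The answers below index a lookup data frame called mygroupdf, whose definition is not included in this excerpt. A minimal sketch of what it might look like, assuming the column names group and varname used in the answer code and the group membership implied by the manual workout above (the exact rows are an assumption):

```r
# Hypothetical lookup table: the column names `group` and `varname` come from
# the answer code; the rows are inferred from the manual workout and are an
# assumption, not the asker's actual data.
mygroupdf <- data.frame(
  group   = c(1, 1, 1, 2, 3, 3, 4),
  varname = c("A", "B", "c1", "D2", "E", "F", "g1"),
  stringsAsFactors = FALSE  # keep varname as character so it indexes by name
)

# Names of the variables that belong to group 1
group1_vars <- mygroupdf$varname[mygroupdf$group == 1]
print(group1_vars)
```

With a table of this shape, the variables for any group can be pulled out by name instead of being typed by hand.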
My bet:

corrfun <- function (group.no, x=dataf, x.lookup=mygroupdf) {
  xtemp <- apply(x[x.lookup$varname[x.lookup$group == group.no]], 1, prod)

  out <- cor(x$yvar, xtemp)

  return (out)
}

> corrfun(1)
[1] 0.35593
> corrfun(2)
[1] 0.4181311
Another answer:

cbind(
  group = unique(mygroupdf$group),
  corr = 
    do.call(
      c,
      lapply(
        unique(mygroupdf$group),
        function(x) {
          varnames <- unique(mygroupdf[mygroupdf$group == x, 'varname'])
          products <- apply(as.matrix(dataf[, colnames(dataf) %in% varnames]), 1, prod)
          cor(products, dataf$yvar)
        }
      )
    )
)

And yet another answer, using my current favorite library:

library(plyr)
ddply(mygroupdf, .(group), summarise,
      cor=cor(dataf$yvar, apply(dataf[as.character(varname)],1,prod)))
This produces the following result:

  group        cor
1     1  0.3559300
2     2  0.4181311
3     3         NA
4     4 -0.1015003
Warning message:
In cor(dataf$yvar, apply(dataf[as.character(varname)], 1, prod)) :
  the standard deviation is zero
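
The NA for group 3 comes from cor() itself: when one of its inputs has zero variance the correlation is undefined, so R returns NA and emits the warning shown above. Since yvar is drawn with rnorm, the constant input here is presumably the product E * F. A small self-contained sketch with made-up vectors (y and constant_x are illustrative names, not from the question):

```r
set.seed(42)
y <- rnorm(10)              # a varying response, standing in for yvar
constant_x <- rep(0, 10)    # a product column that happens to be constant

# cor() warns "the standard deviation is zero" and returns NA;
# the warning is suppressed here only to keep the output quiet
r <- suppressWarnings(cor(y, constant_x))
print(is.na(r))
```

If a group can legitimately collapse to a constant, checking sd() of the product before calling cor() is one way to handle it explicitly.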

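I'm not sure I understand what you mean when you say that different groups have different variables. Since this is a data.frame, won't the dim be the same? Do you mean different variable names?

@Maiasaura Please see my recent edit. I actually had a typo: it is "*" rather than "+" when creating xtemp. There are two datasets. The issue is that there may be n variables when creating the variable xtemp.

There are only two groups (0, 1), right? And each group has the same set of variables (A, B, c1, D2, E, F), right? Then why can't one function do everything?

I give up. I simply can't follow your example.

Thanks for the answer and for the guess (a correct guess indeed!). Can we put it in a loop instead of typing the group number each time?

Sure you can, or you can use sapply(unique(mygroupdf$group),corrfun)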
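Your code will fail with a subtle error: with default settings, mygroupdf$varname will be a factor, and subsetting a data frame with a factor uses its numeric values, not its character interpretation. You will get the correct output format, but the wrong numbers.

@MvG Good catch! I have options(stringsAsFactors=FALSE) in my .Rprofile, so I forgot about that stuff!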
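
To make the factor pitfall concrete, here is a small sketch on a toy data frame (df and vars are illustrative names, not from the question). Indexing with a factor uses the integer codes of its levels, so the wrong columns can come back silently; wrapping the index in as.character() selects by name. Since R 4.0.0, data.frame() defaults to stringsAsFactors = FALSE, so this mainly bites on older versions or when factors are created explicitly.

```r
df <- data.frame(A = 1:3, B = 4:6, c1 = 7:9)

# Levels are c("A", "c1"), so the integer codes of vars are 2 and 1
vars <- factor(c("c1", "A"), levels = c("A", "c1"))

by_code <- names(df[vars])                # factor index: columns 2 and 1
by_name <- names(df[as.character(vars)])  # character index: columns "c1" and "A"

print(by_code)
print(by_name)
```

The two selections differ, which is why the answers that call as.character() on varname are the safer ones.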
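# Fixed-argument version: handles exactly three variables
corrfun <- function (x, V1, V2, V3) {
  # product of the members, using * rather than + as corrected in the comments
  xtemp <- V1 * V2 * V3
  x <- cor(dataf$yvar, xtemp)
  return (x)
}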
Output of the cbind() answer above:

     group       corr
[1,]     1  0.3559300
[2,]     2  0.4181311
[3,]     3         NA
[4,]     4 -0.1015003
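
# Compact alternative to the cbind()/lapply() version, using sapply
sapply(unique(mygroupdf$group), function(x) {
  # as.character() guards against varname being a factor
  a <- as.character(mygroupdf$varname[mygroupdf$group == x])
  # row-wise product of the group's columns, correlated with yvar
  cor(dataf$yvar, apply(dataf[a],1,prod))
})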