
R: compute, arrange columns and select "within" a data.frame


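I often get a data frame from some computation that I want to clean up, rename, and rearrange by column before output. All of the versions below work; the simple data.frame version comes closest to what I want.

Is there a way to combine the in-data-frame computation of within and mutate with the column-order preservation of data.frame, without having to add the extra [,...] at the end?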

library(plyr) 

# Given this chaotically named data.frame
d = expand.grid(VISIT=as.factor(1:2),Biochem=letters[1:2],time=1:5,
                subj=as.factor(1:3))
d$Value1 =round(rnorm(nrow(d)),2)
d$val2 = round(rnorm(nrow(d)),2)

# I would like to cleanup, compute and rearrange columns

# Simple and almost perfect
dDataframe = with(d, data.frame(
  biochem = Biochem,
  subj = subj,
  visit = VISIT,
  value1 = Value1*3 
))
# This simple solution is almost perfect, 
# but requires one more line
dDataframe$value2 = dDataframe$value1*d$val2

# For the following methods I have to reorder 
# and select in a second step

# use mutate from plyr to allow computation on computed values,
# which transform cannot do.
dMutate =   mutate(d,
  biochem = Biochem,
  subj = subj,
  visit = VISIT,
  value1 = Value1*3, #assume this is a time consuming function
  value2 = value1*val2
  # Could set fields = NULL here to remove,
  # but this does not help getting column order
)[,c("biochem","subj","visit","value1","value2")]

# use within. Same problem, order not preserved
dWithin = within(d, {
  biochem = Biochem
  subj = subj
  visit = VISIT
  value1 = Value1*3
  value2 = value1*val2       
})[,c("biochem","subj","visit","value1","value2")]


all.equal(dDataframe,dWithin)
all.equal(dDataframe,dMutate)
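
You can use summarise or summarize from the plyr package. From the documentation:

Summarise works in an analogous way to transform, except instead of adding columns to an existing data frame, it creates a new data frame. [...]

For example: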

library(plyr)
summarize(d,
  biochem = Biochem,
  subj    = subj,
  visit   = VISIT,
  value1  = Value1 * 3,
  value2  = value1 * val2       
)
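
To make the column-order claim concrete, here is a minimal sketch that rebuilds the question's data and inspects the result of summarize. It assumes the plyr package is installed; the seed and the name dSummarize are illustrative and not part of the original answer.

```r
library(plyr)  # assumption: plyr is installed

set.seed(1)    # illustrative seed, only so the sketch is reproducible
d <- expand.grid(VISIT = as.factor(1:2), Biochem = letters[1:2], time = 1:5,
                 subj = as.factor(1:3))
d$Value1 <- round(rnorm(nrow(d)), 2)
d$val2   <- round(rnorm(nrow(d)), 2)

# Arguments are evaluated in order, so value2 can use the new value1
dSummarize <- summarize(d,
  biochem = Biochem,
  subj    = subj,
  visit   = VISIT,
  value1  = Value1 * 3,
  value2  = value1 * val2
)

# Columns come back in the order the arguments were written
names(dSummarize)

# No expression aggregates, so there is still one row per input row
nrow(dSummarize) == nrow(d)
```

Because only the named expressions are returned, columns such as time are dropped without a trailing [, ...] selection.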

If you are willing to move to data.table, you can perform most of these operations by reference and avoid the copying associated with [.

library(data.table)
DT <- data.table(d)
# Assigning value1 inside list() makes it available to the value2 expression
DT[, list(
  biochem = Biochem,
  subj    = subj,
  visit   = VISIT,
  value1  = value1 <- Value1 * 3,
  value2  = value1 * val2
)]

In your simple data.frame, I would use names

You could use the trick introduced by mnel, with the function … value1 = val1

@mnel's trick is an interesting one that I did not know, but nothing beats summarize.

I knew I was missing something obvious!

This is much more verbose than the simple data.frame solution. It may be good for efficiency, but for clarity summarize is definitely the winner.

@Dietermene - I added a simpler, less memory-efficient approach.

That looks much better. Note to self: never forget the list in data.table.
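
The by-reference route mentioned in the data.table answer is not shown on this page. The following is only a sketch of what it might look like, not the answerer's code. It assumes the data.table package is installed and uses setnames, := and setcolorder, which modify DT in place; the seed is illustrative.

```r
library(data.table)  # assumption: data.table is installed

set.seed(1)          # illustrative seed, only so the sketch is reproducible
d <- expand.grid(VISIT = as.factor(1:2), Biochem = letters[1:2], time = 1:5,
                 subj = as.factor(1:3))
d$Value1 <- round(rnorm(nrow(d)), 2)
d$val2   <- round(rnorm(nrow(d)), 2)

DT <- as.data.table(d)  # one copy here; the steps below change DT in place

# Rename columns by reference
setnames(DT, c("VISIT", "Biochem"), c("visit", "biochem"))

# Add computed columns by reference; value1 must exist before value2 uses it
DT[, value1 := Value1 * 3]
DT[, value2 := value1 * val2]

# Drop the columns that are no longer needed
DT[, c("Value1", "val2", "time") := NULL]

# Reorder columns by reference
setcolorder(DT, c("biochem", "subj", "visit", "value1", "value2"))

names(DT)
```

This takes more lines than the list() version, which matches the remark above that the by-reference form trades clarity for memory efficiency.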