Is there a way to create Stata's _merge variable with R's merge()?


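After a merge, Stata automatically creates a variable named _merge that indicates, for each observation, whether it was matched across the two datasets. Is there a way to get such a variable from R's merge() function?

In Stata, the possible values of _merge are 1 (observation found in the master dataset only), 2 (found in the using dataset only), and 3 (matched in both). Note that _merge can also take the values 4 and 5, which arise with the update option.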

In R, you can pass all = TRUE, all.x = TRUE, or all.y = TRUE to merge(), e.g.:

# keep unmatched rows from both x and y
merge(x, y, by = intersect(names(x), names(y)), all = TRUE)
# keep every row of x, matched or not
merge(x, y, by = intersect(names(x), names(y)), all.x = TRUE)
# keep every row of y, matched or not
merge(x, y, by = intersect(names(x), names(y)), all.y = TRUE)
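To make the difference between the three arguments concrete, here is a minimal sketch on two small made-up data frames. The names left, right, m_all, m_left and m_right and the values are illustrative only and do not come from the question.

```r
# Two small illustrative data frames sharing the key column "id"
left  <- data.frame(id = c("a", "b", "c"), v_left  = 1:3)
right <- data.frame(id = c("b", "c", "d"), v_right = 4:6)

m_all   <- merge(left, right, by = "id", all   = TRUE)  # keys from either side
m_left  <- merge(left, right, by = "id", all.x = TRUE)  # every key in left
m_right <- merge(left, right, by = "id", all.y = TRUE)  # every key in right

# Unmatched rows show up with NA in the columns of the other data frame
m_all
nrow(m_all)    # 4 keys: a, b, c, d
nrow(m_left)   # 3 keys: a, b, c
nrow(m_right)  # 3 keys: b, c, d
```

None of these calls records which side a row came from; that has to be reconstructed afterwards, which is what the answers in this thread do.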

I wrote the following function based on @Metrics' answer. It adds a variable named merge to the resulting dataset that flags each observation the way Stata does.

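stata.merge <- function(x, y, by = intersect(names(x), names(y))) {

  # Temporarily replace existing NAs with Inf so that complete.cases() only
  # flags NAs introduced by the merge. This works as intended for numeric
  # columns and assumes the data contain no genuine Inf values.
  x[is.na(x)] <- Inf
  y[is.na(y)] <- Inf

  # Rows found in both data frames
  matched <- merge(x, y, by.x = by, by.y = by, all = TRUE)
  matched <- matched[complete.cases(matched), ]
  matched$merge <- "matched"
  # Rows found only in x (Stata's master dataset)
  master <- merge(x, y, by.x = by, by.y = by, all.x = TRUE)
  master <- master[!complete.cases(master), ]
  master$merge <- "master"
  # Rows found only in y (Stata's using dataset)
  using <- merge(x, y, by.x = by, by.y = by, all.y = TRUE)
  using <- using[!complete.cases(using), ]
  using$merge <- "using"

  df <- rbind(matched, master, using)
  # Turn the placeholder Inf values back into NA
  df[sapply(df, is.infinite)] <- NA
  df
}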
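Here is (I think) a simpler and more efficient version of the previous poster's stata.merge function. It assumes that neither data frame has a variable named "new1" or "new2"; if that assumption is wrong, change the variable names inside the function. The function takes three arguments: the first data frame, the second data frame, and the value to pass to the by = argument of merge(). The resulting stat.merge.variable is 1 for rows found only in the first data frame, 2 for rows found only in the second, and 3 for rows found in both.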

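stata.merge <- function(x, y, name) {
  # Marker columns: 1 flags rows coming from x, 2 flags rows coming from y
  x$new1 <- 1
  y$new2 <- 2
  df <- merge(x, y, by = name, all = TRUE)
  # Sum of the markers: 1 = only in x, 2 = only in y, 3 = in both
  df$stat.merge.variable <- rowSums(df[, c("new1", "new2")], na.rm = TRUE)
  # Drop the temporary marker columns
  df$new1 <- NULL
  df$new2 <- NULL
  df
}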
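As a usage sketch, the function above can be applied to the two example data frames from this thread (df1 and df2), and the numeric code can then be labelled and tabulated, which is roughly the summary of _merge that the question asks for. The column name merge_label and the label strings are my own choice, not part of the answer.

```r
# The function from the answer above, repeated so the example is self-contained
stata.merge <- function(x, y, name) {
  x$new1 <- 1
  y$new2 <- 2
  df <- merge(x, y, by = name, all = TRUE)
  df$stat.merge.variable <- rowSums(df[, c("new1", "new2")], na.rm = TRUE)
  df$new1 <- NULL
  df$new2 <- NULL
  df
}

# Example data frames used in this thread
df1 <- data.frame(id = letters[c(1:5, 8:9)], v1 = c(1:5, 8:9))
df2 <- data.frame(id = letters[1:8], v1 = c(1:7, NA))

res <- stata.merge(df1, df2, "id")

# Attach Stata-style labels to the numeric codes (labels are illustrative)
res$merge_label <- factor(res$stat.merge.variable, levels = 1:3,
                          labels = c("master", "using", "matched"))

# Frequency table of the indicator, similar in spirit to tabulating _merge
table(res$merge_label)
```

With these data the table reports 1 master-only row (i), 2 using-only rows (f, g), and 6 matched rows (a to e and h). Row h counts as matched even though its v1 in df2 is NA, because the markers track where a row came from rather than whether its values are missing.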

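Thanks for your reply. That is rather laborious. I was hoping to get the same _merge variable after using merge() and then apply summary to it.

Well, R is not Stata.

As I read your code, you do assume the names new1 and new2, but explain that they can be changed. Simple edit: don't.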
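One way to avoid assuming that new1 and new2 are free is to generate marker names that cannot clash with existing columns. The following is only a sketch of that idea: the function name merge_with_indicator, the indicator argument, and the ".in_x"/".in_y" stems are hypothetical, and it relies on base R's make.unique() to pick unused names.

```r
# Hypothetical helper: full merge plus a Stata-style code
# (1 = only in x, 2 = only in y, 3 = in both)
merge_with_indicator <- function(x, y, by = intersect(names(x), names(y)),
                                 indicator = "merge_code") {
  force(by)  # evaluate the default before x and y gain marker columns
  taken <- unique(c(names(x), names(y)))
  stopifnot(!indicator %in% taken)  # refuse to overwrite an existing column

  # make.unique() renames the two stems if they collide with existing names
  markers <- utils::tail(make.unique(c(taken, ".in_x", ".in_y")), 2)
  x[[markers[1]]] <- rep(1, nrow(x))
  y[[markers[2]]] <- rep(2, nrow(y))

  out <- merge(x, y, by = by, all = TRUE)
  code <- rowSums(out[, markers], na.rm = TRUE)
  out[markers] <- NULL  # drop the temporary marker columns
  out[[indicator]] <- unname(code)
  out
}

# Works even when the inputs already contain columns named new1 and new2
x <- data.frame(id = c("a", "b"), new1 = c(10, 20))
y <- data.frame(id = c("b", "c"), new2 = c(30, 40))
res <- merge_with_indicator(x, y, by = "id")
res
```

The trade-off is a little more code in exchange for not having to inspect the column names by hand before every merge.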
df1 <- data.frame(id = letters[c(1:5,8:9)], v1=c(1:5,8:9))
df1

   id v1
1  a  1
2  b  2
3  c  3
4  d  4
5  e  5
6  h  8
7  i  9

df2 <- data.frame(id = letters[1:8], v1=c(1:7,NA))
df2

  id v1
1  a  1
2  b  2
3  c  3
4  d  4
5  e  5
6  f  6
7  g  7
8  h NA

stata.merge(df1,df2, by = "id")

   id v1.x v1.y   merge
1   a    1    1 matched
2   b    2    2 matched
3   c    3    3 matched
4   d    4    4 matched
5   e    5    5 matched
6   h    8   NA matched
7   i    9   NA  master
71  f   NA    6   using
8   g   NA    7   using