Duplicating rows that meet a condition in R


I want to map one category in R to multiple categories.

I have a data frame:

Region var1 var2
Texas  XX   XX 
Texas  XX   XX
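I need to relabel Texas as both "Dallas" and "Houston". In other words, "Dallas" and "Houston" will share the same values of var1 and var2.

How can I create a data frame like this: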

Region var1 var2 Region2
Texas  XX   XX   Dallas
Texas  XX   XX   Dallas
Texas  XX   XX   Houston
Texas  XX   XX   Houston

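Presumably this involves duplicating some rows, conditional on Region == "Texas"?

If you create a separate table for the new regions, this is essentially a merge operation: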

big <- data.frame(Region=rep("Texas",2), Region2=c("Dallas","Houston"))
merge(dat,big)
#  Region var1 var2 Region2
#1  Texas   XX   XX  Dallas
#2  Texas   XX   XX Houston
#3  Texas   XX   XX  Dallas
#4  Texas   XX   XX Houston
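
To make the "conditional on Region == Texas" part concrete, here is a minimal sketch that assumes the data also contains a region that should not be split. The "Ohio" row, its values, and the name `lookup` are hypothetical and only there for illustration. Passing `all.x = TRUE` to `merge` keeps rows with no entry in the lookup table and leaves their Region2 as NA.

```r
# Hypothetical data: one region to be split and one to be left alone.
dat <- data.frame(
  Region = c("Texas", "Texas", "Ohio"),
  var1   = c("XX", "XX", "YY"),
  var2   = c("XX", "XX", "YY"),
  stringsAsFactors = FALSE
)

# Lookup table: one row per (Region, Region2) pair.
lookup <- data.frame(
  Region  = c("Texas", "Texas"),
  Region2 = c("Dallas", "Houston"),
  stringsAsFactors = FALSE
)

# Each Texas row is matched with both sub-regions, so it appears twice.
# all.x = TRUE keeps rows of dat whose Region is absent from lookup;
# those rows get NA in Region2.
res <- merge(dat, lookup, by = "Region", all.x = TRUE)
print(res)
```

Without `all.x = TRUE`, the default inner join would drop every region that has no row in the lookup table.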

Another option, without merge, is to transform the dataset by creating "Region2" and then replicating the sequence of rows to expand it:

transform(df1, Region2 = c("Dallas", "Houston"))[rep(seq_len(nrow(df1)), each = 2), ]
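
The one-liner above assigns Dallas to row 1 and Houston to row 2 before repeating the rows, so it appears to rely on the row count lining up with the number of sub-regions. A more general sketch of the same idea, still without merge, is to replicate the rows first and attach the labels afterwards. The three-row `df1`, its values, and the vector name `subregions` are assumptions made for illustration.

```r
# Hypothetical input with distinct values so the pairing is visible.
df1 <- data.frame(
  Region = c("Texas", "Texas", "Texas"),
  var1   = c("a", "b", "c"),
  var2   = c("x", "y", "z"),
  stringsAsFactors = FALSE
)
subregions <- c("Dallas", "Houston")

# Repeat every row index once per sub-region: 1, 1, 2, 2, 3, 3.
idx <- rep(seq_len(nrow(df1)), each = length(subregions))
expanded <- df1[idx, ]

# Cycle the sub-region labels so each original row gets every label once.
expanded$Region2 <- rep(subregions, times = nrow(df1))
rownames(expanded) <- NULL
print(expanded)
```

Here every original row ends up paired with both Dallas and Houston, regardless of how many rows the input has.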

Using dplyr, assuming you have a data frame with the sub-regions:

library(dplyr)
df <- data.frame(
    Region = c("Texas", "Texas"),
    var1 = c("XX", "XX"),
    var2 = c("XX", "XX")
    )

regions <- data.frame(
    Region = c("Texas", "Texas"),
    Region2 = c("Houston", "Dallas")
    )

df %>% right_join(regions, by = "Region")
#   Region var1 var2 Region2
# 1  Texas   XX   XX Houston
# 2  Texas   XX   XX Houston
# 3  Texas   XX   XX  Dallas
# 4  Texas   XX   XX  Dallas