R: Subsetting on the combination of two columns

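I want to subset a data frame on a condition that involves two columns at the same time.

Similar to this:

For example:

Suppose I have a dataset named Gamedat: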

        Games    People Hoursplayed
    goldeneye   Michael           5
    goldeneye  Thatcher           8
    goldeneye    Dexter          12
    goldeneye    Dexter          15
       pacman    Dexter           2
       tetris     Clint           5
       tetris    Dexter           8
    goldeneye  Thatcher          12
       pacman  Thatcher          15
    goldeneye     Clint           2
       pacman   Michael           5
       pacman   Michael           8
       pacman     Clint          12
       tetris      John          15
       tetris     Clint           2
 ageofempires     Clint           5
       pacman    Dexter           8
 ageofempires  Thatcher          12
 ageofempires      John          15
    goldeneye    Dexter           2
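
To make the example reproducible, the table above can be rebuilt as a data frame. This is only a sketch: the values are copied from the table, and the column types (character, character, integer) are an assumption about the real data.

```r
# Rebuild the example table as a data frame (values copied from the table above).
Gamedat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Games        People   Hoursplayed
goldeneye    Michael   5
goldeneye    Thatcher  8
goldeneye    Dexter   12
goldeneye    Dexter   15
pacman       Dexter    2
tetris       Clint     5
tetris       Dexter    8
goldeneye    Thatcher 12
pacman       Thatcher 15
goldeneye    Clint     2
pacman       Michael   5
pacman       Michael   8
pacman       Clint    12
tetris       John     15
tetris       Clint     2
ageofempires Clint     5
pacman       Dexter    8
ageofempires Thatcher 12
ageofempires John     15
goldeneye    Dexter    2
")

# Inspect the structure: 20 rows, 3 columns.
str(Gamedat)
```
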
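Say I want to look at one game, such as goldeneye. I want to see how many players played other games for the same number of hours as they played goldeneye (this is more useful in my real dataset).

So I do this: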

 Gameofinterest <- Gamedat[ grep("goldeneye", Gamedat[ ,1]), ]
 subset(Gamedat, Gamedat[ ,2] %in% Gameofinterest[ ,2] & 
   Gamedat[ ,3] %in% Gameofinterest[ ,3])
But this gives me:

       Games   People Hoursplayed
   goldeneye  Michael           5
   goldeneye Thatcher           8
   goldeneye   Dexter          12
   goldeneye   Dexter          15
      pacman   Dexter           2
      tetris    Clint           5
      tetris   Dexter           8
   goldeneye Thatcher          12
      pacman Thatcher          15
   goldeneye    Clint           2
      pacman  Michael           5
      pacman  Michael           8
      pacman    Clint          12
      tetris    Clint           2
ageofempires    Clint           5
      pacman   Dexter           8
ageofempires Thatcher          12
   goldeneye   Dexter           2
What I actually want is:

         Games   People Hoursplayed
     goldeneye  Michael           5
     goldeneye Thatcher           8
     goldeneye   Dexter          12
     goldeneye   Dexter          15
        pacman   Dexter           2
     goldeneye Thatcher          12
     goldeneye    Clint           2
        pacman  Michael           5
        tetris    Clint           2
  ageofempires Thatcher          12
     goldeneye   Dexter           2
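In short, I want to find the rows that match on "People AND Hoursplayed" as a pair,

rather than on "People" and "Hoursplayed" independently... does that make sense?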
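
To make the distinction concrete, here is a small sketch of why two independent %in% tests let extra rows through. It rebuilds the example data inline; the helper names goldeneye and independent are only illustrative.

```r
# Example data, rebuilt from the table in the question.
Gamedat <- data.frame(
  Games = c("goldeneye", "goldeneye", "goldeneye", "goldeneye", "pacman",
            "tetris", "tetris", "goldeneye", "pacman", "goldeneye",
            "pacman", "pacman", "pacman", "tetris", "tetris",
            "ageofempires", "pacman", "ageofempires", "ageofempires", "goldeneye"),
  People = c("Michael", "Thatcher", "Dexter", "Dexter", "Dexter",
             "Clint", "Dexter", "Thatcher", "Thatcher", "Clint",
             "Michael", "Michael", "Clint", "John", "Clint",
             "Clint", "Dexter", "Thatcher", "John", "Dexter"),
  Hoursplayed = rep(c(5, 8, 12, 15, 2), times = 4),
  stringsAsFactors = FALSE
)

goldeneye <- Gamedat[Gamedat$Games == "goldeneye", ]

# Independent tests: each column is checked against its own pool of values.
independent <- Gamedat$People %in% goldeneye$People &
  Gamedat$Hoursplayed %in% goldeneye$Hoursplayed

# Row 6 is (tetris, Clint, 5). Clint played goldeneye, and somebody (Michael)
# played goldeneye for 5 hours, so both tests pass...
Gamedat[6, ]
independent[6]   # TRUE

# ...but no goldeneye row has the pair (Clint, 5).
any(goldeneye$People == "Clint" & goldeneye$Hoursplayed == 5)   # FALSE

# The independent filter keeps 18 rows, as in the unwanted output above.
sum(independent)
```
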

I know I can do:

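 Gamedat$PHpaste <- paste(Gamedat$People, Gamedat$Hoursplayed, sep="")
 # Gameofinterest has to be re-created after adding the key, so that it has a column 4
 Gameofinterest <- Gamedat[ grep("goldeneye", Gamedat[ ,1]), ]

 Gamedat[Gamedat[ ,4] %in% Gameofinterest[ ,4], ]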

Is there something more elegant?

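I think this can be done with dplyr. First, use filter to retrieve the rows where the game is goldeneye. Then use inner_join to join those rows back to the original data on People and Hoursplayed. Optionally, select the columns you need and arrange by People.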

library(dplyr)
Gamedat %>% 
  filter(Games == "goldeneye") %>% 
  inner_join(Gamedat, by = c("People", "Hoursplayed")) %>% 
  select(Games = Games.y, People, Hoursplayed) %>% 
  arrange(People)
Result:

          Games   People Hoursplayed
1     goldeneye    Clint           2
2        tetris    Clint           2
3     goldeneye   Dexter          12
4     goldeneye   Dexter          15
5        pacman   Dexter           2
6     goldeneye   Dexter           2
7     goldeneye  Michael           5
8        pacman  Michael           5
9     goldeneye Thatcher           8
10    goldeneye Thatcher          12
11 ageofempires Thatcher          12
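
As a possible variation (not part of the answer above), dplyr's semi_join keeps the rows of the first table that have a match in the second, without adding columns or repeating rows, so the Games.x/Games.y renaming step is not needed and the original row order is preserved. A minimal sketch, assuming the example data from the question; the names goldeneye_rows and result are only illustrative:

```r
library(dplyr)

# Example data, rebuilt from the table in the question.
Gamedat <- data.frame(
  Games = c("goldeneye", "goldeneye", "goldeneye", "goldeneye", "pacman",
            "tetris", "tetris", "goldeneye", "pacman", "goldeneye",
            "pacman", "pacman", "pacman", "tetris", "tetris",
            "ageofempires", "pacman", "ageofempires", "ageofempires", "goldeneye"),
  People = c("Michael", "Thatcher", "Dexter", "Dexter", "Dexter",
             "Clint", "Dexter", "Thatcher", "Thatcher", "Clint",
             "Michael", "Michael", "Clint", "John", "Clint",
             "Clint", "Dexter", "Thatcher", "John", "Dexter"),
  Hoursplayed = rep(c(5, 8, 12, 15, 2), times = 4),
  stringsAsFactors = FALSE
)

# Rows for the game of interest; these supply the (People, Hoursplayed) pairs.
goldeneye_rows <- filter(Gamedat, Games == "goldeneye")

# Keep every row of Gamedat whose (People, Hoursplayed) pair occurs in goldeneye_rows.
result <- semi_join(Gamedat, goldeneye_rows, by = c("People", "Hoursplayed"))
result
```

With the example data this should return the 11 rows listed under "What I actually want", in their original order.
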

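Is your expected result correct? Dexter played pacman for 2 hours, but played goldeneye for 29 hours... Is it because 2 of those 29 hours are a separate record? The last row shows that Dexter played goldeneye for 2 hours, so that is a correct match.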