R: create a new column that combines the information in two columns in alphabetical order
I have a football teams dataset that looks like this:
Home_team Away_team Home_score Away_score
Arsenal Chelsea 1 3
Manchester U Blackburn 2 9
Liverpool Leeds 0 8
Chelsea Arsenal 4 1
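I want to group the teams that played each other, regardless of which side was home or away. For example, if Chelsea play Arsenal, then whether the match is at Chelsea or at Arsenal, I want the new column "teams_involved" to be Arsenal - Chelsea. I think the way to do this is to put the two teams into the new column in alphabetical order, but I don't know how to do that.
Expected output: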
Home_team Away_team Home_score Away_score teams_involved
Arsenal Chelsea 1 3 Arsenal - Chelsea
Manchester U Blackburn 2 9 Blackburn - Manchester U
Liverpool Leeds 0 8 Leeds - Liverpool
Chelsea Arsenal 4 1 Arsenal - Chelsea
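The reason I want this is so that I can see each team's wins against a particular opponent, regardless of where the match was played.
Sorting the two team names within each row does this; a simple ifelse statement also works (both are shown below):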
df <- read.table(text = "
Home_team Away_team Home_score Away_score
Arsenal Chelsea 1 3
ManchesterU Blackburn 2 9
Liverpool Leeds 0 8
Chelsea Arsenal 4 1
", header = TRUE, stringsAsFactors = FALSE)
library(dplyr)
df %>%
  rowwise() %>%   # for each row
  # sort the two teams alphabetically, then combine them separated by " - "
  mutate(Teams = paste(sort(c(Home_team, Away_team)), collapse = " - ")) %>%
  ungroup()       # forget the row grouping
# # A tibble: 4 x 5
# Home_team Away_team Home_score Away_score Teams
# <chr> <chr> <int> <int> <chr>
# 1 Arsenal Chelsea 1 3 Arsenal - Chelsea
# 2 ManchesterU Blackburn 2 9 Blackburn - ManchesterU
# 3 Liverpool Leeds 0 8 Leeds - Liverpool
# 4 Chelsea Arsenal 4 1 Arsenal - Chelsea
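The row-wise idea above (sort the two names, then paste them) can also be sketched without dplyr. This is only an illustration under the assumption that the team names are plain character vectors; `pair_key()` is a hypothetical helper name, not a function from any package.

```r
# Team names copied from the sample data in the question
home <- c("Arsenal", "ManchesterU", "Liverpool", "Chelsea")
away <- c("Chelsea", "Blackburn", "Leeds", "Arsenal")

# pair_key() is a hypothetical helper: for each position i it sorts the
# two names and joins them with " - ", so the result ignores home/away order
pair_key <- function(a, b) {
  vapply(
    seq_along(a),
    function(i) paste(sort(c(a[i], b[i])), collapse = " - "),
    character(1)  # vapply guarantees one string per row
  )
}

teams <- pair_key(home, away)
print(teams)
```

Note that `sort()` compares strings using the collation rules of the current locale, so names that differ only in case or accents may order differently on different systems.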
df$teams_involved <- ifelse(df$Home_team > df$Away_team,
paste(df$Away_team, df$Home_team, sep = " - "),
paste(df$Home_team, df$Away_team, sep = " - "))
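We can use map2_chr (from purrr) to loop over the rows and sort the elements of the 'Home_team' and 'Away_team' columns alphabetically. Plain map2 also works, but it returns a list column rather than a character column.
library(tidyverse)
df %>%
  mutate(Teams = map2_chr(Home_team, Away_team, ~
    paste(sort(c(.x, .y)), collapse = ' - ')))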
# Home_team Away_team Home_score Away_score Teams
#1 Arsenal Chelsea 1 3 Arsenal - Chelsea
#2 ManchesterU Blackburn 2 9 Blackburn - ManchesterU
#3 Liverpool Leeds 0 8 Leeds - Liverpool
#4 Chelsea Arsenal 4 1 Arsenal - Chelsea
Another option is pmin/pmax:
df %>%
mutate(Teams = paste(pmin(Home_team, Away_team),
pmax(Home_team, Away_team), sep= " - "))
Or using base R:
df$Teams <- paste(do.call(pmin, df[1:2]), do.call(pmax, df[1:2]), sep= ' - ')
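To see why pmin/pmax gives an order-independent key, here is a minimal sketch on two short character vectors. The vectors `a` and `b` are made up for illustration. `pmin()` returns the elementwise "smaller" string and `pmax()` the "larger" one, so no row-wise loop is needed.

```r
# Two hypothetical fixtures: Chelsea v Arsenal and Leeds v Liverpool
a <- c("Chelsea", "Leeds")
b <- c("Arsenal", "Liverpool")

first  <- pmin(a, b)  # alphabetically earlier name in each pair
second <- pmax(a, b)  # alphabetically later name in each pair

key <- paste(first, second, sep = " - ")
print(key)
```

Because both functions are vectorised, this approach tends to scale better on large data frames than sorting each row separately.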
@Ronak This is not a duplicate, because this question is asking for a {dplyr} solution.
df <- structure(list(Home_team = c("Arsenal", "ManchesterU", "Liverpool",
"Chelsea"), Away_team = c("Chelsea", "Blackburn", "Leeds", "Arsenal"
), Home_score = c(1L, 2L, 0L, 4L), Away_score = c(3L, 9L, 8L,
1L)), .Names = c("Home_team", "Away_team", "Home_score", "Away_score"
), class = "data.frame", row.names = c(NA, -4L))
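The question's stated goal is to see each team's wins against a given opponent regardless of venue. Once the pairing key exists, that can be sketched in base R as follows. The `winner` column and the `wins` table are assumptions added for illustration; they are not part of the original answers, and draws are simply marked as NA.

```r
# Sample data copied from the question
df <- data.frame(
  Home_team  = c("Arsenal", "ManchesterU", "Liverpool", "Chelsea"),
  Away_team  = c("Chelsea", "Blackburn", "Leeds", "Arsenal"),
  Home_score = c(1L, 2L, 0L, 4L),
  Away_score = c(3L, 9L, 8L, 1L),
  stringsAsFactors = FALSE
)

# Order-independent key for each fixture
df$teams_involved <- paste(pmin(df$Home_team, df$Away_team),
                           pmax(df$Home_team, df$Away_team), sep = " - ")

# Winner of each match; NA marks a draw (hypothetical helper column)
df$winner <- ifelse(df$Home_score > df$Away_score, df$Home_team,
             ifelse(df$Away_score > df$Home_score, df$Away_team,
                    NA_character_))

# Count wins per team within each pairing, dropping empty combinations
wins <- as.data.frame(
  table(teams_involved = df$teams_involved, winner = df$winner),
  stringsAsFactors = FALSE
)
wins <- wins[wins$Freq > 0, ]
print(wins)
```

With the sample data, both Arsenal v Chelsea matches fall under the single key "Arsenal - Chelsea", so Chelsea's home win and away win are counted together.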