Remove rows in R where the id is the same but the value in another column differs
I have a data frame like this:
id = c("a2887", "a2887", "a5511","a5511","a2806", "a1491", "a1491", "a4309", "a4309")
plan = c("6V", "6V", "25HS", "50HS", "25HS", "250Mbps", "250Mbps", "15Mbps", "15Mbps")
df = data.frame(id, plan)
It looks like this:
id plan
a2887 6V
a2887 6V
a5511 25HS
a5511 50HS
a2806 25HS
a1491 250Mbps
a1491 250Mbps
a4309 15Mbps
a4309 15Mbps
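I would like to delete the rows where the same id has different values in the plan column, keep only one row for each unique id/plan match, and create a new data frame that looks like this: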
id plan
a2887 6V
a2806 25HS
a1491 250Mbps
a4309 15Mbps
Is there an elegant way to do this?
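Thanks!
We can use dplyr from the tidyverse. After grouping by 'id', filter for the 'id' groups that have exactly one unique value of 'plan', then get the distinct rows: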
library(dplyr)
df %>%
group_by(id) %>%
filter(n_distinct(plan)==1) %>%
distinct()
# A tibble: 4 x 2
# Groups: id [4]
# id plan
# <fctr> <fctr>
#1 a2887 6V
#2 a2806 25HS
#3 a1491 250Mbps
#4 a4309 15Mbps
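Note that the output above is still grouped by 'id' (see the "# Groups: id [4]" line). If a plain, ungrouped result is preferred, one option is to add ungroup() to the pipeline. This is only a sketch; it rebuilds the example data from the question, and the name result is my own choice:

```r
library(dplyr)

# Example data from the question
id   <- c("a2887", "a2887", "a5511", "a5511", "a2806", "a1491", "a1491", "a4309", "a4309")
plan <- c("6V", "6V", "25HS", "50HS", "25HS", "250Mbps", "250Mbps", "15Mbps", "15Mbps")
df   <- data.frame(id, plan)

result <- df %>%
  group_by(id) %>%
  filter(n_distinct(plan) == 1) %>%  # keep ids that map to exactly one plan
  ungroup() %>%                      # drop the grouping before deduplicating
  distinct()                         # collapse repeated id/plan rows

print(result)
```

Later operations on result then apply to the whole table rather than per id.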
A data.table solution:
library(data.table)
setDT(df)
df <- unique(df)
df[, idx := .N, by = id]
df <- df[!(idx > 1), ]
df[, idx := NULL]
id plan
1: a2887 6V
2: a2806 25HS
3: a1491 250Mbps
4: a4309 15Mbps
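The same idea can probably be written without the helper column, by counting the distinct plans per id with uniqueN(). This is a sketch rather than the original answer's code; dt and result are names I picked for illustration, and the example data is rebuilt from the question:

```r
library(data.table)

# Example data from the question
id   <- c("a2887", "a2887", "a5511", "a5511", "a2806", "a1491", "a1491", "a4309", "a4309")
plan <- c("6V", "6V", "25HS", "50HS", "25HS", "250Mbps", "250Mbps", "15Mbps", "15Mbps")
dt   <- data.table(id, plan)

# For each id: if it has exactly one distinct plan, keep its first row;
# otherwise return nothing, which drops the whole group
result <- dt[, if (uniqueN(plan) == 1L) .SD[1L], by = id]

print(result)
```

Unlike the version above, this leaves the original table untouched and returns a new one.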
A base R solution:
# split df into different groups by id after removing duplicates
df <- unique(df)
df <- split(df, df$id)
# keep those 'groups' with only a single row
df <- df[sapply(df, nrow) == 1]
# bind rows together
df <- do.call(rbind, df)
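If you would rather avoid the split/rbind round trip, a similar result can be had with ave(), which computes a per-group value for every row. The sketch below is an alternative I am suggesting, not part of the answer above; n_plans and result are illustrative names, and the data is rebuilt from the question:

```r
# Example data from the question
id   <- c("a2887", "a2887", "a5511", "a5511", "a2806", "a1491", "a1491", "a4309", "a4309")
plan <- c("6V", "6V", "25HS", "50HS", "25HS", "250Mbps", "250Mbps", "15Mbps", "15Mbps")
df   <- data.frame(id, plan)

# For every row, count the distinct plans that occur under that row's id.
# Row indices are passed to ave() so the result stays numeric.
n_plans <- ave(seq_along(df$plan), df$id,
               FUN = function(i) length(unique(df$plan[i])))

# Keep ids with a single plan, then drop the repeated id/plan rows
result <- unique(df[n_plans == 1, ])

print(result)
```

This keeps the original row names of the surviving rows instead of the group-prefixed ones produced by do.call(rbind, ...).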
@Gregor You are right. I edited it to remove the duplicates.
This is awesome. Thanks for sharing!