In R, how do I extract all rows when a level of one column contains all levels of another column?
I have the following data:
ID INDUSTRY PRODUCT
625109 PersonalCare Neolone Preservatives
199672 PersonalCare Neolone Preservatives
227047 Pharma Optiphen
186117 Food Sasol BHT
625109 PersonalCare Optiphen
227047 Food Neolone Preservatives
I want to extract the rows for an ID only if that ID has both the Neolone Preservatives and the Optiphen products.
Expected result:
ID INDUSTRY PRODUCT
625109 PersonalCare Neolone Preservatives
227047 Pharma Optiphen
625109 PersonalCare Optiphen
227047 Food Neolone Preservatives
IDs 625109 and 227047 each have both products, so their rows are extracted. How can I do this in R?

This should work:
library(dplyr)
df <- data.frame(ID = c(62, 19, 22, 18, 62, 22),
INDUSTRY = c("PC", "PC", "P", "F", "PC", "F"),
PRODUCT = c("NP", "NP", "O", "SB", "O", "NP"))
df %>%
  group_by(ID) %>%
  # keep a group only if it has at least one "NP" row and at least one "O" row
  filter(any(PRODUCT %in% "NP") & any(PRODUCT %in% "O"))
# A tibble: 4 x 3
# Groups: ID [2]
ID INDUSTRY PRODUCT
<dbl> <fctr> <fctr>
1 62 PC NP
2 22 P O
3 62 PC O
4 22 F NP
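You can do this with the dplyr library. The call below keeps only the rows where INDUSTRY is PersonalCare and PRODUCT is Optiphen:
library(dplyr)
filteredData <- data %>%
  filter(INDUSTRY == 'PersonalCare', PRODUCT == 'Optiphen')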
There are several ways to do this.
With dplyr:
df %>%
  group_by(ID) %>%
  # keep a group only if every wanted product appears among its rows
  filter(all(c("Neolone Preservatives", "Optiphen") %in% PRODUCT))
# ID INDUSTRY PRODUCT
# <int> <chr> <chr>
#1 625109 PersonalCare Neolone Preservatives
#2 227047 Pharma Optiphen
#3 625109 PersonalCare Optiphen
#4 227047 Food Neolone Preservatives
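The same idea can also be written without dplyr. The following is a minimal base R sketch, not part of the original answer: it rebuilds the question's table as a data frame (the column types and the names `wanted`, `ids_per_product`, `keep_ids` and `result` are assumptions made for illustration), collects the IDs that have each wanted product, intersects those sets, and keeps every row belonging to the surviving IDs.

```r
# Rebuild the question's data (assumed types: numeric ID, character columns)
df <- data.frame(
  ID = c(625109, 199672, 227047, 186117, 625109, 227047),
  INDUSTRY = c("PersonalCare", "PersonalCare", "Pharma",
               "Food", "PersonalCare", "Food"),
  PRODUCT = c("Neolone Preservatives", "Neolone Preservatives", "Optiphen",
              "Sasol BHT", "Optiphen", "Neolone Preservatives"),
  stringsAsFactors = FALSE
)

# Products that an ID must have, all of them, to be kept
wanted <- c("Neolone Preservatives", "Optiphen")

# For each wanted product, the set of IDs that have at least one row with it
ids_per_product <- lapply(wanted, function(p) unique(df$ID[df$PRODUCT == p]))

# IDs present in every one of those sets
keep_ids <- Reduce(intersect, ids_per_product)

# All rows belonging to the kept IDs, in their original order
result <- df[df$ID %in% keep_ids, ]
print(result)
```

Because `wanted` is an ordinary character vector, the same code should work for three or more required products without further changes.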