Splitting rows that contain "|": the separate() function does not split
My data looks like this:
> company
                          name                             category_list
11                     1-4 All              Entertainment|Games|Software
12            1.618 Technology        Networking|Real Estate|Web Hosting
13               1-800-DENTIST                       Health and Wellness
14               1-800-DOCTORS                       Health and Wellness
15 1-800-PublicRelations, Inc. Internet Marketing|Media|Public Relations
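I need to split the category_list column by its values: when a value is pipe-delimited, the row should be split into one row per value.
I tried to do this with the separate() function, but the column is not populated with any values: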
c1 <- company %>% separate(category_list,into=c("primary_Sector"), sep="|")
Expected output:
      name category_list
11 1-4 All Entertainment
12 1-4 All         Games
13 1-4 All      Software
Can anyone tell me what is wrong?
tidyr::separate() splits a column into several columns; tidyr::separate_rows() splits it into several rows:
library(tidyr)
# Build the example data inline (fields separated by ";"),
# then split category_list into one row per pipe-delimited value.
read.table(
  text = "name;category_list
1-4 All;Entertainment|Games|Software
1.618 Technology;Networking|Real Estate|Web Hosting
1-800-DENTIST;Health and Wellness
1-800-DOCTORS;Health and Wellness
1-800-PublicRelations, Inc.;Internet Marketing|Media|Public Relations",
  sep = ";", header = TRUE, stringsAsFactors = FALSE
) %>%
  separate_rows(category_list, sep = "\\|")
##                           name       category_list
## 1                      1-4 All       Entertainment
## 2                      1-4 All               Games
## 3                      1-4 All            Software
## 4             1.618 Technology          Networking
## 5             1.618 Technology         Real Estate
## 6             1.618 Technology         Web Hosting
## 7                1-800-DENTIST Health and Wellness
## 8                1-800-DOCTORS Health and Wellness
## 9  1-800-PublicRelations, Inc.  Internet Marketing
## 10 1-800-PublicRelations, Inc.               Media
## 11 1-800-PublicRelations, Inc.    Public Relations
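To make the column-versus-row distinction concrete, here is a minimal sketch that runs both functions on the same input. The data frame company_demo and the column names sector_1 to sector_3 are made up for illustration and are not part of the original question.

```r
library(tidyr)

# Hypothetical two-row stand-in for the `company` data frame
company_demo <- data.frame(
  name = c("1-4 All", "1-800-DENTIST"),
  category_list = c("Entertainment|Games|Software", "Health and Wellness"),
  stringsAsFactors = FALSE
)

# separate(): one input column becomes several columns; the row count is unchanged.
# fill = "right" pads rows that have fewer than three pieces with NA.
wide <- separate(
  company_demo, category_list,
  into = c("sector_1", "sector_2", "sector_3"),  # hypothetical column names
  sep = "\\|", fill = "right"
)

# separate_rows(): each piece gets its own row; the columns are unchanged.
long <- separate_rows(company_demo, category_list, sep = "\\|")

dim(wide)  # 2 rows, 4 columns
dim(long)  # 4 rows, 2 columns
```

If the goal is one row per category, as in the expected output of the question, the second form is the one that matches.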
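Escape the separator:
company %>% separate(category_list, into = c("primary_Sector"), sep = "\\|")
Why? Because according to the ?separate documentation, sep "is interpreted as a regular expression." In a regular expression the pipe symbol | has a special meaning (alternation), so to indicate a literal pipe you need to escape it with a double backslash. Regular expressions are a broad topic, but you do not strictly need to understand them as long as you know how to escape the separator, as Ronak Shah suggested.
That still does not show the newly split records. If there are already 15 records and the split should add 5 more, there should now be 20, but the df still has 15 records.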
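The two points above, escaping the pipe and the unchanged record count, can be checked with a small sketch. It uses base R's strsplit() to show the regex behaviour, and a one-row data frame (company_demo, a made-up name) to show that separate() with a single name in into keeps only the first piece and never adds rows. The extra = "drop" argument is added here only to silence the warning about discarded pieces.

```r
library(tidyr)

x <- "Entertainment|Games|Software"

# As a regular expression, a bare "|" is an alternation between two empty
# patterns, so it does not split at the literal pipe character.
unescaped <- strsplit(x, "|")[[1]]

# Either escape the pipe or ask for a fixed (non-regex) match.
escaped <- strsplit(x, "\\|")[[1]]
fixed   <- strsplit(x, "|", fixed = TRUE)[[1]]

# With one name in `into`, separate() keeps the first piece per row and
# drops the rest; the number of rows stays the same.
company_demo <- data.frame(name = "1-4 All", category_list = x,
                           stringsAsFactors = FALSE)
first_only <- separate(company_demo, category_list, into = "primary_Sector",
                       sep = "\\|", extra = "drop")

nrow(first_only)           # 1: no rows were added
first_only$primary_Sector  # "Entertainment"
```

So an escaped separator fixes the empty column, but getting additional records requires separate_rows() rather than separate().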