R: Convert a comma-separated column into columns with boolean values


In a column of my data.frame named services, I have the following comma-separated data:

> dput(structure(df$services[1:5]))
list("Global Expense Management, Company Privacy Policy", "Removal Services, Global Expense Management", 
    "Removal Services, Exception & Cost Admin, Global Cost Estimate, Company Privacy Policy", 
    "Removal Services, Exception & Cost Admin, Ancillary Services, Global Cost Estimate, Global Expense Management, Perm Storage, Company Privacy Policy", 
    "Global Expense Management, Company Privacy Policy")
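I want to convert this data into separate columns in my data frame, setting TRUE under a service's column if the row contains that service and FALSE otherwise.

For example, I would like my data frame to look like this: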

GlobalExpenseManagement    |    CompanyPrivacyPolicy   |   etc...
TRUE                            TRUE
TRUE                            FALSE
FALSE                           TRUE
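I assume I have to split the comma-separated values, group them to remove duplicates, and then add them to my data frame as names(df). However, I don't know how to iterate over the data set and set TRUE/FALSE depending on whether the row contains that service.

Does anyone have a good idea of how to do this?

Edit: Merging the data back

I am now trying to combine the new matrix with my existing data frame, replacing services with the new corresponding columns. Based on the great answer from @plafort below, I tried this: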

names(df) <- headnames
rbind(mat, df)
Try:
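A minimal base-R sketch of the steps outlined in the question (split on commas, collect the unique service names, then test each row for membership). The names services, lst, headnames, and mat are illustrative; headnames and mat are chosen to match the names used in the question's edit, and this is not necessarily the exact code of the original answer:

```r
# Sample values from the question's dput() output, flattened to a character vector.
services <- unlist(list(
  "Global Expense Management, Company Privacy Policy",
  "Removal Services, Global Expense Management",
  "Removal Services, Exception & Cost Admin, Global Cost Estimate, Company Privacy Policy",
  "Removal Services, Exception & Cost Admin, Ancillary Services, Global Cost Estimate, Global Expense Management, Perm Storage, Company Privacy Policy",
  "Global Expense Management, Company Privacy Policy"))

# 1. Split each row on commas, dropping the whitespace that follows each comma.
lst <- strsplit(services, ",\\s*")

# 2. Collect the distinct service names; these become the column names.
headnames <- unique(unlist(lst))

# 3. For every row, test which of the service names it contains.
#    sapply() returns one column per row, so transpose to get one row per record.
mat <- t(sapply(lst, function(x) headnames %in% x))
colnames(mat) <- headnames

# Inspect two of the resulting logical columns.
mat[, c("Global Expense Management", "Company Privacy Policy")]
```

The result is a logical matrix with one row per record and one column per distinct service.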


I would consider cSplit_e from my "splitstackshape" package. The result is binary "1" and "0" rather than TRUE and FALSE, but that should be easy to convert.

Sample data:

df <- data.frame(services = I(
  list("Global Expense Management, Company Privacy Policy", "Removal Services, Global Expense Management", 
       "Removal Services, Exception & Cost Admin, Global Cost Estimate, Company Privacy Policy", 
       "Removal Services, Exception & Cost Admin, Ancillary Services, Global Cost Estimate, Global Expense Management, Perm Storage, Company Privacy Policy", 
       "Global Expense Management, Company Privacy Policy")))
df$services <- unlist(df$services)
Now split it:

library(splitstackshape)
cSplit_e(df, "services", ",", type = "character", fill = 0)
##                                                                                                                                              services
## 1                                                                                                   Global Expense Management, Company Privacy Policy
## 2                                                                                                         Removal Services, Global Expense Management
## 3                                                              Removal Services, Exception & Cost Admin, Global Cost Estimate, Company Privacy Policy
## 4 Removal Services, Exception & Cost Admin, Ancillary Services, Global Cost Estimate, Global Expense Management, Perm Storage, Company Privacy Policy
## 5                                                                                                   Global Expense Management, Company Privacy Policy
##   services_Ancillary Services services_Company Privacy Policy services_Exception & Cost Admin
## 1                           0                               1                               0
## 2                           0                               0                               0
## 3                           0                               1                               1
## 4                           1                               1                               1
## 5                           0                               1                               0
##   services_Global Cost Estimate services_Global Expense Management services_Perm Storage
## 1                             0                                  1                     0
## 2                             0                                  1                     0
## 3                             1                                  0                     0
## 4                             1                                  1                     1
## 5                             0                                  1                     0
##   services_Removal Services
## 1                         0
## 2                         1
## 3                         1
## 4                         1
## 5                         0
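The 0/1 indicator columns can be turned into TRUE/FALSE afterwards. A minimal sketch, assuming output columns prefixed with services_ as in the printout above; the small data frame out is built by hand here (so the snippet runs without the package), and the step that strips the prefix and spaces to get names like GlobalExpenseManagement is an optional assumption based on the layout requested in the question:

```r
# Hand-built stand-in for part of the cSplit_e() result shown above.
out <- data.frame(
  services = c("Global Expense Management, Company Privacy Policy",
               "Removal Services, Global Expense Management"),
  `services_Company Privacy Policy` = c(1, 0),
  `services_Global Expense Management` = c(1, 1),
  `services_Removal Services` = c(0, 1),
  check.names = FALSE
)

# Locate the indicator columns by their prefix.
flag_cols <- grep("^services_", names(out))

# Convert each 0/1 column to logical.
out[flag_cols] <- lapply(out[flag_cols], function(x) x == 1)

# Optional: drop the prefix and the spaces to get compact column names.
names(out)[flag_cols] <- gsub("\\s+", "", sub("^services_", "", names(out)[flag_cols]))

str(out)
```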

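Looks awesome. Is there a way to append these columns and their rows to the end of my existing data.frame? Will df[]

You can. The column lengths must be the same. Also, if the classes of the objects are mixed, the data will be coerced to character. If you use the function call df[] Edit: df[] Try rbind(mat, df), but first you have to match the column names with names(df)

This looks great. I was trying to use cSplit but didn't know about cSplit_e. The problem is that my CSV data contains special characters, so it gives me Error: unexpected symbol in "df

I have removed the special characters, but it still gives me the error. This is how I did it: cleanServiceNames

@user1477388, can you use dput to isolate the problem in a small reproducible example? If you can, I may be able to help; otherwise it is a bit hard to troubleshoot.

That's the problem. I already have dput(structure(db$services[1:5])) above, but I don't know where the "special characters" causing the error are. I believe my regex should remove everything except commas and letters. So, apart from the entire data set, which is very large, I don't know what I could give you that would be reproducible.

@user1477388, I'm not sure what to recommend. Can you go through the data set and see whether there are any suspicious rows that might be causing the problem? If so, try testing with dput.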
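A hedged sketch of the kind of clean-up described in the comments above, keeping only letters, commas, and spaces. The sample strings and the names raw and clean are hypothetical, and the commenter's own code is not shown in the thread. Note that "unexpected symbol" is a message from R's parser, so it usually points at the code being run rather than at characters inside the data:

```r
# Hypothetical raw values containing characters other than letters and commas.
raw <- c("Removal Services, Exception & Cost Admin",
         "Perm Storage (EU), Global Cost Estimate")

# Drop everything except letters, commas, and spaces.
clean <- gsub("[^A-Za-z, ]", "", raw)

# Collapse the double spaces left behind where a character was removed.
clean <- gsub(" +", " ", clean)

clean
```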
colnames(mat) <- headnames

     Global Expense Management Company Privacy Policy
[1,]                      TRUE                   TRUE
[2,]                      TRUE                  FALSE
[3,]                     FALSE                   TRUE
[4,]                      TRUE                   TRUE
[5,]                      TRUE                   TRUE...
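To attach a logical matrix like mat to an existing data frame as new columns, cbind is one option, since rbind stacks rows rather than adding columns. A minimal sketch, assuming a hypothetical id column and toy service names A, B, and C:

```r
# Toy data frame: id is a hypothetical extra column that should be kept.
df <- data.frame(id = 1:3,
                 services = c("A, B", "B", "A, C"),
                 stringsAsFactors = FALSE)

# Build the logical matrix: split, collect unique names, test membership per row.
lst <- strsplit(df$services, ",\\s*")
headnames <- unique(unlist(lst))
mat <- t(sapply(lst, function(x) headnames %in% x))
colnames(mat) <- headnames

# Drop the original services column and bind the new columns alongside the rest.
combined <- cbind(df[setdiff(names(df), "services")], mat)

combined
```

Because both objects have the same number of rows in the same order, each row of mat lines up with the record it was derived from.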