R: check whether a name matches the name in an email address


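I want to validate a name against the name in an email address. I am trying the solutions below, but they do not work for me. The goal is to check whether the name is exactly the same as the name in the email. The parts of a name can be separated by a space, comma, or dot, which is why I use a separator.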

df <- data.frame(name = c("Nic Hawk","tt dy","anz kpw p","timm ral","Karen Mulc","lew wey","sun mark"),
                 email = c("Nic.Hawk@tttt.com",   "tt.dy@aquan@tttt.com", "anz.kpw.p@tttt.com",   "frez.tal@tttt.com",    "Karen.Mulc@tttt.com",   "lew.wey@tttt.com", "wall.kit@tttt.com"))


Name= "name"
Email="email"
separator = " "

df <- df %>%
  mutate(Name_match = map2_int(str_extract_all(Name, "\\w+"), 
                               str_extract_all(str_remove(Email, "\\@.*"), "\\w+"),
                               ~ +(!all(str_detect(.y, str_c(.x, collapse=" "))))))

df <- df %>%
  separate(Name,
           into = c("last_name", "first_name"),
           sep = separator,
           remove = FALSE) %>%
  mutate(first_name = tolower(first_name),
         last_name = tolower(last_name)) %>%
  mutate(name_email_match = 0L*str_detect(Email,
                                          paste0("^", first_name, separator, last_name,
                                                 "@\\w+\\.com$"))) %>%
  select(-c(first_name, last_name))

The output should be a mutated column of 1s and 0s (1 means TRUE, a match; 0 means FALSE, no match).

Try this:

library(dplyr)
library(stringr)

Name  <- "name"
Email <- "email"
separator <- " "

df %>%
 
 # everything to lower
 mutate(across(all_of(c(Name, Email)), tolower)) %>% 
 
 # extract interesting part from email
 mutate(email_name = str_extract(!!sym(Email), "([a-z.]+)(?=@.+)")) %>% 
 
 # replace . with separator
 mutate(email_name = str_replace_all(email_name, "\\.", separator)) %>% 
 
 # compare
 mutate(name_email_match = +(!!sym(Name) == email_name))

#>         name                email email_name name_email_match
#> 1   nic hawk    nic.hawk@tttt.com   nic hawk                1
#> 2      tt dy tt.dy@aquan@tttt.com      tt dy                1
#> 3  anz kpw p   anz.kpw.p@tttt.com  anz kpw p                1
#> 4   timm ral    frez.tal@tttt.com   frez tal                0
#> 5 karen mulc  karen.mulc@tttt.com karen mulc                1
#> 6    lew wey     lew.wey@tttt.com    lew wey                1
#> 7   sun mark    wall.kit@tttt.com   wall kit                0
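The question mentions that name parts may be separated by a space, comma, or dot, while the pipeline above assumes a single separator. A minimal base R sketch of a separator-agnostic comparison follows. The helpers `normalise_name()` and `email_local_part()` are hypothetical names introduced here for illustration, and the sample data is a shortened variant of the question's data with the separators changed on purpose.

```r
# Hypothetical helper (not part of the answer above): lower-case the input,
# split on any run of spaces, commas, or dots, and rejoin with single spaces.
normalise_name <- function(x) {
  x <- tolower(trimws(x))
  tokens <- strsplit(x, "[ ,.]+")
  vapply(tokens,
         function(tk) paste(tk[nzchar(tk)], collapse = " "),
         character(1))
}

# Hypothetical helper: keep everything before the first "@".
email_local_part <- function(email) sub("@.*$", "", email)

# Assumed sample data: names use different separators on purpose.
df <- data.frame(
  name  = c("Nic Hawk", "tt,dy", "anz.kpw.p", "timm ral"),
  email = c("Nic.Hawk@tttt.com", "tt.dy@aquan@tttt.com",
            "anz.kpw.p@tttt.com", "frez.tal@tttt.com"),
  stringsAsFactors = FALSE
)

# 1 when the normalised name equals the normalised local part, else 0.
df$name_email_match <- as.integer(
  normalise_name(df$name) == normalise_name(email_local_part(df$email))
)
df$name_email_match
```

Because both sides go through the same normalisation, the choice of separator in the name column no longer matters. This sketch does not treat missing values specially.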
Does this work:

library(dplyr)
library(stringr)
df %>% mutate(name1 = str_remove_all(name, '\\s'), email1 = str_remove(str_remove_all(str_extract(email, '.*(?=@.*)'), '[\\.\\s]' ), '@.*')) %>% 
   mutate(op = +(str_detect(name1, email1))) %>% select(-c(name1, email1))
        name                email op
1   Nic Hawk    Nic.Hawk@tttt.com  1
2      tt dy tt.dy@aquan@tttt.com  1
3  anz kpw p   anz.kpw.p@tttt.com  1
4   timm ral    frez.tal@tttt.com  0
5 Karen Mulc  Karen.Mulc@tttt.com  1
6    lew wey     lew.wey@tttt.com  1
7   sun mark    wall.kit@tttt.com  0
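One thing to keep in mind with the `str_detect(name1, email1)` approach: `str_detect()` treats its second argument as a regular expression and reports a hit anywhere in the string, so it is a "contains" test rather than an exact comparison. The small sketch below uses made-up values (not from the question's data) to show the difference. Whether this matters depends on how strict the check needs to be.

```r
library(stringr)

# Made-up values for illustration only.
name1  <- c("NicHawk", "xxNicHawkxx", "ab")
email1 <- c("NicHawk", "NicHawk",     "a+b")

# Regex "contains" test: the 2nd pair matches as a substring, and the 3rd
# matches because "+" is interpreted as a regex quantifier.
detect_result <- +(str_detect(name1, email1))

# Exact comparison: only the first pair is identical.
exact_result <- +(name1 == email1)

detect_result
exact_result
```

If an exact match is required, as the question states, comparing with `==` after normalising both sides is probably the safer choice.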

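Comments:

I remember seeing a similar question from you before... I don't think you are quite clear on how to use variable names in dplyr. You don't need to create the variables Name and Email. Just write `name` and `email` and it will work, because they are used inside your dplyr statement :-) [Even if this doesn't solve your problem, it will improve the quality of your code.]

Actually, sometimes the data has different column names for the name and the email, so I made name and email input parameters based on the names in the data.

Then you are using them wrong. You need to write `!!sym(Name)`. My suggestion is to rename the columns at the start so that you don't have to write `!!sym` every time.

OK, I will update, but my code doesn't work. Name = "name"; Name = !!sym(Name), like this...??

If the name or email column is blank, how can I ignore NA and blank cells?

You can `filter` them out, for example (check out `?dplyr::filter`).

If I filter them out, that also affects the original data. I don't want that.

As it stands, if there are NAs in the data, you just get some NAs in `name_email_match`. You can fill them with zeros, for example with `%>% tidyr::replace_na(list(name_email_match = 0))`.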
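The points raised in the comments (column names passed as strings, and NAs replaced by 0 without filtering rows) can be combined in one sketch. The wrapper `add_name_email_match()` and the column names `full_name` and `mail` are hypothetical, chosen only for illustration. The sketch uses the `.data[[...]]` pronoun as an alternative to `!!sym(...)`, and an anchored variant of the extraction regex from the first answer.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Hypothetical wrapper: `name_col` and `email_col` are column names given as
# strings, so the same code works when the columns are named differently.
add_name_email_match <- function(data, name_col, email_col, separator = " ") {
  data %>%
    mutate(
      name_email_match = +(
        tolower(.data[[name_col]]) ==
          str_replace_all(
            # letters and dots at the start, up to the first "@"
            str_extract(tolower(.data[[email_col]]), "^[a-z.]+(?=@)"),
            fixed("."), separator
          )
      )
    ) %>%
    # NA comparisons (missing name or email) become 0; no rows are dropped
    replace_na(list(name_email_match = 0L))
}

# Assumed sample data with a missing name, an empty name, and a missing email.
df <- data.frame(
  full_name = c("Nic Hawk", NA, "timm ral", ""),
  mail      = c("Nic.Hawk@tttt.com", "tt.dy@tttt.com", "frez.tal@tttt.com", NA),
  stringsAsFactors = FALSE
)

res <- add_name_email_match(df, "full_name", "mail")
res$name_email_match
```

The input data frame keeps all of its rows, and the original columns are left as they were. Only the new `name_email_match` column is added.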