R: Matching observations from one table to a variable in another table that consists of strings

I have two datasets called A and B:

 library(data.table)
 Farm.Type <- c("Fruits","Vegetables","Livestock")
 Produce.All <- c("Apple, Orange, Pears, Strawberries","Broccoli, Cabbage, Spinach","Cow, Pig, Chicken")

 Store <- c("Convenience","Wholesale","Grocery","Market")
 Produce <- c("Oranges","Watermelon","Cabbage","Pig")
 Farm <- c("Fruits","","Vegetables","Livestock")

 A <- data.table(Farm.Type, Produce.All)
 B <- data.table(Store, Produce)
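[...] otherwise. (You still need to add the result to the B data frame.)

With:

library(purrr)
library(stringi)  # provides stri_split_regex()
library(dplyr)
library(tidyr)

# Split each comma-separated string into a list column, then give every item its own row
mutate(A, Produce.All = stri_split_regex(Produce.All, ", ")) %>% 
  unnest(Produce.All) -> A_long

# Keep every row of B; items with no match in A_long get NA for Farm.Type
left_join(B, A_long, by = c("Produce" = "Produce.All"))

And I certainly hope this isn't homework.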
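
The parenthetical above says the matched value still has to be added to B. A minimal base R sketch of that step follows, assuming plain data frames rather than data.table objects. The names `items`, `lookup` and the new column `Farm` are chosen for illustration and are not from the answer.

```r
# Rebuild the example data as plain data frames (assumption: no data.table needed here)
A <- data.frame(
  Farm.Type   = c("Fruits", "Vegetables", "Livestock"),
  Produce.All = c("Apple, Orange, Pears, Strawberries",
                  "Broccoli, Cabbage, Spinach",
                  "Cow, Pig, Chicken"),
  stringsAsFactors = FALSE
)
B <- data.frame(
  Store   = c("Convenience", "Wholesale", "Grocery", "Market"),
  Produce = c("Oranges", "Watermelon", "Cabbage", "Pig"),
  stringsAsFactors = FALSE
)

# Split every comma-separated string into its individual items
items <- strsplit(A$Produce.All, ", ", fixed = TRUE)

# Long lookup table: one row per item, with the farm type repeated for each item
lookup <- data.frame(
  Farm.Type = rep(A$Farm.Type, lengths(items)),
  Produce   = unlist(items),
  stringsAsFactors = FALSE
)

# match() returns NA for items that are not in the lookup, so those rows get NA
B$Farm <- lookup$Farm.Type[match(B$Produce, lookup$Produce)]
print(B)
```

Matching is exact, so "Oranges" in B does not match "Orange" in A and ends up as NA, the same as in the join.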

Echoing hrbrmstr's answer, but sticking with data.table and some base R:

longA <- 
  stack(
    setNames(
      strsplit(A[, Produce.All], ", "),
      A[, Farm.Type]
    )
  )

merge(longA, B, by.x = "values", by.y = "Produce", all.y = TRUE)
#      values        ind       Store
#1    Cabbage Vegetables     Grocery
#2    Oranges       <NA> Convenience
#3        Pig  Livestock      Market
#4 Watermelon       <NA>   Wholesale

# Or using a data.table merge, if you like
setDT(longA)[B, on = c(values = "Produce")]
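
If the goal is to add the farm type to B itself rather than produce a merged copy, a data.table update join is one option. The sketch below assumes the same example data. The column name `Farm` is chosen for illustration, and `as.character()` is used because `stack()` returns `ind` as a factor.

```r
library(data.table)

A <- data.table(
  Farm.Type   = c("Fruits", "Vegetables", "Livestock"),
  Produce.All = c("Apple, Orange, Pears, Strawberries",
                  "Broccoli, Cabbage, Spinach",
                  "Cow, Pig, Chicken")
)
B <- data.table(
  Store   = c("Convenience", "Wholesale", "Grocery", "Market"),
  Produce = c("Oranges", "Watermelon", "Cabbage", "Pig")
)

# Long-format lookup: `values` holds the single items, `ind` the farm type
longA <- as.data.table(
  stack(setNames(strsplit(A[, Produce.All], ", "), A[, Farm.Type]))
)

# Update join: rows of B that match longA receive the farm type by reference;
# rows without a match are left as NA. `i.ind` refers to column `ind` of longA.
B[longA, on = c(Produce = "values"), Farm := as.character(i.ind)]
print(B)
```

Because `:=` modifies B in place, no reassignment is needed and the row order of B is preserved.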

Why are you reluctant to change the format of table A?

Hi, I'm not really against changing table A. However, I'm curious whether there is a possible solution that doesn't go through the additional step of transforming table A.

(It's not base R if you use data.table :-)

You should post that as an answer, @Jota.

Thanks for the help. Yes, I can see that it gets complicated when working with strings. I had thought of using some kind of for or while loop combined with the grep function, but I found it simpler to just transform the data.
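
For the approach raised in the comments, looking items up with a grep-style search and leaving table A as it is, a rough base R sketch might look like the following. It is not taken from either answer. The helper `find_farm` and the vector `padded` are hypothetical names, and the sketch assumes that items in `Produce.All` are always separated by exactly ", ".

```r
A <- data.frame(
  Farm.Type   = c("Fruits", "Vegetables", "Livestock"),
  Produce.All = c("Apple, Orange, Pears, Strawberries",
                  "Broccoli, Cabbage, Spinach",
                  "Cow, Pig, Chicken"),
  stringsAsFactors = FALSE
)
B <- data.frame(
  Store   = c("Convenience", "Wholesale", "Grocery", "Market"),
  Produce = c("Oranges", "Watermelon", "Cabbage", "Pig"),
  stringsAsFactors = FALSE
)

# Pad each string with the separator so every item is delimited on both sides;
# this keeps "Pig" from matching inside a longer name such as "Pigeon"
padded <- paste0(", ", A$Produce.All, ", ")

# Hypothetical helper: return the first farm type whose list contains `item`
find_farm <- function(item) {
  hit <- grepl(paste0(", ", item, ", "), padded, fixed = TRUE)
  if (any(hit)) A$Farm.Type[which(hit)[1]] else NA_character_
}

B$Farm <- vapply(B$Produce, find_farm, character(1), USE.NAMES = FALSE)
print(B)
```

This scans every row of A once per item in B, so for larger tables the reshape-and-join approach from the answers is likely the better choice.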