How do I create a list column from a list in R?

Tags: r, purrr, xml2

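For the data frame below, I want to extract the d attribute nodes of all path elements from the XML strings and store them as a list in the same data frame. Instead I get an error saying that column nodes must be length 1 (the group size), not 7.

The pipe used in the mutate statement does return a list.

I can omit rowwise(), but then the error just asks for length 2 instead of 1.

What am I missing?

Sample data and my attempt:
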
# Sample data
library(tibble)
library(dplyr)
library(xml2)
df <- tibble(id=1:2, xml_str=c("<?xml version='1.0'?><!DOCTYPE svg PUBLIC '-//W3C//DTD SVG 1.1//EN' 'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd'><svg version='1.1' xmlns='http://www.w3.org/2000/svg'>'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M171, 160 L171, 160, 168, 159, 164, 159, 163, 159, 162, 159, 161, 159, 161, 158, 162, 158, 162, 157, 163, 156, 165, 156'/>'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M172, 226 L172, 226, 171, 213, 170, 212, 171, 212, 172, 212, 173, 212, 173, 211, 172, 211, 171, 211, 171, 212, 171, 215'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M153, 94 L153, 94, 150, 90, 150, 89, 150, 88, 150, 87, 150, 86, 150, 85, 150, 84, 150, 82, 150, 81, 150, 80, 150, 79'/>'/>'/>'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M346, 84 L346, 84, 346, 79, 347, 78, 347, 77, 348, 77, 348, 76, 348, 75, 348, 76, 348, 77, 349, 77, 348, 78'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M314, 67 L314, 67, 311, 76, 309, 76, 308, 77, 307, 77, 307, 76, 306, 76, 305, 76, 305, 77, 306, 77, 307, 77, 306, 77, 305, 79, 304, 80'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M313, 57 L313, 57, 321, 56, 321, 57, 321, 58'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M332, 58 L332, 58, 332, 57, 331, 57, 333, 57, 334, 57, 335, 57, 336, 58, 337, 58, 338, 58, 339, 58, 340, 58, 341, 58, 341, 59, 340, 60, 339, 60, 338, 60, 337, 60, 336, 60, 335, 60, 334, 60, 333, 60, 332, 60, 331, 60, 331, 59, 333, 58, 334, 58'/></svg>", "<?xml version='1.0'?><!DOCTYPE svg PUBLIC '-//W3C//DTD SVG 1.1//EN' 'http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd'><svg version='1.1' xmlns='http://www.w3.org/2000/svg'>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M315, 80 L315, 80, 321, 79, 320, 79, 318, 79, 317, 79'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M334, 83 L334, 83, 334, 82'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M315, 80 L315, 80, 315, 82, 315, 83, 
315, 84, 315, 85'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M315, 72 L315, 72'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M315, 69 L315, 69, 315, 70'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M332, 66 L332, 66, 332, 67'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M315, 56 L315, 56'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M315, 66 L315, 66, 315, 67'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M315, 72 L315, 72'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M332, 72 L332, 72, 333, 75'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M315, 72 L315, 72'/>\n<path fill='none' stroke='#ff0000' stroke-width='5' d='M334, 73 L334, 73, 333, 73'/></svg>"))

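# Attempt that fails: as_list() returns one element per matched node (7 for the first row),
# but under rowwise() mutate() expects a result of length 1 per row
df <- df %>%
  rowwise() %>%
  mutate(nodes = (xml_str %>% read_xml() %>% xml_find_all(., "//@d") %>% as_list()))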

This isn't exactly the way you're doing it, but you can use str_extract_all() and a regex to pull out the relevant strings as a list of comma-separated strings:

ans <- 
  df %>%
    dplyr::mutate(dnodes = stringr::str_extract_all(xml_str, "(?<=[d]=')[^']+(?='\\/)")) %>%
    dplyr::mutate(dnodes = purrr::map(dnodes, ~unlist(strsplit(paste(.x, collapse=", "), ", "))))

ans$dnodes
# [[1]]
#  [1] "M171"     "160 L171" "160"      "168"      "159"      "164"      "159"      "163"      "159"      "162"
# [11] "159"      "161"      "159"      "161"      "158"      "162"      "158"      "162"      "157"      "163"
# [21] "156"      "165"      "156"      "M172"     "226 L172" "226"      "171"      "213"      "170"      "212"
# [31] "171"      "212"      "172"      "212"      "173"      "212"      "173"      "211"      "172"      "211"
# [41] "171"      "211"      "171"      "212"      "171"      "215"      "M153"     "94 L153"  "94"       "150"
# [51] "90"       "150"      "89"       "150"      "88"       "150"      "87"       "150"      "86"       "150"
# [61] "85"       "150"      "84"       "150"      "82"       "150"      "81"       "150"      "80"       "150"
 # etc
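
To make the two steps above easier to follow, here is a minimal sketch on a made-up two-path SVG string. The name svg_small and its contents are hypothetical and not part of the original data. It uses the same lookbehind/lookahead idea, then splits each match on ", ".

```r
library(stringr)

# Hypothetical miniature input: two <path> elements with short d attributes
svg_small <- "<svg><path fill='none' d='M1, 2 L1, 2, 3, 4'/>\n<path fill='none' d='M5, 6 L5, 6'/></svg>"

# Step 1: capture everything between d=' and the closing '/ (one match per path)
d_strings <- str_extract_all(svg_small, "(?<=d=')[^']+(?='/)")[[1]]
d_strings

# Step 2: split each d string into its comma-separated tokens
tokens <- strsplit(d_strings, ", ", fixed = TRUE)
tokens
```

Unlike the answer's code, which pastes all paths of a row into one vector, this sketch keeps one character vector per path; which shape is more useful depends on what happens to the coordinates afterwards.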
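
Is this what you want? I usually wrap the right-hand side of mutate(name = right_side) in list() to accomplish this:

df <- df %>%
  rowwise() %>%  # so that each xml_str is parsed on its own
  mutate(nodes = list(xml_str %>% read_xml() %>% xml_find_all(., "//@d")))

class(df$nodes)
# [1] "list"
class(df$nodes[[1]])
# [1] "xml_nodeset"

Not sure whether you need the xml_nodeset objects, or whether CPak's solution with the actual strings suits you better.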
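
As a small self-contained illustration of the list() wrapping, the sketch below uses a hypothetical two-row tibble (small, with made-up SVG strings). It also converts each nodeset to plain strings with xml_text(), in case character vectors are easier to work with than xml_nodeset objects.

```r
library(tibble)
library(dplyr)
library(xml2)

# Hypothetical miniature data: row 1 has two paths, row 2 has one
small <- tibble(
  id = 1:2,
  xml_str = c(
    "<svg><path d='M1, 2 L1, 2'/><path d='M3, 4 L3, 4'/></svg>",
    "<svg><path d='M5, 6 L5, 6'/></svg>"
  )
)

res <- small %>%
  rowwise() %>%
  # list() makes the per-row result length 1, however many nodes match
  mutate(nodes = list(xml_find_all(read_xml(xml_str), "//@d"))) %>%
  ungroup()

# Optional: attribute values as character vectors, one vector per row
d_values <- lapply(res$nodes, xml_text)

lengths(res$nodes)
d_values[[1]]
```

The nodes column holds one xml_nodeset per row, so rows with different numbers of path elements no longer trigger the length error.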


It does. Interesting that the as_list() function doesn't do this.

I thought a regular expression might be the way to go, but I have 65,000 observations, each with maybe 5 or 10 d nodes... Then again, I suspect xml_find_all() also uses some kind of regular expression behind the scenes.
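
On the scale concern: xml_find_all() evaluates an XPath expression through libxml2 (which xml2 wraps) rather than a regular expression. For many rows, one option is to map over the column instead of using rowwise(). The sketch below assumes a hypothetical helper extract_d() and made-up input strings, and it has not been timed on 65,000 observations.

```r
library(purrr)
library(xml2)

# Hypothetical helper: parse one XML string and return its d attribute values
extract_d <- function(x) {
  xml_text(xml_find_all(read_xml(x), "//@d"))
}

# Made-up inputs standing in for the xml_str column
xml_strs <- c(
  "<svg><path d='M1, 2 L1, 2'/><path d='M3, 4 L3, 4'/></svg>",
  "<svg><path d='M5, 6 L5, 6'/></svg>"
)

# map() always returns a list with exactly one element per input string
d_list <- map(xml_strs, extract_d)
lengths(d_list)
```

Inside a data frame the same idea would read mutate(nodes = purrr::map(xml_str, extract_d)), with no rowwise() needed, because map() already yields one list element per row.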