Extract the first N matches from a text string in R?
I am using stringr in R, and I have a text string that lists the titles of news articles. I want to extract these titles, but only the first N that appear. In my sample text there are three article titles, but I only want to extract the first two. How can I tell str_extract to collect only the first two titles? Thanks, everyone. Below are my current code and the sample text.
library(stringr)
Here is the sample text:
texting <- ("Time: Friday, September 14, 2018 4:34:00 PM EDT\r\nJob Number: 73591483\r\nDocuments (100)\r\n 1. U.S. Stocks Rebound Slightly After Tech-Driven Slump\r\n Client/Matter: -None-\r\n Search Terms: trade war or US-China trade or china tariff and not dealbook\r\n Search Type: Terms and Connectors\r\n Narrowed by:\r\n Content Type Narrowed by\r\n News Sources: The New York Times; Content Type: News;\r\n Timeline: Jan 01, 2018 to Dec 31, 2018\r\n 2. Shifting Strategy on Tariffs\r\n Client/Matter: -None-\r\n Search Terms: trade war or US-China trade or china tariff and not dealbook\r\n 100. Example")
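titles.1 <- str_extract_all(texting, "\\d+\\.\\s.+")
titles.1
[[1]]
[1] "1. U.S. Stocks Rebound Slightly After Tech-Driven Slump"
[2] "2. Shifting Strategy on Tariffs"
[3] "100. Example"
I only want it to collect the first two matches.

You can use the option simplify = TRUE to get a character matrix as the result instead of a list. With a single input string the matrix has one row, so indexing it with [1:2] picks the first two matches; use [1:N] for the first N.
titles.1 <- str_extract_all(texting, "\\d+\\.\\s.+", simplify = TRUE)[1:2]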
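One thing to watch with [1:2]: if the text contains fewer than N matches, indexing past the end pads the result with NA. Below is a minimal sketch of an alternative that uses head(). The helper name first_n_matches and the shortened sample_text are made up for illustration and are not part of stringr or the original question.

```r
library(stringr)

# Hypothetical helper (not part of stringr): return at most n matches of
# `pattern` found in a single string, in order of appearance.
first_n_matches <- function(string, pattern, n) {
  # str_extract_all() returns a list with one element per input string;
  # [[1]] takes the character vector of matches for the single input.
  matches <- str_extract_all(string, pattern)[[1]]
  # head() returns fewer than n elements when there are fewer matches,
  # rather than padding with NA as [1:n] would.
  head(matches, n)
}

# Shortened, made-up sample in the same shape as the text in the question.
sample_text <- "Documents (3)\n 1. First Title\n 2. Second Title\n 3. Third Title"

first_n_matches(sample_text, "\\d+\\.\\s.+", 2)
```

Called on the texting string from the question with n = 2, this should return the same two titles as the simplify = TRUE version. The difference only shows up when there are fewer than N matches.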