R dplyr filter based on matching a search term against the start of any word in selected columns

r regex


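I am trying to filter rows based on keywords in selected columns, where a word in the text must start with a string that matches a specific regular expression. Here, I am trying to pick all rows containing words that start with "bio" or "15". However, the search terms can also appear in the middle of some words, such as "symbiotic" in the Name column and 161540 in the Code column.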

**Name**                     **Code**
Biofuel is good          159403
Bioecological is good    161540
Probiotics is good       159883
Good is symbiotic        1877447

I tried the code below:

Innov_filter <- Innov_Data %>% 
  select(everything()) %>% 
  filter(str_detect(str_to_lower(Name), "bio") | str_detect(str_to_lower(Code), "bio"))


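However, this does not work, because it also keeps the last row, which meets neither criterion. I would appreciate help with a strict search that matches the term only where it starts a word, not at any position within a word.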

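Thanks.

To filter for "bio" at the beginning of the line, you can use, for example, the function grepl(), as in filter(grepl("^bio", tolower(Name))). The ^ in the first argument, "^bio", indicates that the matched string must begin with the letters "bio". With the stringr package, this becomes: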

df %>%
    filter(str_detect(tolower(Name), "^bio"))
#>                    Name   Code
#> 1       Biofuel is good 159403
#> 2 Bioecological is good 161540

By the way, using select(everything()) in your workflow is optional: by default dplyr keeps all columns, and filter() returns every column of the rows that match.
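The difference between an unanchored pattern, a start-of-string anchor, and a word boundary can be checked directly in base R. The following is a minimal sketch, not the answerer's code: the first four names come from the question's table, while the fifth row and the variable names are hypothetical additions used only to show a word that starts with "bio" later in the sentence.

```r
# Names from the question's table, plus one hypothetical extra row
# in which "bio" starts a word that is not the first word.
name <- c("Biofuel is good",
          "Bioecological is good",
          "Probiotics is good",
          "Good is symbiotic",
          "This is good Biofuel")

lowered <- tolower(name)

anywhere   <- grepl("bio", lowered)     # "bio" at any position
at_start   <- grepl("^bio", lowered)    # "bio" only at the start of the string
word_start <- grepl("\\bbio", lowered)  # "bio" at the start of any word

# Side-by-side view of which pattern keeps which row
print(data.frame(name, anywhere, at_start, word_start))
```

In this sketch the unanchored pattern keeps every row, including "Probiotics" and "symbiotic", which is the behaviour the question complains about.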


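Maybe something like this, matching "bio" at the start of Name or "15" in Code:

library(dplyr)
library(stringr)  # provides str_detect()
df %>%
  # "15" is unanchored here; use "^15" to require that Code starts with 15
  filter(str_detect(tolower(Name), "^bio") | str_detect(tolower(Code), "15"))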

                   Name   Code
1       Biofuel is good 159403
2 Bioecological is good 161540
3    Probiotics is good 159883

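Using the data with renamed columns:

# stringsAsFactors = FALSE keeps Name as character on R versions before 4.0
df <- read.table(text = "Name                     Code
                'Biofuel is good'          159403
                'Bioecological is good'    161540
                'Probiotics is good'       159883
                'Good is symbiotic'        1877447",
                header = TRUE, stringsAsFactors = FALSE)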


EDIT

If we want to select rows where any word starts with "bio", we can do

df %>%
  filter(str_detect(str_to_lower(Name), "\\bbio") | str_detect(Code, "^15"))

#                   Name   Code
#1       Biofuel is good 159403
#2 Bioecological is good 161540
#3    Probiotics is good 159883

Or similarly in base R

df[sapply(strsplit(df$Name, "\\s+"), function(x) any(grepl("^bio", tolower(x)))) |
     grepl("^15", df$Code), ]


Original answer

This selects rows where "bio" occurs in the first word of Name (word(Name) returns only the first word) or where Code starts with "15".
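A minimal sketch of the first-word approach just described, assuming stringr's word() with its default arguments (first word, split on a single space). The data frame is rebuilt from the question's table; the result names first_word_hits and strict_hits are hypothetical.

```r
library(dplyr)
library(stringr)

# Data rebuilt from the question's table
df <- data.frame(
  Name = c("Biofuel is good", "Bioecological is good",
           "Probiotics is good", "Good is symbiotic"),
  Code = c(159403L, 161540L, 159883L, 1877447L),
  stringsAsFactors = FALSE
)

# Keep rows whose first word contains "bio" or whose Code starts with "15"
first_word_hits <- df %>%
  filter(str_detect(str_to_lower(word(Name)), "bio") |
           str_detect(as.character(Code), "^15"))

# Stricter variant: the first word must start with "bio"
strict_hits <- df %>%
  filter(str_detect(str_to_lower(word(Name)), "^bio"))

print(first_word_hits)
print(strict_hits)
```

Note that the unanchored "bio" also accepts "Probiotics" as a first word; anchoring with "^bio" is one way to exclude it if only words beginning with "bio" should count.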


Using the same logic in base R, we can do

df[sapply(strsplit(df$Name, "\\s+"), function(x) grepl("bio", tolower(x[1]))) |
     grepl("^15", df$Code), ]

#                   Name   Code
#1       Biofuel is good 159403
#2 Bioecological is good 161540
#3    Probiotics is good 159883

Here, each string is split on whitespace, the first word of each string is extracted (x[1]) and checked for "bio"; rows whose Code starts with "15" are kept as well.
We can use filter_all with any_vars:
df %>% 
   filter_all(any_vars(str_detect(str_to_lower(.), "^(bio|15)")))
#                  Name   Code
#1       Biofuel is good 159403
#2 Bioecological is good 161540
#3    Probiotics is good 159883

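NOTE: If the condition needs to be applied to only a subset of the columns, restrict it to those columns (for example with filter_at()).

If we need to select any word in the sentence that starts with "Bio", use the word boundary (\\b).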
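As a hedged alternative, the same any-column filter can be sketched with if_any(), assuming dplyr 1.0.4 or later, where if_any() is available and the scoped verbs such as filter_all() are superseded. The data frame is rebuilt from the question's table, and the result names all_cols and name_only are hypothetical.

```r
library(dplyr)
library(stringr)

# Data rebuilt from the question's table
df <- data.frame(
  Name = c("Biofuel is good", "Bioecological is good",
           "Probiotics is good", "Good is symbiotic"),
  Code = c(159403L, 161540L, 159883L, 1877447L),
  stringsAsFactors = FALSE
)

# Any column whose value starts with "bio" or "15"
all_cols <- df %>%
  filter(if_any(everything(),
                ~ str_detect(tolower(as.character(.x)), "^(bio|15)")))

# Subset of columns plus a word boundary:
# any word in Name that starts with "bio"
name_only <- df %>%
  filter(if_any(Name, ~ str_detect(tolower(.x), "\\bbio")))

print(all_cols)
print(name_only)
```

Passing a column selection as the first argument of if_any() plays the role that a column subset plays in the scoped verbs.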
Try "^bio" to indicate that you only want to look at the beginning.

So you want to filter rows where Name contains "bio" in the first word and Code starts with 15?

@RonakShah Yes. That is what I want to pick out, in the first word of the first column or in the first column.

Thanks a lot for the further explanation of the grepl() function. I just realized that this only picks words at the start of the sentence. Do you know how I could select any word at any position in the sentence, as long as it starts with Bio? E.g. "This is good Biofuel".

That is fine: in this case you can use the special character \b, which is a word boundary in regular expressions (regex). So the code would be filter(grepl("\\bbio", tolower(Name))). You have to write \\b so that the backslash is escaped in the R string; with a single backslash, R's string parser consumes it before the pattern reaches the regex engine.

@BayuoBlaise Updated the answer.
