Python: matching similar characters/words (Python, R) - Fatal编程技术网


Python: matching similar characters/words

python r


I have the following data frame, which contains columns X and Y:

    X                                   Y
1   SAN DIEGO                           FOND DU LAC
2   THE RIO GRANDE                      RIO GRANDE
3   RIO GRANDE                          RIO GRANDE
4   WEST TENNESSEE                      TENNESSEE
5   EP De SAN JOAQUIN                   De SAN JOAQUIN
6   SOUTHERN VIRGINIA                   VIRGINIA
7   SOUTHERN VIRGINIA                   SOUTHWESTERN VIRGINIA
8   EN COLOMBIA                         COLOMBIA
9   THE EP De NORTHERN CALIFORNIA       De NORTHERN CALIFORNIA
10  FLORIDA                             NEW JERSY

I want to get the rows that do not match, 1 and 10. Rows 2-9 are matches or close matches, which is fine. The data frame I expect is:

    X                                   Y
1   SAN DIEGO                           FOND DU LAC
10  FLORIDA                             NEW JERSY

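In R, we split the strings in each column on whitespace, check whether there is any intersect between the words of X and Y, find the lengths of the resulting list, and subset the rows of the dataset where that length is 0:
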
df1[!lengths(Map(intersect, strsplit(df1$X, "\\s+"), strsplit(df1$Y, "\\s+"))),]
#          X           Y
#1  SAN DIEGO FOND DU LAC
#10   FLORIDA   NEW JERSY
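
Since the question is also tagged Python, the same word-intersection idea can be sketched with the standard library alone. This is only an illustrative sketch, not part of the original answer: the `rows` list is a hypothetical stand-in for the data frame (row number, X, Y), and `shares_word` is a helper name made up for this example.

```python
# Hypothetical stand-in for the data frame: (row number, X, Y) tuples.
rows = [
    (1, "SAN DIEGO", "FOND DU LAC"),
    (2, "THE RIO GRANDE", "RIO GRANDE"),
    (3, "RIO GRANDE", "RIO GRANDE"),
    (4, "WEST TENNESSEE", "TENNESSEE"),
    (5, "EP De SAN JOAQUIN", "De SAN JOAQUIN"),
    (6, "SOUTHERN VIRGINIA", "VIRGINIA"),
    (7, "SOUTHERN VIRGINIA", "SOUTHWESTERN VIRGINIA"),
    (8, "EN COLOMBIA", "COLOMBIA"),
    (9, "THE EP De NORTHERN CALIFORNIA", "De NORTHERN CALIFORNIA"),
    (10, "FLORIDA", "NEW JERSY"),
]


def shares_word(x, y):
    """Return True if x and y have at least one whitespace-separated word in common."""
    return bool(set(x.split()) & set(y.split()))


# Keep only the rows whose X and Y have no word in common.
unmatched = [row for row in rows if not shares_word(row[1], row[2])]

for number, x, y in unmatched:
    print(number, x, y, sep="\t")
```

With the sample data this keeps rows 1 and 10. Note that the comparison is exact and case-sensitive per word, just like the R version, so a misspelled word would not count as a match.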


We could also loop over the columns and do the split:

df1[!lengths(do.call(Map, c(intersect, unname(lapply(df1, strsplit, split="\\s+"))))),]
#          X           Y
#1  SAN DIEGO FOND DU LAC
#10   FLORIDA   NEW JERSY


Or another option is stringdist:

library(stringdist)
i1 <- with(df1, stringdist(X, Y, method = "qgram"))
df1[i1 %in% tail(sort(i1), 2),]
#          X           Y
#1  SAN DIEGO FOND DU LAC
#10   FLORIDA   NEW JERSY
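
For readers working in Python, the q-gram distance can be approximated with `collections.Counter`. The sketch below assumes q = 1 (single characters, spaces included) and defines the distance as the sum of absolute differences between the q-gram counts of the two strings; `qgram_distance` and `rows` are names invented for this illustration, and the sketch is not a port of the stringdist package.

```python
from collections import Counter

# Hypothetical stand-in for the data frame: (row number, X, Y) tuples.
rows = [
    (1, "SAN DIEGO", "FOND DU LAC"),
    (2, "THE RIO GRANDE", "RIO GRANDE"),
    (3, "RIO GRANDE", "RIO GRANDE"),
    (4, "WEST TENNESSEE", "TENNESSEE"),
    (5, "EP De SAN JOAQUIN", "De SAN JOAQUIN"),
    (6, "SOUTHERN VIRGINIA", "VIRGINIA"),
    (7, "SOUTHERN VIRGINIA", "SOUTHWESTERN VIRGINIA"),
    (8, "EN COLOMBIA", "COLOMBIA"),
    (9, "THE EP De NORTHERN CALIFORNIA", "De NORTHERN CALIFORNIA"),
    (10, "FLORIDA", "NEW JERSY"),
]


def qgram_distance(a, b, q=1):
    """Sum of absolute differences between the q-gram counts of a and b."""
    grams_a = Counter(a[i:i + q] for i in range(len(a) - q + 1))
    grams_b = Counter(b[i:i + q] for i in range(len(b) - q + 1))
    return sum(abs(grams_a[g] - grams_b[g]) for g in grams_a.keys() | grams_b.keys())


distances = [qgram_distance(x, y) for _, x, y in rows]

# Mirror the R code: keep the rows whose distance is among the two largest values.
top_two = sorted(distances)[-2:]
unmatched = [row for row, d in zip(rows, distances) if d in top_two]

for number, x, y in unmatched:
    print(number, x, y, sep="\t")
```

Taking the two largest distances only works here because we already know there are exactly two non-matching rows. Row 6 (SOUTHERN VIRGINIA / VIRGINIA) has a distance of 9, just below row 1 at 10, so a fixed cutoff would need to be chosen with care on other data.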

