Finding the correct village names from a list of misspelled village names using Python
I have two DataFrames, df1 and df2. df1 contains the correct village names, while df2 contains wrong/misspelled village names. I want to find the correct village name that corresponds to each misspelled one. I am very new to Python, so please guide me on this.

Well, friend, you didn't provide your code, so I'm making my own assumptions. You should be able to follow along with my sample code. Based on your question, I suggest you use fuzzywuzzy.
You can install it from the command line:
pip install fuzzywuzzy
from fuzzywuzzy import process

# As I don't know your column names, I'm assuming them on my own
df1 = {}
df2 = {}
df1['correct_name'] = ['jaipur', 'mumbai', 'ajmer', 'goa', 'sikkim']
df2['wrong_name'] = ['jepuor', 'mumbayi', 'amer', 'ga', 'goa', 'gooa', 'skim', 'jpur', 'moombi']

# You can customize this and use it accordingly
for items in df2['wrong_name']:
    # extractOne returns the best-matching choice and its similarity score (0-100)
    found = process.extractOne(items, df1['correct_name'])
    print(items, ' found similar to ', found[0], ' with percentage ', found[1])
My output is:
jepuor found similar to jaipur with percentage 67
mumbayi found similar to mumbai with percentage 92
amer found similar to ajmer with percentage 89
ga found similar to goa with percentage 80
goa found similar to goa with percentage 100
gooa found similar to goa with percentage 86
skim found similar to sikkim with percentage 80
jpur found similar to jaipur with percentage 80
moombi found similar to mumbai with percentage 67
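If installing a third-party package is not an option, a similar lookup can be sketched with the standard library's difflib. This is a swapped-in technique, not the fuzzywuzzy approach from the answer, and its similarity ratio is not guaranteed to agree with fuzzywuzzy's scores in every case. The helper name closest_name and the cutoff of 0.6 are assumptions chosen for illustration.

```python
from difflib import get_close_matches

# Same sample data as in the answer above
correct_names = ['jaipur', 'mumbai', 'ajmer', 'goa', 'sikkim']
wrong_names = ['jepuor', 'mumbayi', 'amer', 'ga', 'goa', 'gooa', 'skim', 'jpur', 'moombi']

def closest_name(name, choices, cutoff=0.6):
    """Return the best match for name in choices, or None if nothing reaches cutoff.

    closest_name is a hypothetical helper; cutoff is a similarity ratio between 0 and 1.
    """
    matches = get_close_matches(name, choices, n=1, cutoff=cutoff)
    return matches[0] if matches else None

for name in wrong_names:
    print(name, '->', closest_name(name, correct_names))
```

Returning None below the cutoff makes it easy to spot names that have no plausible match, rather than silently accepting a poor one. Raising the cutoff makes the matching stricter.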
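You can read more about this module in its documentation.

Comments:
Show your actual data and code, and what you already have, or you are unlikely to get any help.
You really should share your code.
Sorting alphabetically might be a good idea.
This solution works, but its performance is very poor. In particular, you should not call process.extractOne multiple times.
Thanks @rish_-hyun, now I can at least move in some direction.
@maxbachmann I agree, I'll edit it right now.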
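As a follow-up to the performance comment, one simple way to reduce the number of matching calls is to match each distinct misspelling only once and reuse the result. This only helps under the assumption that the misspelled column contains many repeated values. The sketch below uses the standard library's difflib rather than fuzzywuzzy so that it runs without extra installs; build_lookup and the sample data with repeats are hypothetical.

```python
from difflib import get_close_matches

def build_lookup(wrong_names, correct_names, cutoff=0.6):
    """Map each distinct misspelled name to its best match, or None if none qualifies."""
    lookup = {}
    for name in set(wrong_names):  # each distinct spelling is matched exactly once
        matches = get_close_matches(name, correct_names, n=1, cutoff=cutoff)
        lookup[name] = matches[0] if matches else None
    return lookup

correct = ['jaipur', 'mumbai', 'ajmer', 'goa', 'sikkim']
wrong = ['gooa', 'mumbayi', 'gooa', 'gooa', 'mumbayi']  # hypothetical data with repeats

lookup = build_lookup(wrong, correct)
corrected = [lookup[name] for name in wrong]  # plain dictionary lookups, no re-matching
print(corrected)
```

With a real pandas DataFrame, the same dictionary could be applied with something like df2['wrong_name'].map(lookup), assuming the column is named wrong_name as in the answer.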