Python 3.x: fuzzy string matching with Pandas and fuzzywuzzy; KeyError: 'name'


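I have a data file like this:

I also have another data file that contains all the correct country names. To match the two files, I am using the code below: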

from fuzzywuzzy import process
import pandas as pd




names_array = []
ratio_array = []

def match_names(wrong_names, correct_names):
    # For each misspelled name, keep the best candidate and its score
    for row in wrong_names:
        x = process.extractOne(row, correct_names)
        names_array.append(x[0])
        ratio_array.append(x[1])
    return names_array, ratio_array



#Wrong country names dataset
df=pd.read_csv("wrong-country-names.csv",encoding="ISO-8859-1")
wrong_names=df['name'].dropna().values

#Correct country names dataset
choices_df=pd.read_csv("country-names.csv",encoding="ISO-8859-1")
correct_names=choices_df['name'].values

name_match,ratio_match=match_names(wrong_names,correct_names)



df['correct_country_name']=pd.Series(name_match)
df['country_names_ratio']=pd.Series(ratio_match)

df.to_csv("string_matched_country_names.csv")

print(df[['name','correct_country_name','country_names_ratio']].head(10))

I get the following error:

runfile('C:/Users/Drashti Bhatt/Desktop/untitled0.py', wdir='C:/Users/Drashti Bhatt/Desktop')
Traceback (most recent call last):

  File "<ipython-input-155-a1fd87d9f661>", line 1, in <module>
    runfile('C:/Users/Drashti Bhatt/Desktop/untitled0.py', wdir='C:/Users/Drashti Bhatt/Desktop')

  File "C:\Users\Drashti Bhatt\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)

  File "C:\Users\Drashti Bhatt\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/Drashti Bhatt/Desktop/untitled0.py", line 17, in <module>
    wrong_names=df['name'].dropna().values

  File "C:\Users\Drashti Bhatt\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)

  File "C:\Users\Drashti Bhatt\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))

  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'name'


Any help with this would be greatly appreciated. Thanks!

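Your code contains, for example, this line (the "offending" line mentioned in the traceback):

wrong_names=df['name'].dropna().values

Now look at the picture showing your DataFrame:

It does not contain a name column.
It contains a Country column.

Go back to the traceback. It ends with the error message:
KeyError: 'name'

So you tried to access a column that does not exist.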
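One quick way to confirm this kind of diagnosis is to print the column labels right after loading the file. The snippet below is only a sketch: the in-memory CSV is a hypothetical stand-in for wrong-country-names.csv, and its "Country" header is assumed from the screenshot described above.

```python
import io

import pandas as pd

# Hypothetical stand-in for wrong-country-names.csv. The "Country" header
# is an assumption based on the screenshot described in the answer.
csv_text = io.StringIO("Country\nUntied States\nGermny\n")
df = pd.read_csv(csv_text)

# List the column labels that pandas actually parsed from the file.
available = df.columns.tolist()
print(available)

# Check for the label before indexing, instead of hitting a KeyError.
column = "name"
if column not in df.columns:
    print(f"Column {column!r} not found; available columns: {available}")
```

If the printed list shows a label that merely looks right (for example with stray whitespace or different capitalisation), indexing with the exact printed label avoids the KeyError.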
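I also noticed another detail: the values attribute holds the underlying NumPy array, whereas process.extractOne expects an "ordinary" Python list (of strings) to perform the match.
So you should probably change the line above to: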
wrong_names=df['Country'].dropna().values.tolist()

The same applies to the other statement (the one that builds correct_names).
Does wrong-country-names.csv have a column called name? wrong_names=df['name'].dropna().values
There is no 'name' column in your data; I think you need the Country column.