Python: Convert a string column containing Unicode to ASCII to load URLs

Tags: python, pandas, wikipedia-api, python-unicode

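I have a pandas DataFrame with a column of Wikipedia URLs that I want to load. However, some of the strings won't load because they contain percent-encoded Unicode characters. For example, "Kruskal%E2%80%93Wallis_one-way_analysis_of_variance" raises an error like the following:

PageError: Page id "Cauchy%E2%80%93Schwarz_inequality" does not match any pages. Try another id!
Is there a way to convert all of the Unicode codes to ASCII? In this example, I need a function that creates a new column: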

old column                            new column
Cauchy%E2%80%93Schwarz_inequality     Cauchy–Schwarz_inequality
Markov%27s_inequality                 Markov's_inequality

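urllib.parse.unquote
should do the trick: it decodes percent-escapes such as %E2%80%93 back into the characters they represent (interpreting the bytes as UTF-8 by default). Note that the result is a Unicode string rather than ASCII, since the en dash is not an ASCII character. Hope this helps.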

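In [1]: import urllib.parse  # import the submodule explicitly; a bare "import urllib" may not load it
   ...: 
   ...: import pandas as pd
   ...: 
   ...: # Percent-encoded Wikipedia page titles
   ...: df = pd.DataFrame({'url': ['Markov%27s_inequality', 'Cauchy%E2%80%93Schwarz_inequality']})
   ...: # Decode the percent-escapes in every row into a new column
   ...: df['clean_url'] = df['url'].apply(urllib.parse.unquote)
   ...: 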

In [2]: df
Out[2]: 
                                 url                  clean_url
0              Markov%27s_inequality        Markov's_inequality
1  Cauchy%E2%80%93Schwarz_inequality  Cauchy–Schwarz_inequality
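
Two practical points follow from the session above. The decoded titles are not ASCII (the dash in "Cauchy–Schwarz" is the en dash, U+2013), and a real column may contain missing values, on which unquote would raise a TypeError. The sketch below is one possible way to handle both. The helper name decode_title and the sample data are hypothetical and not part of the original answer.

```python
from urllib.parse import quote, unquote

import pandas as pd


def decode_title(value):
    """Decode percent-escapes in a page title; pass non-strings (None/NaN) through.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, str):
        return unquote(value)  # escapes are decoded as UTF-8 by default
    return value


# Hypothetical sample column, including a missing entry
df = pd.DataFrame({'url': ['Markov%27s_inequality',
                           'Cauchy%E2%80%93Schwarz_inequality',
                           None]})
df['clean_url'] = df['url'].map(decode_title)

decoded = df.loc[1, 'clean_url']
print(decoded)            # Cauchy–Schwarz_inequality
print(decoded.isascii())  # False: the en dash (U+2013) is not ASCII
print(quote(decoded))     # Cauchy%E2%80%93Schwarz_inequality (round trip)
```

The round trip through quote shows that nothing is lost by decoding: the percent-encoded form can be rebuilt whenever a URL is needed, while the decoded form is presumably what a title lookup expects, as the PageError in the question suggests.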