Python: scraping a Google Scholar security page


I have a string like this:

url = 'http://scholar.google.pl/citations?view_op\x3dsearch_authors\x26hl\x3dpl\x26oe\x3dLatin2\x26mauthors\x3dlabel:security\x26after_author\x3drukAAOJ8__8J\x26astart\x3d10'

I want to convert it to:

converted_url = 'https://scholar.google.pl/citations?view_op=search_authors&hl=en&mauthors=label:security&after_author=rukAAOJ8__8J&astart=10'

I tried this:

converted_url = url.decode('utf-8')

However, it raises the following error:

AttributeError: 'str' object has no attribute 'decode'

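decode is used to convert bytes into a string, and your url is already a string, not bytes. That is why Python 3 reports that 'str' has no attribute 'decode'.

You can use encode to convert this string to bytes, and then use decode to convert it to the correct string. (I use the prefix r to simulate text with this problem; a url without the prefix does not need to be converted.)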
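The encode-then-decode round trip described above can be sketched as follows. This is only a minimal sketch: the text does not name the codec, so the use of unicode_escape for the decode step is an assumption, and the raw string is there only to reproduce the literal backslashes.

```python
# The r prefix keeps every backslash literal, simulating text that really
# contains the characters "\x3d" and "\x26" instead of "=" and "&".
url = r'http://scholar.google.pl/citations?view_op\x3dsearch_authors\x26hl\x3dpl\x26oe\x3dLatin2\x26mauthors\x3dlabel:security\x26after_author\x3drukAAOJ8__8J\x26astart\x3d10'
print(url)

# str -> bytes with encode(), then bytes -> str with decode().
# Assumption: the unicode_escape codec is what interprets the \xNN sequences.
converted_url = url.encode('utf-8').decode('unicode_escape')
print(converted_url)
```

Note that unicode_escape reads the bytes as Latin-1, which is harmless for an ASCII-only URL like this one but may mangle text containing non-ASCII characters.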

Result:

http://scholar.google.pl/citations?view_op\x3dsearch_authors\x26hl\x3dpl\x26oe\x3dLatin2\x26mauthors\x3dlabel:security\x26after_author\x3drukAAOJ8__8J\x26astart\x3d10

http://scholar.google.pl/citations?view_op=search_authors&hl=pl&oe=Latin2&mauthors=label:security&after_author=rukAAOJ8__8J&astart=10

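By the way: first check print(url). Maybe you already have the correct url but are using the wrong method to display it. The Python shell displays every result using print(repr()) rather than print(), so it shows some characters as codes, which helps reveal the encoding used in the text (utf-8, iso-8859-1, win-1250, latin-1, etc.).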
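To illustrate the difference between the two ways of displaying a string, here is a small sketch. The variable names plain and literal are hypothetical, chosen only for this example; the shortened URL fragment is taken from the string above.

```python
# Normal literal: Python resolves \x3d to "=" while parsing the source code.
plain = 'view_op\x3dsearch_authors'

# Raw literal: the backslash is really part of the text.
literal = r'view_op\x3dsearch_authors'

print(plain)           # what the string actually contains
print(repr(plain))     # what the interactive shell would echo
print(literal)         # a single backslash is visible
print(repr(literal))   # repr doubles the backslash to show it is literal

# A quick check for whether a conversion is needed at all.
needs_conversion = '\\' in literal
```

If print(url) already shows = and & in the URL, the string is fine and no conversion is needed.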
