'list' object has no attribute 'decode' - Python error

python regex utf-8

I'm having trouble decoding some data in a string.

import re
import urllib.request

respData = urllib.request.urlopen(
    'https://www.mcdelivery.com.pk/pk/browse/menu.html')

resp = respData.read()

link = re.findall(r'<ul class="secondary-menu">(.*?)</ul>', str(resp))
# URLS
Urls = re.findall("href=[\"\'](.*?)[\"\']", str(link))

# remove amp from the urls
Url1 = [re.sub(r'amp;', '', item) for item in Urls]
# menu
deals = re.findall(r'<span>(.*?)</span>', str(link))
print(deals)

I want to change this

\\\\xe2\\\\x98\\\\x85What's New\\\\xe2\\\\x98\\\\x85", "\\\\xc3\\\\x80 la carte & Value Meals, McCaf\\\\xc3\\\\xa9"

to

"What's New", "la carte & Value Meals", "McCafe"

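I believe it has something to do with .decode('utf-8'), so I applied it to the line where I extract the deals with the regex

deals = re.findall(r'<span>(.*?)</span>', str(link)).decode('uts-8')

but it raises an error: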

deals = re.findall(r'<span>(.*?)</span>', str(link)).decode('uts-8')
AttributeError: 'list' object has no attribute 'decode'

Summary:

I know my problem can be solved with decode('utf-8'), but I can't apply it correctly. I just need help decoding the data I get in deals.

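The string has been escaped (str() was applied to the bytes and then to the list), so you see things like

"\\\\"

which stands for

"\"

So

\\\\xe2\\\\x98\\\\x85What's New\\\\xe2\\\\x98\\\\x85

is actually

★What's New★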
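To make the escaping concrete, here is a minimal sketch of how the backslashes pile up. It assumes the page bytes are UTF-8; the variable names (`raw`, `as_text`, `nested`, `decoded`) are illustrative and not from the original post.

```python
# Bytes as returned by read(); built here from a known string for illustration.
raw = "★What's New★".encode("utf-8")

as_text = str(raw)             # repr of the bytes, NOT decoded text
nested = str([as_text])        # repr of a list: each backslash is doubled again
decoded = raw.decode("utf-8")  # the actual text

print(as_text)   # b"\xe2\x98\x85What's New\xe2\x98\x85"
print(nested)    # same text with doubled backslashes, wrapped in [...]
print(decoded)   # ★What's New★
```

Each call to str() on bytes, or on a list containing such strings, adds another layer of escaping, which is why decoding the bytes directly avoids the problem.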

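So we can handle it in two steps. First, collapse the doubled backslashes:
str1 = """\\\\xe2\\\\x98\\\\x85What's New\\\\xe2\\\\x98\\\\x85", "\\\\xc3\\\\x80 la carte & Value Meals, McCaf\\\\xc3\\\\xa9"""
str2 = str1.replace("\\\\", "\\")
print(str2)
# \xe2\x98\x85What's New\xe2\x98\x85", "\xc3\x80 la carte & Value Meals, McCaf\xc3\xa9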

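Then we treat the result as a bytes literal and decode it:
str1 = b"""\xe2\x98\x85What's New\xe2\x98\x85", "\xc3\x80 la carte & Value Meals, McCaf\xc3\xa9"""
str2 = str1.decode("utf-8")
print(str2)
# ★What's New★", "À la carte & Value Meals, McCafé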
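The second step above pastes the text into a bytes literal by hand. As a rough sketch, the same two steps can be done programmatically; `fix_double_escaped` is a hypothetical helper name, and it assumes the input contains only ASCII characters plus `\\xNN` escape text of UTF-8 bytes.

```python
def fix_double_escaped(text: str) -> str:
    """Turn text like '\\\\xe2\\\\x98\\\\x85' back into the character it encodes."""
    # Step 1: collapse doubled backslashes into single ones.
    single = text.replace("\\\\", "\\")
    # Step 2: interpret the \xNN escapes as byte values, then decode as UTF-8.
    raw = single.encode("ascii").decode("unicode_escape").encode("latin-1")
    return raw.decode("utf-8")


# Raw string: each escape really starts with two backslash characters.
sample = r"\\xe2\\x98\\x85What's New\\xe2\\x98\\x85"
fixed = fix_double_escaped(sample)
print(fixed)  # ★What's New★
```

This only repairs text that is already damaged; decoding the response bytes before running any regex avoids the need for it.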

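Hope this helps.

Are you deliberately applying re.findall to a stringified list, or are you unaware that re.findall returns a list? It must be deliberate, since you cast the result to str. But why?

Yes, I know it returns a list. The issue here is that I'm trying to decode what is stored in deals.

That's not how it's done. You need to iterate over each string in the list and then run the additional search on each string.

Can you show me how to do that? That would be great!!

Another problem you'll face is that the page is loaded via JavaScript, so you'll need something like Selenium to get the JavaScript-rendered HTML.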
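The advice in the comments could be sketched roughly as follows: decode the response bytes once, then search each matched string instead of a stringified list. re.findall returns a list, and decode is a method of bytes, which is why calling it on the result fails. The `resp` value below is a made-up stand-in for `respData.read()` so the example runs offline; the real page's markup may differ, and the names `page`, `menus`, `urls` are illustrative.

```python
import re
from html import unescape

# Hypothetical sample of what respData.read() might return (UTF-8 bytes).
resp = (
    b'<ul class="secondary-menu">'
    b'<li><a href="/pk/a?x=1&amp;y=2"><span>\xe2\x98\x85What\'s New\xe2\x98\x85</span></a></li>'
    b'<li><a href="/pk/b"><span>McCaf\xc3\xa9</span></a></li>'
    b'</ul>'
)

page = resp.decode("utf-8")  # decode the bytes once, before any regex

menus = re.findall(r'<ul class="secondary-menu">(.*?)</ul>', page, re.DOTALL)

urls, deals = [], []
for menu in menus:  # search each matched string, not str(list)
    hrefs = re.findall(r"""href=["'](.*?)["']""", menu)
    urls += [unescape(h) for h in hrefs]  # turns &amp; back into &
    deals += re.findall(r"<span>(.*?)</span>", menu)

print(deals)  # ["★What's New★", 'McCafé']
print(urls)   # ['/pk/a?x=1&y=2', '/pk/b']
```

If the menu is only rendered by JavaScript, as the last comment suggests, the same parsing would have to run on the rendered HTML rather than on the raw response.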