Python: unable to remove escape characters from printed output


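Hi, I'm trying to extract information into a list containing plain text, but I can't find a way to remove the escape characters.

I'm very new to Python and to programming in general. I've been trying to solve this problem but couldn't find a solution.

Here is my code: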

import urllib
import re
from bs4 import BeautifulSoup


x=1
while x<2:

    url = "http://search.insing.com/ts/food-drink/bars-pubs/bars-pubs?page=" +str(x)
    htmlfile = urllib.urlopen(url).read()
    soup = BeautifulSoup(htmlfile.decode('utf-8','ignore'))
    reshtml = soup.find("div", "results").find_all("h3")
    reslist = []
    for item in reshtml:
            res = item.get_text()
            reslist.append(res)

    print reslist
    x += 1
The current output looks like this:

[u'\n\r\n                Parco Caffe\n', 
 u'\n\r\n                AdstraGold Microbrewery & Bistro Bar\n', 
 u'\n\r\n                Alkaff Mansion Ristorante\n', 
 u'\n\r\n                The Fat Cat Bistro\n', 
 u'\n\r\n                Gravity Bar\n', 
 u'\n\r\n                The Wine Company\r\n                    (Evans Road)\r\n                \n', 
 u'\n\r\n                Serenity Spanish Bar & Restaurant\r\n                    (VivoCity)\r\n                \n', 
 u'\n\r\n                The New Harbour Cafe & Bar\n', 
 u'\n\r\n                Indian Times\n', 
 u'\n\r\n                Sunset Bay Beach Bar\n', 
 u'\n\r\n                Friends @ Jelita\n', 
 u'\n\r\n                Talk Cock Sing Song @ Thomson\n', 
 u'\n\r\n                En Japanese Dining Bar\r\n                    (UE Square)\r\n                \n', 
 u'\n\r\n                Magma German Wine Bistro\n', 
 u"\n\r\n                Tam Kah Shark's Fin\n", 
 u'\n\r\n                Senso Ristorante & Bar\n', 
 u'\n\r\n                Hard Rock Cafe\r\n                    (HPL House)\r\n                \n', 
 u'\n\r\n                St. James Power Station \n', 
 u'\n\r\n                The St. James\n', 
 u'\n\r\n                Brotzeit German Bier Bar & Restaurant\r\n                    (Vivocity)\r\n                \n']
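
A note on what this output shows: printing a list displays the `repr()` of each element, so the `\n` and `\r` sequences are real whitespace characters rendered as escapes, not literal backslashes stored in the strings. The following is a minimal sketch of that distinction. It is written for Python 3 (the question's code is Python 2, where the same behaviour applies), and the variable names are illustrative only.

```python
# Sample value rebuilt from the first entry of the output above.
name = "\n\r\n" + " " * 16 + "Parco Caffe\n"

# A list is displayed using repr() of each element, so whitespace shows up as escapes.
list_view = str([name])

# The string itself contains actual newline and space characters.
item_view = str(name)

print(list_view)
print(item_view)

# Because these are ordinary whitespace characters, strip() removes them from both ends.
cleaned = name.strip()
print(cleaned)
```

So the task is whitespace cleanup rather than removing escape characters as such.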
Add the following lines before the print:

reslist = [y.replace('\n','').replace('\r','') for y in reslist]
reslist = [y.strip() for y in reslist]
This gives me the following output:

[u'Alkaff Mansion Ristorante', 
 u'Parco Caffe', 
 u'AdstraGold Microbrewery & Bistro Bar', 
 u'Gravity Bar', 
 u'The Fat Cat Bistro', 
 u'The Wine Company                    (Evans Road)', 
 u'Serenity Spanish Bar & Restaurant                    (VivoCity)', 
 u'The New Harbour Cafe & Bar', 
 u'Indian Times', 
 u'Sunset Bay Beach Bar', 
 u'Friends @ Jelita', 
 u'Talk Cock Sing Song @ Thomson', 
 u'En Japanese Dining Bar                    (UE Square)', 
 u'Magma German Wine Bistro', 
 u"Tam Kah Shark's Fin", 
 u'Senso Ristorante & Bar', 
 u'Hard Rock Cafe                    (HPL House)', 
 u'St. James Power Station', 
 u'The St. James', 
 u'Brotzeit German Bier Bar & Restaurant                    (Vivocity)']
Is this what you are looking for?


Py's answer is much better and more specific to BeautifulSoup.
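
To see why the `replace`/`strip` approach still leaves long runs of spaces inside names such as "The Wine Company (Evans Road)", here is a small sketch in Python 3. The sample string is rebuilt from the question's output, and the variable names are only for illustration.

```python
indent = " " * 16

# One entry from the question's output, rebuilt with explicit indentation.
raw = "\n\r\n" + indent + "The Wine Company\r\n" + indent + "    (Evans Road)\r\n" + indent + "\n"

# Step 1: delete only the newline and carriage-return characters.
no_breaks = raw.replace("\n", "").replace("\r", "")

# Step 2: trim whitespace from the two ends of the string.
trimmed = no_breaks.strip()

# The indentation that sat between the two source lines is still inside the string.
print(repr(trimmed))
```

`strip()` only touches the ends of a string, so interior whitespace has to be collapsed by other means.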

It looks like you are really after the anchor text. Consider changing:

reshtml = soup.find("div", "results").find_all("h3")
to:

Also change:

reslist.append(res)
to:

reslist.append(' '.join(res.split()))
Here is what I get after making the changes:

[u'Parco Caffe', u'AdstraGold Microbrewery & Bistro Bar', 
 u'Alkaff Mansion Ristorante', u'The Fat Cat Bistro', u'Gravity Bar', 
 u'The Wine Company (Evans Road)', u'Serenity Spanish Bar & Restaurant (VivoCity)', 
 u'The New Harbour Cafe & Bar', u'Indian Times', u'Sunset Bay Beach Bar',  
 u'Friends @ Jelita', u'Talk Cock Sing Song @ Thomson',  
 u'En Japanese Dining Bar (UE Square)', u'Magma German Wine Bistro',  
 u"Tam Kah Shark's Fin", u'Senso Ristorante & Bar',  
 u'Hard Rock Cafe (HPL House)', u'St. James Power Station',  
 u'The St. James', u'Brotzeit German Bier Bar & Restaurant (Vivocity)']

What is your expected output compared with the actual output? Also, I don't understand the x=1; while x<2 part.
This works great. Thanks
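
The `' '.join(res.split())` idiom used above can be wrapped in a small helper. The sketch below is written for Python 3. The function names are hypothetical, and the regex variant is an alternative added here for comparison, not something taken from the answers.

```python
import re


def normalize_whitespace(text):
    """Collapse every run of whitespace to a single space and trim both ends."""
    # str.split() with no arguments splits on any whitespace run and drops empty pieces.
    return " ".join(text.split())


def normalize_whitespace_re(text):
    """Regex variant of the same idea (an alternative, not from the answers)."""
    return re.sub(r"\s+", " ", text).strip()


indent = " " * 16

# Two entries rebuilt from the question's raw output.
samples = [
    "\n\r\n" + indent + "Parco Caffe\n",
    "\n\r\n" + indent + "The Wine Company\r\n" + indent + "    (Evans Road)\r\n" + indent + "\n",
]

cleaned = [normalize_whitespace(s) for s in samples]
cleaned_re = [normalize_whitespace_re(s) for s in samples]
print(cleaned)
```

For these samples both helpers give the same result. The split/join form needs no import, which is probably why it is the more common choice for this kind of cleanup.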