Python: removing \r\n and whitespace


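How can I remove all blank lines from the printed text using BS and Python? I'm still new to this, and I think what I'm describing is called whitespace.

Current output: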

02:00 - 05:00 NHL: Columbus Blue Jackets at San Jose Sharks

 - Channel 60







02:30 - 04:30 NCAAB: Quinnipiac vs Fairfield

 - Channel 04







03:00 - 05:00 MLS: Portland Timbers at Los Angeles Galaxy

 - Channel 05

Desired output:

02:00 - 05:00 NHL: Columbus Blue Jackets at San Jose Sharks - Channel 60
02:30 - 04:30 NCAAB: Quinnipiac vs Fairfield - Channel 04 
03:00 - 05:00 MLS: Portland Timbers at Los Angeles Galaxy - Channel 05

Code:

import urllib, urllib2, re, HTMLParser, os
from bs4 import BeautifulSoup
import os

pg_source = ''
req = urllib2.Request('http://rushmore.tv/schedule')
req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36')

try:
    response = urllib2.urlopen(req)
    pg_source = response.read().decode('utf-8' , 'ignore')
    response.close()
except:
    pass

content = []
soup = BeautifulSoup(pg_source)
content = BeautifulSoup(soup.find('ul', { 'id' : 'myUL' }).prettify())

print (content.text)
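
With a little work, you can build the output you want. The code for this and its result are listed further down this page, after the comments.

A very simple way to get the same result with less code is to use the requests module.

Here is the code: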

import requests
from bs4 import BeautifulSoup

html = requests.get('http://rushmore.tv/schedule').text

soup = BeautifulSoup(html,'lxml')

ul = soup.find('ul', { 'id' : 'myUL' })

for content in ul.find_all('li'):
    print(content.text)

Try this. It works fine for me.
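
If neither lxml nor BeautifulSoup is available in the target environment, the same idea can be sketched with only the standard library. This is a minimal sketch under assumptions, not the answerer's method: the class name ListItemText and the markup in sample_html are made up for illustration (the real page's HTML is not shown here), and it does not handle nested ul elements. Only the id myUL comes from the code above.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2.7, as used in the question


class ListItemText(HTMLParser):
    """Collect whitespace-normalised text of each <li> inside <ul id="myUL">.

    Hypothetical helper for illustration; nested <ul> elements are not handled.
    """

    def __init__(self):
        HTMLParser.__init__(self)
        self.items = []       # one finished string per <li>
        self._inside = False  # True while inside the target <ul>
        self._parts = None    # words of the <li> being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == 'ul' and dict(attrs).get('id') == 'myUL':
            self._inside = True
        elif tag == 'li' and self._inside:
            self._parts = []

    def handle_data(self, data):
        if self._parts is not None:
            # split() without arguments drops \r, \n, tabs and repeated spaces
            self._parts.extend(data.split())

    def handle_endtag(self, tag):
        if tag == 'li' and self._parts is not None:
            self.items.append(' '.join(self._parts))
            self._parts = None
        elif tag == 'ul':
            self._inside = False


# Assumed markup, only to make the sketch runnable without a network request.
sample_html = (
    '<ul id="myUL">\r\n'
    '  <li><a href="#">02:00 - 05:00 NHL: Columbus Blue Jackets at San Jose Sharks\r\n'
    '      <span> - Channel 60</span></a></li>\r\n\r\n\r\n'
    '  <li><a href="#">02:30 - 04:30 NCAAB: Quinnipiac vs Fairfield\r\n'
    '      <span> - Channel 04</span></a></li>\r\n'
    '</ul>'
)

parser = ListItemText()
parser.feed(sample_html)
parser.close()
schedule = '\n'.join(parser.items)
print(schedule)
```

Note that splitting on whitespace also collapses repeated spaces inside an entry, which goes slightly further than only removing the blank lines.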

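Use the strip() function on the string.

That's very kind of you, and this would be perfect, but Kodi doesn't use lxml (this script will be used in Kodi). I've been reading about ElementTree. Do you know whether I can use it instead of lxml? Thanks.

OK, so I switched to html.parser. I'm now facing a Kodi issue that I need to resolve.

This works for me, but is there an alternative to print? When I print, I don't use the print function. I'm using xbmc.gui, and to print with it I have to pass it a string.

print is just a function that accepts a string. You can do whatever you want with that string.

But the print on this string includes ('\n'.join(' '.join(l) for l in zip(text[::2], text[1::2]))). How can I change it to a simpler print(string)? I won't be using the print function, because I'll be using this script on Kodi and need to supply a plain string. Sorry for my ignorance, I'm still new to Python. Thanks.

Everything between the () of the print function produces a string. You can simply assign it to a variable. For example: mystring = "\n".joi….
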
Code:

import urllib2
from bs4 import BeautifulSoup

pg_source = ''
req = urllib2.Request('http://rushmore.tv/schedule')
req.add_header('User-Agent',
               'Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 '
               '(KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36')

try:
    response = urllib2.urlopen(req)
    pg_source = response.read().decode('utf-8', 'ignore')
    response.close()
except urllib2.URLError:
    # Download failed; pg_source stays empty.
    pass

soup = BeautifulSoup(pg_source, 'html.parser')
# Re-parse the prettified list so that every text node sits on its own line.
content = BeautifulSoup(soup.find('ul', {'id': 'myUL'}).prettify(), 'html.parser')

# Drop blank lines and strip surrounding whitespace (including \r).
text = [l.strip() for l in content.text.split('\n') if l.strip()]
# Join each entry with the channel line that follows it.
print('\n'.join(' '.join(l) for l in zip(text[::2], text[1::2])))

Result:

21:00 - 23:00 NCAAB:    Pepperdine vs Saint Mary's - Channel 03
21:30 - 00:00 AFL: Gold Coast vs. Geelong - Channel 47
22:00 - 00:00 A-League: Western Sydney Wanderers vs Perth Glory - BT Sport 1
22:45 - 03:00 Ski Classic: Mora - Channel 93
23:00 - 00:30 Freestyle Skiing WC: Ski Cross - Channel 106
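
To pass the text to something other than print, as asked in the comments, the join expression can be wrapped in a function that returns the string. The sketch below is one way to do that, assuming (as the code above does) that each entry yields exactly two non-empty lines. The function name pair_lines and the sample text are hypothetical, and no Kodi API is shown because none appears in the thread.

```python
def pair_lines(raw_text):
    """Drop blank lines from raw_text and join each remaining pair of lines.

    Hypothetical helper; it assumes entries always come as two non-empty
    lines (title, then channel), just like the zip() call it is based on.
    """
    # splitlines() handles both \n and \r\n; strip() removes leftover spaces.
    lines = [l.strip() for l in raw_text.splitlines() if l.strip()]
    # Pair line 0 with line 1, line 2 with line 3, and so on.
    return '\n'.join(' '.join(pair) for pair in zip(lines[::2], lines[1::2]))


# Sample text shaped like the "current output" in the question.
sample = (
    "02:00 - 05:00 NHL: Columbus Blue Jackets at San Jose Sharks\r\n"
    "\r\n"
    " - Channel 60\r\n"
    "\r\n"
    "\r\n"
    "\r\n"
    "02:30 - 04:30 NCAAB: Quinnipiac vs Fairfield\r\n"
    "\r\n"
    " - Channel 04\r\n"
)

mystring = pair_lines(sample)  # a plain string, usable wherever one is needed
print(mystring)
```

If an entry ever has one or three non-empty lines, the pairing shifts for everything after it, so joining the text of each list item separately is the more robust choice when the markup allows it.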