Warning: file_get_contents(/data/phpspider/zhask/data//catemap/2/python/297.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
用Python处理web链接_Python - Fatal编程技术网

用Python处理web链接

用Python处理web链接,python,Python,我在文件中存储了一些链接,如下所示: http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&

我在文件中存储了一些链接,如下所示:

http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&ei=DtN8WLfwFsKb1gKXho6YDw&expire=1484597102&mn=sn-p5qlsnss&mm=31&ipbits=0&nh=IgpwcjAzLmlhZDA3KgkxMjcuMC4wLjE&initcwndbps=4717500&mt=1484575249&pl=24&signature=1ECAB2B56C30CBF760721A1A26A7E80963DB36B8.6336B2C9C41DB53C8FA1D2A037793275F57C4825&ratebypass=yes&mime=video%2Fmp4&upn=tUcEt34Qe6c&sparams=dur%2Cei%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cratebypass%2Csource%2Cupn%2Cexpire&title=600ft+UFO+Crash+Site+Discovered+On+Mars%21+11%2F23%2F16
在链接的最后,我们有视频的标题。我想从文件中读取此链接,并以正确的格式获取视频标题(正确解析“+”和“%”符号)。我该怎么做

我不能使用原始cgi,因为链接是从文件读取的,而不是通过表单提交的。有什么想法吗?

您可以使用以下字符串:

In [17]: urllib.parse.parse_qs(s)
Out[17]: 
{'dur': ['1047.870'],
 'ei': ['DtN8WLfwFsKb1gKXho6YDw'],
 'expire': ['1484597102'],
 'http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag': ['22'],
[.. and so on ..]
 'source': ['youtube'],
 'sparams': ['dur,ei,id,initcwndbps,ip,ipbits,itag,lmt,mime,mm,mn,ms,mv,nh,pl,ratebypass,source,upn,expire'],
 'title': ['600ft UFO Crash Site Discovered On Mars! 11/23/16'],
 'upn': ['tUcEt34Qe6c']}

In [18]: urllib.parse.parse_qs(s)["title"][0]
Out[18]: '600ft UFO Crash Site Discovered On Mars! 11/23/16'
您可以使用并为其指定字符串:

In [17]: urllib.parse.parse_qs(s)
Out[17]: 
{'dur': ['1047.870'],
 'ei': ['DtN8WLfwFsKb1gKXho6YDw'],
 'expire': ['1484597102'],
 'http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag': ['22'],
[.. and so on ..]
 'source': ['youtube'],
 'sparams': ['dur,ei,id,initcwndbps,ip,ipbits,itag,lmt,mime,mm,mn,ms,mv,nh,pl,ratebypass,source,upn,expire'],
 'title': ['600ft UFO Crash Site Discovered On Mars! 11/23/16'],
 'upn': ['tUcEt34Qe6c']}

In [18]: urllib.parse.parse_qs(s)["title"][0]
Out[18]: '600ft UFO Crash Site Discovered On Mars! 11/23/16'

Python3提供了超级方便的,但是如果您使用的是Python2,您可能需要首先挖掘标题字符串

import urllib

url = 'http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&ei=DtN8WLfwFsKb1gKXho6YDw&expire=1484597102&mn=sn-p5qlsnss&mm=31&ipbits=0&nh=IgpwcjAzLmlhZDA3KgkxMjcuMC4wLjE&initcwndbps=4717500&mt=1484575249&pl=24&signature=1ECAB2B56C30CBF760721A1A26A7E80963DB36B8.6336B2C9C41DB53C8FA1D2A037793275F57C4825&ratebypass=yes&mime=video%2Fmp4&upn=tUcEt34Qe6c&sparams=dur%2Cei%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cratebypass%2Csource%2Cupn%2Cexpire&title=600ft+UFO+Crash+Site+Discovered+On+Mars%21+11%2F23%2F16'

title = url[url.rfind('&title=') + 7:]
print urllib.unquote_plus(title)
注意:感谢您指出,python 2中也提供了,所以只需:

import urlparse
print urlparse.parse_qs(url)['title'][0]

'600ft UFO Crash Site Discovered On Mars! 11/23/16'

Python3提供了超级方便的,但是如果您使用的是Python2,您可能需要首先挖掘标题字符串

import urllib

url = 'http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&ei=DtN8WLfwFsKb1gKXho6YDw&expire=1484597102&mn=sn-p5qlsnss&mm=31&ipbits=0&nh=IgpwcjAzLmlhZDA3KgkxMjcuMC4wLjE&initcwndbps=4717500&mt=1484575249&pl=24&signature=1ECAB2B56C30CBF760721A1A26A7E80963DB36B8.6336B2C9C41DB53C8FA1D2A037793275F57C4825&ratebypass=yes&mime=video%2Fmp4&upn=tUcEt34Qe6c&sparams=dur%2Cei%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cratebypass%2Csource%2Cupn%2Cexpire&title=600ft+UFO+Crash+Site+Discovered+On+Mars%21+11%2F23%2F16'

title = url[url.rfind('&title=') + 7:]
print urllib.unquote_plus(title)
注意:感谢您指出,python 2中也提供了,所以只需:

import urlparse
print urlparse.parse_qs(url)['title'][0]

'600ft UFO Crash Site Discovered On Mars! 11/23/16'

使用
urlparse.parse\u qs

try:
    from urlparse import urlparse           # for python2
except:
    from urllib import parse as urlparse    # for python3

rv = urlparse.parse_qs(link)
title = rv['title'][0]

使用
urlparse.parse\u qs

try:
    from urlparse import urlparse           # for python2
except:
    from urllib import parse as urlparse    # for python3

rv = urlparse.parse_qs(link)
title = rv['title'][0]
可以满足您的需要:

import purl

u = purl.URL('http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&ei=DtN8WLfwFsKb1gKXho6YDw&expire=1484597102&mn=sn-p5qlsnss&mm=31&ipbits=0&nh=IgpwcjAzLmlhZDA3KgkxMjcuMC4wLjE&initcwndbps=4717500&mt=1484575249&pl=24&signature=1ECAB2B56C30CBF760721A1A26A7E80963DB36B8.6336B2C9C41DB53C8FA1D2A037793275F57C4825&ratebypass=yes&mime=video%2Fmp4&upn=tUcEt34Qe6c&sparams=dur%2Cei%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cratebypass%2Csource%2Cupn%2Cexpire&title=600ft+UFO+Crash+Site+Discovered+On+Mars%21+11%2F23%2F16')
print(u.query_param('title'))
可以满足您的需要:

import purl

u = purl.URL('http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&ei=DtN8WLfwFsKb1gKXho6YDw&expire=1484597102&mn=sn-p5qlsnss&mm=31&ipbits=0&nh=IgpwcjAzLmlhZDA3KgkxMjcuMC4wLjE&initcwndbps=4717500&mt=1484575249&pl=24&signature=1ECAB2B56C30CBF760721A1A26A7E80963DB36B8.6336B2C9C41DB53C8FA1D2A037793275F57C4825&ratebypass=yes&mime=video%2Fmp4&upn=tUcEt34Qe6c&sparams=dur%2Cei%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cratebypass%2Csource%2Cupn%2Cexpire&title=600ft+UFO+Crash+Site+Discovered+On+Mars%21+11%2F23%2F16')
print(u.query_param('title'))
从2.6开始就有了,只是模块发生了变化。从2.6开始就有了,只是模块发生了变化。