Python: finding links using bs4 (tags: python, web-scraping, beautifulsoup)


I am trying to get a link from a script tag using bs4.

This is the tag I want to scrape the link from:

html = """<script type="text/javascript">var player = new Clappr.Player({
    sources: ["https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkdqatbvqbc5axyv4dpuq/v.mp4","https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkyyarbvqbc5dtluomera/v.mp4"]

    poster: "image.jpg",
    width: "100%",
height: "100%",
disableVideoTagContextMenu: true,
    parentId: "#vplayer",
    events: {
    onReady: function() {  },
    }"""
Both links point to the same file, so I only need one of them.

Note: the domain name changes every time, so I cannot search for example.com.

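import re

html = """<script type="text/javascript">var player = new Clappr.Player({
    sources: ["https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkdqatbvqbc5axyv4dpuq/v.mp4","https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkyyarbvqbc5dtluomera/v.mp4"]

    poster: "image.jpg",
    width: "100%",
    height: "100%",
    disableVideoTagContextMenu: true,
    parentId: "#vplayer",
    events: {
    onReady: function() {  },
    }"""

# Non-greedy match: each run of text from "https" up to the nearest "mp4",
# so the pattern does not depend on the domain name.
match = re.findall(r"https.+?mp4", html)
print(match)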
Output:

['https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkdqatbvqbc5axyv4dpuq/v.mp4', 'https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkyyarbvqbc5dtluomera/v.mp4']
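Since both URLs point to the same video and only one is needed, `re.search` can stop at the first match instead of collecting all of them. The following is a minimal sketch, assuming the URL always sits inside double quotes and ends in `.mp4`; the helper name `first_mp4_url` and the `sample` string are illustrative and not part of the original answer.

```python
import re


def first_mp4_url(text):
    """Return the first double-quoted http(s) URL ending in .mp4, or None."""
    # [^"]+? keeps the match inside one pair of double quotes,
    # so the domain can be anything.
    m = re.search(r'"(https?://[^"]+?\.mp4)"', text)
    return m.group(1) if m else None


# Hypothetical input with two different domains.
sample = 'sources: ["https://a.example/x/v.mp4","https://b.example/y/v.mp4"]'
print(first_mp4_url(sample))
```

Returning `None` when nothing matches lets the caller decide how to handle a page whose player markup has changed.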

# Alternatively, capture the whole sources array as a single string.
match = re.search(r"sources:\s*(\[.+\])", html).group(1)
print(match)
Output:

["https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkdqatbvqbc5axyv4dpuq/v.mp4","https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkyyarbvqbc5dtluomera/v.mp4"]
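The captured array is still a single string. One possible way to turn it into a Python list and pick one link is `json.loads`, sketched below. This assumes the array happens to be valid JSON (double-quoted strings, no trailing comma) and that no URL contains a `]`; the helper name `parse_sources` and the `sample` string are illustrative and not from the original answer.

```python
import json
import re


def parse_sources(text):
    """Return the URLs in the `sources: [...]` array as a list of strings."""
    # Lazy .*? stops at the first closing bracket; DOTALL allows the
    # array to span several lines.
    m = re.search(r"sources:\s*(\[.*?\])", text, re.DOTALL)
    if m is None:
        return []
    # Assumption: the JavaScript array literal is also valid JSON.
    return json.loads(m.group(1))


# Hypothetical player snippet with two different domains.
sample = (
    'var player = new Clappr.Player({ sources: '
    '["https://a.example/x/v.mp4","https://b.example/y/v.mp4"], '
    'poster: "image.jpg" })'
)
urls = parse_sources(sample)
print(urls[0])
```

If the page ever emits single-quoted strings, `json.loads` will raise `json.JSONDecodeError`, in which case the plain regex approach is the safer fallback.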