Python: finding a link with bs4
I am trying to use bs4 to get a link from a script tag. This is the tag I want to extract the link from:
html = """<script type="text/javascript">var player = new Clappr.Player({
sources: ["https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkdqatbvqbc5axyv4dpuq/v.mp4","https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkyyarbvqbc5dtluomera/v.mp4"]
poster: "image.jpg",
width: "100%",
height: "100%",
disableVideoTagContextMenu: true,
parentId: "#vplayer",
events: {
onReady: function() { },
}"""
The links match, so I only need one of them.
Note:
The domain name changes every time,
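so I cannot search for example.com.

import re

# Same markup as in the question, kept as a plain string
html = """<script type="text/javascript">var player = new Clappr.Player({
sources: ["https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkdqatbvqbc5axyv4dpuq/v.mp4","https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkyyarbvqbc5dtluomera/v.mp4"]
poster: "image.jpg",
width: "100%",
height: "100%",
disableVideoTagContextMenu: true,
parentId: "#vplayer",
events: {
onReady: function() { },
}"""

# Non-greedy match from "https" to the nearest "mp4", so the domain does not matter
match = re.findall(r"https.+?mp4", html)
print(match)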
Output:
['https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkdqatbvqbc5axyv4dpuq/v.mp4', 'https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkyyarbvqbc5dtluomera/v.mp4']
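Since the question asks about bs4, the pattern above can also be combined with BeautifulSoup so that only the relevant script tag is searched rather than the whole page. The following is a minimal sketch, not the answerer's code: it assumes the page contains a script tag whose text mentions Clappr.Player, and the names page, target and first_link, as well as the shortened URLs, are made up for illustration.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical page; the shortened URLs stand in for the real ones
page = """<html><body>
<script type="text/javascript">var unrelated = 1;</script>
<script type="text/javascript">var player = new Clappr.Player({
sources: ["https://example.com/aaa/v.mp4","https://example.com/bbb/v.mp4"]
poster: "image.jpg",
});</script>
</body></html>"""

soup = BeautifulSoup(page, "html.parser")

# Find the first <script> whose inline text mentions the player (assumed marker)
target = None
for script in soup.find_all("script"):
    text = script.string or ""
    if "Clappr.Player" in text:
        target = text
        break

# Match any domain: from the scheme up to the nearest ".mp4", stopping at quotes
links = re.findall(r"https?://[^\"']+?\.mp4", target or "")

# The links are interchangeable, so keep only the first one
first_link = links[0] if links else None
print(first_link)
```

Because the pattern does not mention the domain, it should keep working when the host name changes, as long as the links still end in .mp4.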
Or
# Capture the whole sources array; \s* allows for the space after the colon
match = re.search(r"sources:\s*(\[.+\])", html).group(1)
print(match)
Output:
["https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkdqatbvqbc5axyv4dpuq/v.mp4","https://example.com/zx5x4vxkb52dxcne4zwsbbn6rpafhxnsodnlcjifkyyarbvqbc5dtluomera/v.mp4"]
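The second approach returns the array as a single string. Since only one link is needed, that string could be parsed into a real list and the first item taken. This is a rough sketch under the assumption that the sources array is valid JSON (double-quoted strings, no trailing comma); the shortened URLs and the names found, sources and link are illustrative only.

```python
import json
import re

# Shortened, hypothetical version of the markup from the question
html = """<script type="text/javascript">var player = new Clappr.Player({
sources: ["https://example.com/aaa/v.mp4","https://example.com/bbb/v.mp4"]
poster: "image.jpg",
}"""

# Capture the bracketed array that follows "sources:"
found = re.search(r"sources:\s*(\[.+?\])", html)

# Parse the captured text as JSON; fall back to an empty list if nothing matched
sources = json.loads(found.group(1)) if found else []

# Either link will do, so take the first
link = sources[0] if sources else None
print(link)
```

If the array ever used single quotes or other JavaScript-only syntax, json.loads would raise an error, and the plain re.findall approach would be the safer choice.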