Python: getting hrefs from Instagram


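Is there any way other than Selenium to get the "a href" values from ""? With the API I can only get links to images of the following type: I don't need those. I want to get links like "". I tried to do this with bs4 and the "lxml" parser, but the HTML I get back contains no "a href".
I need to know whether it is possible to get this information at all. Evidently the extra content is generated by JavaScript. So, is there a way to scrape this data other than Selenium WebDriver?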

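All the information you are looking for is in the JSON embedded in the page source.

You can extract it with the following regular expression: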

from bs4 import BeautifulSoup as soup
import requests
import json
import re

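def _get_json_footer(html):
    # Pull the JSON object that follows "entry_data" out of the page source.
    s = str(html)
    r = re.compile('"entry_data":(.*?),"gatekeepers"')
    m = r.search(s)
    if not m:
        # Without this check, `result` would be undefined when nothing matches.
        raise ValueError('"entry_data" JSON not found in the page')
    return json.loads(m.group(1))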

url = 'https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/'
page = requests.get(url)
html = soup(page.text, 'html.parser')
json_footer = _get_json_footer(html)

tagpage = json_footer.get('TagPage')
You can then navigate the tagpage dict to get the data.
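To see what "navigating the dict" looks like without touching the network, here is a minimal sketch that runs the same regular expression over a small hand-written string. The SAMPLE string, its values, and the helper name extract_entry_data are assumptions made up for illustration; a real page is much larger, and its layout may differ or change.

```python
import json
import re

# Hypothetical, hand-written stand-in for the page source; not real Instagram output.
SAMPLE = ('{"config":{},"entry_data":{"TagPage":[{"graphql":{"hashtag":'
          '{"name":"example"}}}]},"gatekeepers":{}}')

def extract_entry_data(text):
    """Return the JSON object that follows "entry_data", or None if it is absent."""
    m = re.search(r'"entry_data":(.*?),"gatekeepers"', text)
    return json.loads(m.group(1)) if m else None

entry_data = extract_entry_data(SAMPLE)
tagpage = entry_data.get('TagPage')

print(list(tagpage[0].keys()))            # keys of the first TagPage item
print(tagpage[0]['graphql']['hashtag'])   # one level deeper
```

Printing the keys at each level is usually the quickest way to find where the values you need are stored.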

Edit:

To get the post links, you just need to navigate the tagpage dict:

from bs4 import BeautifulSoup as soup
import requests
import json
import re

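def _get_json_footer(html):
    # Pull the JSON object that follows "entry_data" out of the page source.
    s = str(html)
    r = re.compile('"entry_data":(.*?),"gatekeepers"')
    m = r.search(s)
    if not m:
        # Without this check, `result` would be undefined when nothing matches.
        raise ValueError('"entry_data" JSON not found in the page')
    return json.loads(m.group(1))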

url = 'https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/'
page = requests.get(url)
html = soup(page.text, 'html.parser')
json_footer = _get_json_footer(html)

tagpage = json_footer.get('TagPage')

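links = []
# tagpage is a list; the post edges live under graphql -> hashtag -> edge_hashtag_to_media
edges = (tagpage or [{}])[0].get('graphql',{}).get('hashtag',{}).get('edge_hashtag_to_media',{}).get('edges',[])
for e in edges:
    shortcode = e.get('node',{}).get('shortcode')
    if shortcode:  # skip edges without a shortcode instead of emitting a broken URL
        links.append("https://www.instagram.com/p/"+shortcode+'/')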

print(links)
Output:

['https://www.instagram.com/p/Bsh4UcdBRvY/', 'https://www.instagram.com/p/Bq8vAMRHtGB/', 'https://www.instagram.com/p/Bn_vfeWhcYL/', 'https://www.instagram.com/p/Bm1QRb2ntWL/', 'https://www.instagram.com/p/Bj5pLHAnVuY/', 'https://www.instagram.com/p/Bfn2QWiHKK5/', 'https://www.instagram.com/p/BfC4ZnTntq0/', 'https://www.instagram.com/p/BeomaB6Hb8-/', 'https://www.instagram.com/p/vYszwjyLdB/', 'https://www.instagram.com/p/sQI6Jfpi3f/', 'https://www.instagram.com/p/sO9oXPMr6K/', 'https://www.instagram.com/p/qzvHuCHUgH/', 'https://www.instagram.com/p/WdlKcCBW3w/']


You can change the key edge_hashtag_to_media to edge_hashtag_to_top_posts to get other values.
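As a rough sketch of that key swap, the link-building step can be wrapped in a small function that takes the edge key as a parameter. The function name post_links and the sample dict with its shortcodes are invented for illustration; only the two key names come from the answer above.

```python
def post_links(tag_item, edge_key='edge_hashtag_to_media'):
    """Build post URLs from the shortcodes stored under the given edge key."""
    edges = (tag_item.get('graphql', {})
                     .get('hashtag', {})
                     .get(edge_key, {})
                     .get('edges', []))
    return ['https://www.instagram.com/p/' + e['node']['shortcode'] + '/'
            for e in edges if e.get('node', {}).get('shortcode')]

# Hypothetical sample shaped like tagpage[0]; the shortcodes are invented.
sample = {'graphql': {'hashtag': {
    'edge_hashtag_to_media': {'edges': [{'node': {'shortcode': 'AAA111'}}]},
    'edge_hashtag_to_top_posts': {'edges': [{'node': {'shortcode': 'BBB222'}}]},
}}}

print(post_links(sample))                               # links from the default key
print(post_links(sample, 'edge_hashtag_to_top_posts'))  # links from the top-posts key
```

With the real data you would call post_links(tagpage[0]) or post_links(tagpage[0], 'edge_hashtag_to_top_posts').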

Let me know if this is what you need.

from bs4 import BeautifulSoup
import requests
resp=requests.get("https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/")
html = resp.content
soup = BeautifulSoup(html,'html.parser')


# Print the href of every <link rel="alternate"> tag in the page
for a in soup.find_all('link',rel='alternate',href=True):
    print("Found the URL:", a['href'])
Output:

Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=en
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=fr
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=it
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=de
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=zh-cn
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=zh-tw
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=ja
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=ko
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=pt
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=pt-br
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=af
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=cs
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=da
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=el
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=fi
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=hr
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=hu
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=id
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=ms
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=nb
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=nl
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=pl
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=ru
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=sk
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=sv
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=th
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=tl
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=tr
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=hi
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=bn
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=gu
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=kn
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=ml
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=mr
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=pa
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=ta
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=te
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=ne
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=si
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=ur
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=vi
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=bg
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=fr-ca
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=ro
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=sr
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=uk
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=zh-hk
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la
Found the URL: https://www.instagram.com/explore/tags/SOMEHASHTAGHERE/?hl=es-la

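Thanks for your answer, but it returns a list containing JSON without the URLs I need... I need to get links like "", not this link "".
@OlegRadchenko Please take a look at my edit and tell me whether it helps.
I need to get links like "" from this URL "". Let me know if it helps.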