Python: How do I get the HTML text of posts on LinkedIn's Posts & Activity page after logging in?

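I'm learning web scraping and am trying to collect the number of likes on LinkedIn posts. After logging in, I navigate to the Posts & Activity page, but I can't get the HTML of the posts to extract data from it.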

from selenium import webdriver 
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import requests

# Creation of a new instance of Google Chrome
browser = webdriver.Chrome('PATH/chromedriver.exe')
browser.get('https://www.linkedin.com/login')

browser.find_element_by_id('username').send_keys('****')
browser.find_element_by_id('password').send_keys('****' + Keys.RETURN)

#Go to Posts and Activity --> Posts
browser.get('https://www.linkedin.com/in/LINKEDIN_PROFILE_NAME/detail/recent-activity/shares/')

URL = 'https://www.linkedin.com/in/LINKEDIN_PROFILE_NAME/detail/recent-activity/shares/'
source = requests.get(URL).text

soup = BeautifulSoup(source, 'lxml')
print(soup)
Here is the output I get:

<html><head>
<script type="text/javascript">
window.onload = function() {
  // Parse the tracking code from cookies.
  var trk = "bf";
  var trkInfo = "bf";
  var cookies = document.cookie.split("; ");
  for (var i = 0; i < cookies.length; ++i) {
    if ((cookies[i].indexOf("trkCode=") == 0) && (cookies[i].length > 8)) {
      trk = cookies[i].substring(8);
    }
    else if ((cookies[i].indexOf("trkInfo=") == 0) && (cookies[i].length > 8)) {
      trkInfo = cookies[i].substring(8);
    }
  }

  if (window.location.protocol == "http:") {
    // If "sl" cookie is set, redirect to https.
    for (var i = 0; i < cookies.length; ++i) {
      if ((cookies[i].indexOf("sl=") == 0) && (cookies[i].length > 3)) {
        window.location.href = "https:" + window.location.href.substring(window.location.protocol.length);
        return;
      }
    }
  }

  // Get the new domain. For international domains such as
  // fr.linkedin.com, we convert it to www.linkedin.com
  var domain = "www.linkedin.com";
  if (domain != location.host) {
    var subdomainIndex = location.host.indexOf(".linkedin");
    if (subdomainIndex != -1) {
      domain = "www" + location.host.substring(subdomainIndex);
    }
  }

  window.location.href = "https://" + domain + "/authwall?trk=" + trk + "&trkInfo=" + trkInfo +
      "&originalReferer=" + document.referrer.substr(0, 200) +
      "&sessionRedirect=" + encodeURIComponent(window.location.href);
}
</script>
</head></html>

Please help: what am I doing wrong?

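Instead of passing the requests source to BS4, use browser.page_source. requests.get(URL) opens a separate, unauthenticated session that shares no cookies with the Selenium browser, so LinkedIn answers it with the authwall redirect page you see in your output.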
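
A minimal sketch of that suggestion, under a few assumptions: the function names (looks_like_authwall, fetch_activity_soup) and the 15-second timeout are made up for illustration, webdriver.Chrome() is assumed to be able to locate chromedriver by itself, and find_element(By.ID, ...) is used because newer Selenium releases no longer provide find_element_by_id. The authwall check is only a crude heuristic based on the output shown in the question.

```python
def looks_like_authwall(html: str) -> bool:
    """Heuristic (an assumption): detect the logged-out redirect stub.

    The stub printed in the question builds a URL containing both
    "/authwall?" and "sessionRedirect=", so we look for those markers.
    """
    return "/authwall?" in html and "sessionRedirect=" in html


def fetch_activity_soup(profile_name: str, username: str, password: str):
    """Log in with Selenium and parse the page the browser itself rendered."""
    # Third-party imports are kept local so the helper above can be used
    # (and tested) without selenium or bs4 installed.
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    login_url = "https://www.linkedin.com/login"
    activity_url = (
        f"https://www.linkedin.com/in/{profile_name}"
        "/detail/recent-activity/shares/"
    )

    browser = webdriver.Chrome()  # assumes chromedriver can be found
    try:
        browser.get(login_url)
        browser.find_element(By.ID, "username").send_keys(username)
        browser.find_element(By.ID, "password").send_keys(password + Keys.RETURN)

        # Wait until the login form has navigated away (timeout is arbitrary).
        WebDriverWait(browser, 15).until(EC.url_changes(login_url))

        browser.get(activity_url)
        # Key change: take the HTML from the logged-in browser,
        # not from a separate requests.get() call.
        html = browser.page_source
    finally:
        browser.quit()

    if looks_like_authwall(html):
        raise RuntimeError("Got the authwall stub; the login probably failed.")
    return BeautifulSoup(html, "lxml")
```

Calling fetch_activity_soup('LINKEDIN_PROFILE_NAME', '****', '****') should then return a BeautifulSoup object built from what the logged-in browser sees. Posts that LinkedIn loads only while scrolling may still be missing from page_source at that moment.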