Python: urllib3 cannot open the same article as urllib2 because of private-mode detection

python python-3.x


How can I get past the private-mode detection with urllib3? I have the following code, which does not work:

import urllib3
from bs4 import BeautifulSoup

articleURL = "https://www.washingtonpost.com/news/the-switch/wp/2016/10/18/the-pentagons-massive-new-telescope-is-designed-to-track-space-junk-and-watch-out-for-killer-asteroids/"

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

http = urllib3.PoolManager()
# No headers are passed here, so no browser-like User-Agent is sent
response = http.request('GET', articleURL)
soup = BeautifulSoup(response.data.decode('utf-8', 'ignore'), "lxml")
soup

Instead of the article, this returns the following:

    </script> <script>var _0x108f=["blockers","pb-adblock-checked","resolve","all","overlay","mobile","desktop","browsers","max","isAnon","isSubscriber","Features","displayOverlay","extListener","getTime","performance","timing","navigationStart","registerPwapiConsumer","getOwnPropertyDescriptor","get","reject","notdetected","standard","notblocked","stack","validate","addEventListener","pb-core-loaded","iterator","symbol","function","constructor","prototype","assign","apply","Keep supporting great journalism by turning off your ad blocker. Or purchase a subscription for unlimited access to real news you can count on.",
'\x3ca data-link-ff\x3d"https://www.washingtonpost.com/steps-for-disabling-firefoxs-native-adblocker/2018/05/21/fb95bf4e-5d37-11e8-b2b8-08a538d9dbd6_story.html" data-link\x3d"https://www.washingtonpost.com/steps-for-disabling-adblocker/2016/09/14/a8c3d4d2-7aac-11e6-bd86-b7bbd53d2b5d_story.html" href\x3d"https://www.washingtonpost.com/steps-for-disabling-adblocker/2016/09/14/a8c3d4d2-7aac-11e6-bd86-b7bbd53d2b5d_story.html"\x3eUnblock ads\x3c/a\x3e','\x3ca href\x3d"https://subscribe.washingtonpost.com/acq/?promo\x3do12" target\x3d"_blank"\x3e\x3cspan class\x3d"subscribe-link"\x3eTry 1 month for $1\x3c/span\x3e\x3c/a\x3e',
"event 86","We noticed you\u2019re browsing in private mode.","Private browsing is permitted exclusively for our subscribers. Turn off private browsing to keep reading this story, or subscribe to use this feature, plus get unlimited digital access.",'\x3ca data-link-ff\x3d"https://helpcenter.washingtonpost.com/hc/en-us/articles/360028029392l" data-link\x3d"https://helpcenter.washingtonpost.com/hc/en-us/articles/360028029392" href\x3d"https://helpcenter.washingtonpost.com/hc/en-us/articles/360028029392"\x3eTurn off private browsing\x3c/a\x3e'

For comparison, this urllib2 code (Python 2) opens the same article:

import urllib2
from bs4 import BeautifulSoup

articleURL = "https://www.washingtonpost.com/news/the-switch/wp/2016/10/18/the-pentagons-massive-new-telescope-is-designed-to-track-space-junk-and-watch-out-for-killer-asteroids/"

page = urllib2.urlopen(articleURL).read().decode('utf8', 'ignore')
soup = BeautifulSoup(page, "lxml")
soup

Try this change (you need to specify a User-Agent header):

headers = {'user-agent': 'Mozilla/5.0 (Windows NT x.y; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0'}
response = http.request('GET', articleURL, headers=headers)

Yes, it works with User-Agent spoofing. Thanks.
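
Putting the pieces together, here is a minimal sketch of the urllib3 version with the User-Agent header set once on the pool, so every request carries it. The names `BROWSER_HEADERS`, `make_pool`, and `fetch_soup` are hypothetical helpers introduced for illustration, and the built-in `html.parser` is swapped in for `lxml` only to avoid an extra dependency. Whether the site still accepts this particular User-Agent string is an assumption, not something the sketch can guarantee.

```python
import urllib3
from bs4 import BeautifulSoup

# User-Agent string taken verbatim from the answer above; assumption: any
# mainstream browser string would serve the same purpose.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT x.y; Win64; x64; rv:10.0) "
                  "Gecko/20100101 Firefox/10.0"
}


def make_pool(headers=BROWSER_HEADERS):
    """Return a PoolManager that sends `headers` with every request."""
    return urllib3.PoolManager(headers=headers)


def fetch_soup(url, http):
    """Fetch `url` through `http` and parse the body with BeautifulSoup."""
    response = http.request("GET", url)
    if response.status != 200:
        raise RuntimeError(f"unexpected HTTP status {response.status} for {url}")
    html = response.data.decode("utf-8", "ignore")
    return BeautifulSoup(html, "html.parser")


if __name__ == "__main__":
    article_url = (
        "https://www.washingtonpost.com/news/the-switch/wp/2016/10/18/"
        "the-pentagons-massive-new-telescope-is-designed-to-track-space-junk-"
        "and-watch-out-for-killer-asteroids/"
    )
    soup = fetch_soup(article_url, make_pool())
    print(soup.title)
```

Passing `headers=` to `PoolManager` sets the default headers for the pool; a `headers=` argument on an individual `http.request(...)` call, as in the answer, replaces those defaults for that one request.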