Python 3.x: How do I log in to a website using the Python requests module?

I have been reading up on this and trying several different approaches, but I have run into a problem with web authentication.

Testing site: http://testing-ground.scraping.pro/login
Username: admin
Password: 12345
Here is the sample code:

>>> import requests, re
>>> url = 'http://testing-ground.scraping.pro/login'
>>> username = 'admin'
>>> password = '12345'
>>> requests.get(url)
<Response [200]>
Without authentication:

>>> print(requests.get(url).text)
<!DOCTYPE html>
<!--[if lt IE 7]>      <html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
<!--[if IE 7]>         <html class="no-js lt-ie9 lt-ie8"> <![endif]-->
<!--[if IE 8]>         <html class="no-js lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js"> <!--<![endif]-->
    <head>
        <meta charset="utf-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
        <title>Web Scraper Testing Ground</title>
        <meta name="description" content="">
        <meta name="viewport" content="width=device-width">
        <link rel="stylesheet" href="/css/normalize.css">
        <link rel="stylesheet" href="/css/main.css">
        <script src="/js/vendor/modernizr-2.6.1.min.js"></script>
        <script src="/js/vendor/jquery-1.9.1.min.js"></script>
        <script src="/js/vendor/jquery-ui-1.10.2.min.js"></script>
        <script src="/js/plugins.js"></script>
        <script src="/js/main.js"></script>

        <link rel="stylesheet" href="/css/QapTcha.jquery.css" />
        <script src="/js/QapTcha.jquery.js"></script>
        
        <link rel="stylesheet" href="/fancy-captcha/captcha.css" />
        <script src="/fancy-captcha/jquery.captcha.js"></script>

    </head>
    <body>
        <script type="text/javascript">
        
          var _gaq = _gaq || [];
          _gaq.push(['_setAccount', 'UA-4436411-8']);
          _gaq.push(['_setDomainName', 'extract-web-data.com']);
          _gaq.push(['_trackPageview']);
        
          (function() {
            var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
            ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
            var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
          })();
        
        </script>
        <!--[if lt IE 7]>
            <p class="chromeframe">You are using an outdated browser. <a href="http://browsehappy.com/">Upgrade your browser today</a> or <a href="http://www.google.com/chromeframe/?redirect=true">install Google Chrome Frame</a> to better experience this site.</p>
        <![endif]-->
        <div id="topbar"></div>
        <a href="/" style="text-decoration: none">
            <div id="title">WEB SCRAPER TESTING GROUND</div>
            <div id="logo"></div>
        </a>
        <div id="content">
<h1>LOGIN</h1>
<div id="caseinfo">Often in order to reach the desired information you need to be logged in to the website. Most of today's websites use so-called form-based authentication which implies sending user credentials using POST method, authenticating it on the server and storing user's session in a cookie.</p>
<p>This simple test shows scraper's ability to:</p>
    <ol>
        <li>Send user credentials via POST method</li>
        <li>Receive, Keep and Return a session cookie</li>
        <li>Process HTTP redirect (302)</li>
    </ol>
<p>How to test:</p>
    <ol>
        <li>Enter <b>admin</b> and <b>12345</b> in the form below and press <b>Login</b></li>
        <li>If you see <span class="success">WELCOME :)</span> then the user credentials were sent, the cookie was passed and HTTP redirect was processed</li>
        <li>If you see <span class="error">ACCESS DENIED!</span> then either you entered wrong credentials or they were not sent to the server properly</li>
        <li>If you see <span class="error">THE SESSION COOKIE IS MISSING OR HAS A WRONG VALUE!</span> then the user credentials were properly sent but the session cookie was not properly stored or passed</li>
        <li>If you see <span class="success">REDIRECTING...</span> then the user credentials were properly sent but HTTP redirection was not processed</li>
        <li>Click <b>GO BACK</b> to start again</li>
    </ol>
</div>

<hr/>

<div id="case_login">
<h3>Please, login:</h3>
    <form action="login?mode=login" method="POST">
        <label for="usr">User name:</label>
        <input id="usr" name="usr" type="text" placeholder="enter 'admin' here">
        <label for="pwd">Password:</label>
        <input id="pwd" name="pwd" type="text" placeholder="enter '12345' here">
        <input type="submit" value="Login">
    </form>
</div>
<br/><br/><br/>
        </div>
    </body>
</html>

With authentication:

>>> print(requests.get(url, auth=(username, password)).text)
<!DOCTYPE html>
<!--[if lt IE 7]>      <html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
<!--[if IE 7]>         <html class="no-js lt-ie9 lt-ie8"> <![endif]-->
<!--[if IE 8]>         <html class="no-js lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js"> <!--<![endif]-->
    <head>
        <meta charset="utf-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
        <title>Web Scraper Testing Ground</title>
        <meta name="description" content="">
        <meta name="viewport" content="width=device-width">
        <link rel="stylesheet" href="/css/normalize.css">
        <link rel="stylesheet" href="/css/main.css">
        <script src="/js/vendor/modernizr-2.6.1.min.js"></script>
        <script src="/js/vendor/jquery-1.9.1.min.js"></script>
        <script src="/js/vendor/jquery-ui-1.10.2.min.js"></script>
        <script src="/js/plugins.js"></script>
        <script src="/js/main.js"></script>

        <link rel="stylesheet" href="/css/QapTcha.jquery.css" />
        <script src="/js/QapTcha.jquery.js"></script>
        
        <link rel="stylesheet" href="/fancy-captcha/captcha.css" />
        <script src="/fancy-captcha/jquery.captcha.js"></script>

    </head>
    <body>
        <script type="text/javascript">
        
          var _gaq = _gaq || [];
          _gaq.push(['_setAccount', 'UA-4436411-8']);
          _gaq.push(['_setDomainName', 'extract-web-data.com']);
          _gaq.push(['_trackPageview']);
        
          (function() {
            var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
            ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
            var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
          })();
        
        </script>
        <!--[if lt IE 7]>
            <p class="chromeframe">You are using an outdated browser. <a href="http://browsehappy.com/">Upgrade your browser today</a> or <a href="http://www.google.com/chromeframe/?redirect=true">install Google Chrome Frame</a> to better experience this site.</p>
        <![endif]-->
        <div id="topbar"></div>
        <a href="/" style="text-decoration: none">
            <div id="title">WEB SCRAPER TESTING GROUND</div>
            <div id="logo"></div>
        </a>
        <div id="content">
<h1>LOGIN</h1>
<div id="caseinfo">Often in order to reach the desired information you need to be logged in to the website. Most of today's websites use so-called form-based authentication which implies sending user credentials using POST method, authenticating it on the server and storing user's session in a cookie.</p>
<p>This simple test shows scraper's ability to:</p>
    <ol>
        <li>Send user credentials via POST method</li>
        <li>Receive, Keep and Return a session cookie</li>
        <li>Process HTTP redirect (302)</li>
    </ol>
<p>How to test:</p>
    <ol>
        <li>Enter <b>admin</b> and <b>12345</b> in the form below and press <b>Login</b></li>
        <li>If you see <span class="success">WELCOME :)</span> then the user credentials were sent, the cookie was passed and HTTP redirect was processed</li>
        <li>If you see <span class="error">ACCESS DENIED!</span> then either you entered wrong credentials or they were not sent to the server properly</li>
        <li>If you see <span class="error">THE SESSION COOKIE IS MISSING OR HAS A WRONG VALUE!</span> then the user credentials were properly sent but the session cookie was not properly stored or passed</li>
        <li>If you see <span class="success">REDIRECTING...</span> then the user credentials were properly sent but HTTP redirection was not processed</li>
        <li>Click <b>GO BACK</b> to start again</li>
    </ol>
</div>

<hr/>

<div id="case_login">
<h3>Please, login:</h3>
    <form action="login?mode=login" method="POST">
        <label for="usr">User name:</label>
        <input id="usr" name="usr" type="text" placeholder="enter 'admin' here">
        <label for="pwd">Password:</label>
        <input id="pwd" name="pwd" type="text" placeholder="enter '12345' here">
        <input type="submit" value="Login">
    </form>
</div>
<br/><br/><br/>
        </div>
    </body>
</html>
>>> 
Since the output still contains the web login form, I believe the authentication did not work as expected:

    <h3>Please, login:</h3>
        <form action="login?mode=login" method="POST">
            <label for="usr">User name:</label>
            <input id="usr" name="usr" type="text" placeholder="enter 'admin' here">
            <label for="pwd">Password:</label>
            <input id="pwd" name="pwd" type="text" placeholder="enter '12345' here">
            <input type="submit" value="Login">
        </form>
    

What is going wrong here, and what should I do to fix it?

You should send a POST request to the URL that the login form's action points to:

    
    >>> import requests, re
    >>> url = 'http://testing-ground.scraping.pro/login?mode=login'
    >>> username = 'admin'
    >>> password = '12345'
    >>> requests.post(url, data={'usr': username, 'pwd': password})
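
Why `auth=(username, password)` has no effect here can be seen without touching the network. The following is a minimal sketch, assuming only that the form quoted in the question is what the server expects. It prepares both requests with `requests.Request(...).prepare()` and prints what each one would send. `auth=` adds an HTTP Basic `Authorization` header, which a form-based login page presumably ignores, while `data=` puts `usr` and `pwd` into the POST body. The names `LOGIN_PAGE`, `FORM_ACTION`, `basic` and `form` are illustrative.

```python
from urllib.parse import urljoin

import requests

LOGIN_PAGE = "http://testing-ground.scraping.pro/login"
# The form's action attribute is relative, so resolve it against the page URL.
FORM_ACTION = urljoin(LOGIN_PAGE, "login?mode=login")

# What the question tried: HTTP Basic auth on a GET request.
basic = requests.Request("GET", LOGIN_PAGE, auth=("admin", "12345")).prepare()

# What the form does: a POST with url-encoded fields named after the <input> tags.
form = requests.Request(
    "POST", FORM_ACTION, data={"usr": "admin", "pwd": "12345"}
).prepare()

# Nothing is sent; these are only the requests as they would go on the wire.
print(basic.method, basic.url)
print("Authorization:", basic.headers.get("Authorization"))
print("body:", basic.body)

print(form.method, form.url)
print("Content-Type:", form.headers.get("Content-Type"))
print("body:", form.body)
```

The GET request carries the credentials only in a header and has no body, so the server simply returns the login page again.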
    
Thanks @avloss, I had overlooked the HTTP method. It should have been POST, but I was trying it with GET.
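
The test page also checks that the session cookie is kept and that the 302 redirect is followed, so a `requests.Session` is a natural fit: it stores cookies between requests, and `post` follows redirects by default. This is a hedged sketch rather than a definitive solution. The helper name `login` and the check for the `WELCOME` marker are assumptions based on the page text quoted in the question, and the site may no longer respond as it did then.

```python
import requests

LOGIN_URL = "http://testing-ground.scraping.pro/login?mode=login"


def login(session, username, password, url=LOGIN_URL):
    """POST the form fields and return the final response after redirects."""
    response = session.post(
        url,
        data={"usr": username, "pwd": password},  # names from the <input> tags
        timeout=10,
    )
    response.raise_for_status()
    return response


if __name__ == "__main__":
    with requests.Session() as session:
        try:
            page = login(session, "admin", "12345")
        except requests.RequestException as exc:
            print("request failed:", exc)
        else:
            # response.history lists any redirects followed on the way.
            print([r.status_code for r in page.history], page.url)
            print("logged in" if "WELCOME" in page.text else "login failed")
```

Reusing the same `session` for later requests sends the stored cookie back automatically, which is what the page's cookie check is looking for.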