How can I log in to a website using Python?

python automation


How can I do it? I was trying to open some specific links (with urllib), but to do that I need to log in.

I have this source from the site:

<form id="login-form" action="auth/login" method="post">
    <div>
    <!--label for="rememberme">Remember me</label><input type="checkbox" class="remember" checked="checked" name="remember me" /-->
    <label for="email" id="email-label" class="no-js">Email</label>
    <input id="email-email" type="text" name="handle" value="" autocomplete="off" />
    <label for="combination" id="combo-label" class="no-js">Combination</label>
    <input id="password-clear" type="text" value="Combination" autocomplete="off" />
    <input id="password-password" type="password" name="password" value="" autocomplete="off" />
    <input id="sumbitLogin" class="signin" type="submit" value="Sign In" />



Is this possible?

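Generally speaking, sites can check authorization in many different ways, but the one you are targeting seems to make it reasonably easy for you.
All you need to do is POST to the auth/login URL a form-encoded blob with the various fields you see there (forget the label tags, they are decoration for human visitors): handle=whatever&password=pwd and so on. The keys are the inputs' name attributes, so password-clear, which has no name, is never sent. As long as you know the values for the handle (also known as the email) and the password, you should be fine.
Presumably that POST will redirect you to some "you have successfully logged in" page with a Set-Cookie header validating your session (be sure to save that cookie and send it back on further interaction along the session!).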
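
As a minimal Python 3 sketch of the form-encoded POST described above, using only the standard library. The base URL http://www.example.com/, the credentials, and the helper name build_login_request are assumptions for illustration; the field names handle and password come from the name attributes in the question's form. The request is only built here, not sent.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request


def build_login_request(base_url, handle, password):
    """Build (but do not send) the POST a browser would make for the form."""
    # The form's action is relative, so resolve it against the page URL.
    url = urljoin(base_url, "auth/login")
    # Keys are the inputs' name attributes; urlencode escapes the values.
    body = urlencode({"handle": handle, "password": password}).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical site and credentials, for illustration only.
request = build_login_request("http://www.example.com/", "john@example.com", "s3cret")
print(request.full_url)
print(request.data)
```

Sending such a request through an opener that has a cookie processor attached is what lets the session cookie from the login response be stored and replayed on later requests.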
Maybe you want to use twill. It is quite easy to use and should be able to do what you want.
It will look like the following:
from twill.commands import *
go('http://example.org')

fv("1", "email-email", "blabla.com")
fv("1", "password-clear", "testpass")

submit('0')

Once you have used go... to browse to the site you want to log in to, you can use showforms() to list all forms. Just try it from the Python interpreter.
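
If twill is not installed, a rough standard-library approximation of the kind of listing showforms() gives can be sketched in Python 3 as follows. The class name FormLister is hypothetical and this is not twill's implementation; it only collects each form's action, method, and the name attributes of its inputs, which are the keys a login script has to send. The HTML is a trimmed copy of the form from the question.

```python
from html.parser import HTMLParser


class FormLister(HTMLParser):
    """Collect each form's action, method and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({
                "action": attrs.get("action"),
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            })
        elif tag == "input" and self.forms and attrs.get("name"):
            # Inputs without a name attribute are never submitted.
            # Simplification: an input is attributed to the most recent form.
            self.forms[-1]["fields"].append(
                (attrs["name"], attrs.get("type", "text")))


# Trimmed copy of the form from the question.
html = """
<form id="login-form" action="auth/login" method="post">
    <!--input type="checkbox" checked="checked" name="remember me" /-->
    <input id="email-email" type="text" name="handle" value="" />
    <input id="password-clear" type="text" value="Combination" />
    <input id="password-password" type="password" name="password" value="" />
    <input id="sumbitLogin" class="signin" type="submit" value="Sign In" />
</form>
"""

lister = FormLister()
lister.feed(html)
for form in lister.forms:
    print(form["method"], form["action"], form["fields"])
```

Only handle and password are reported: the commented-out checkbox is skipped by the parser, and password-clear and the submit button have no name attribute.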
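# Python 2 only: cookielib and urllib2 were reorganized in Python 3.
import cookielib
import urllib
import urllib2

url = 'http://www.someserver.com/auth/login'
# Keys must be the inputs' name attributes (handle, password), not their ids;
# password-clear has no name, so a browser never submits it.
values = {'handle' : 'john@example.com',
          'password' : 'mypassword' }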

data = urllib.urlencode(values)
cookies = cookielib.CookieJar()

opener = urllib2.build_opener(
    urllib2.HTTPRedirectHandler(),
    urllib2.HTTPHandler(debuglevel=0),
    urllib2.HTTPSHandler(debuglevel=0),
    urllib2.HTTPCookieProcessor(cookies))

response = opener.open(url, data)
the_page = response.read()
http_headers = response.info()
# The login cookies should be contained in the cookies variable

For more information, see the urllib2 documentation.
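
The snippet above is Python 2. A hedged sketch of the same idea in Python 3, where cookielib became http.cookiejar and urllib2 was split into urllib.request and urllib.parse. The helper names make_opener and login are assumptions for illustration, and the server URL is the placeholder from the answer; the login call itself is left commented out because it needs a real server.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener():
    """Return an opener that stores cookies, plus the jar it fills."""
    jar = http.cookiejar.CookieJar()
    # build_opener already includes the default HTTP, HTTPS and redirect
    # handlers, so only the cookie processor has to be added.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login(opener, url, fields):
    """POST the form fields and return the response body as bytes."""
    # Keys in fields should be the form inputs' name attributes.
    data = urllib.parse.urlencode(fields).encode("ascii")
    with opener.open(url, data) as response:
        return response.read()


opener, jar = make_opener()
# Not executed here because it needs a real server:
# page = login(opener, "http://www.someserver.com/auth/login",
#              {"handle": "john@example.com", "password": "mypassword"})
# print([cookie.name for cookie in jar])  # cookies set by the login response
```

Reusing the same opener for later requests sends the stored cookies back automatically.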
Typically you will need cookies to log in to a site, which means cookielib, urllib and urllib2. Here is a class I wrote back when I was playing Facebook web games:
import cookielib
import urllib
import urllib2

# set these to whatever your fb account is
fb_username = "your@facebook.login"
fb_password = "secretpassword"

class WebGamePlayer(object):

    def __init__(self, login, password):
        """ Start up... """
        self.login = login
        self.password = password

        self.cj = cookielib.CookieJar()
        self.opener = urllib2.build_opener(
            urllib2.HTTPRedirectHandler(),
            urllib2.HTTPHandler(debuglevel=0),
            urllib2.HTTPSHandler(debuglevel=0),
            urllib2.HTTPCookieProcessor(self.cj)
        )
        self.opener.addheaders = [
            ('User-agent', ('Mozilla/4.0 (compatible; MSIE 6.0; '
                           'Windows NT 5.2; .NET CLR 1.1.4322)'))
        ]

        # need this twice - once to set cookies, once to log in...
        self.loginToFacebook()
        self.loginToFacebook()

    def loginToFacebook(self):
        """
        Handle login. This should populate our cookie jar.
        """
        login_data = urllib.urlencode({
            'email' : self.login,
            'pass' : self.password,
        })
        response = self.opener.open("https://login.facebook.com/login.php", login_data)
        return ''.join(response.readlines())

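You won't necessarily need the HTTPS or redirect handlers, but they don't hurt, and they make the opener more robust. You also might not need cookies, but it is hard to tell just from the form you posted. I suspect that you might, purely because of the "Remember me" input that has been commented out.
For HTTP things, the current choice should be: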
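Let me try to make it simple. Suppose the URL of the site is www.example.com and you need to log in by filling in a username and password. Go to the login page, view its source code and search for the action URL. It will be in the form tag, something like:
<form name="loginform" method="post" action="userinfo.php">

Resolve that action against the site's address, which here gives http://example.com/userinfo.php, and POST the credentials to it:
import requests
url = 'http://example.com/userinfo.php'
values = {'username': 'user',
          'password': 'pass'}

r = requests.post(url, data=values)
print(r.content)

I hope this helps someone, somewhere, someday.

Web page automation? Definitely "webbot".
webbot even works for web pages with dynamically changing id and class names, and has more methods and features than selenium or mechanize.
Here is a snippet :)
from webbot import Browser
web = Browser()
web.go_to('google.com')
web.click('Sign in')
web.type('mymail@gmail.com', into='Email')
web.click('NEXT', tag='span')
web.type('mypassword', into='Password', id='passwordFieldId') # specific selection
web.click('NEXT', tag='span') # you are logged in ^_^

The docs are also pretty straightforward and simple to use.

Comments:
Note that in some cases you need to use submit(). I confirmed this issue when logging in to www.pge.com: using submit() works.
Is there a solution for Python 3.6? twill does not seem to support Python 3.5 or 3.6. I tried downloading it and converting it with 2to3, but now I get a ModuleNotFoundError when trying to import it.
Actually, I could solve the ModuleNotFoundError by using/converting Twill 1.8.0 and installing lxml and requests with pip install. But now I get a SyntaxError when trying to import it, because somewhere there is False = 0 ...
It is a bit of a pain to fix, but it works.
Does it work with HTTPS websites, or do I have to do something like this?
Of the 24 help/Stack Overflow pages I looked at, this does not work for most of the sites I tried. This is the only solution that worked on the one site I needed.
The best option for web automation is webbot.
Are the values always username and password? This does not seem to work for the site I chose.
@DylanLogan You always have to examine what the actual web page sends to the server and adapt your script to it. The server should be unable to tell the difference between the script and a web browser.
This example works great.
Is it also possible to do this when autocomplete=off is set?
It does not install on 64-bit Windows. Error: Could not find a version that satisfies the requirement webbot (from versions: 0.0.1.win-amd64)
Try using python3.
How do I handle iframes in webbot? I mean, I have to close an iframe that pops up after the page loads.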
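
The requests answer above sends a single POST. If the pages fetched afterwards depend on the login cookie, requests.Session keeps cookies between calls. A minimal sketch, assuming requests is installed; the helper name login_session and the members.php path are hypothetical, and the URL and field names are the placeholders from that answer. Preparing the request first is one way to check what the script would send before comparing it with what the real page sends.

```python
import requests


def login_session(login_url, fields):
    """Log in once and return a session that carries the resulting cookies."""
    session = requests.Session()
    response = session.post(login_url, data=fields)
    # Fails on 4xx/5xx only; a 200 response does not prove the login worked.
    response.raise_for_status()
    return session


# Inspect exactly what would be sent, without touching the network.
prepared = requests.Request(
    "POST",
    "http://example.com/userinfo.php",
    data={"username": "user", "password": "pass"},
).prepare()
print(prepared.body)
print(prepared.headers["Content-Type"])

# With a real site (not executed here):
# session = login_session("http://example.com/userinfo.php",
#                         {"username": "user", "password": "pass"})
# page = session.get("http://example.com/members.php")  # hypothetical page
```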