Web scraping: How can I capture pages behind a login page with Biterscript?

web-scraping

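The website I want to copy (https) requires a username and password. I tried Biterscript, but it only copies the login page, not the actual pages behind it. I am not sure whether there is a way to enter the username and password from a script.

Using "http://username:password@address" still gives me the login page rather than the actual page.

Given a list of URLs, I want to copy the content of each one to a text file.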
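
A likely reason the "http://username:password@address" form does not help: credentials embedded in a URL are normally turned into an HTTP Basic `Authorization` header, and a site that logs users in through an HTML form generally ignores that header and serves its login page anyway. The Python sketch below only shows what that header would look like; the function name `basic_auth_header` is made up for illustration, and the URL is the placeholder from the question.

```python
import base64
from urllib.parse import urlsplit


def basic_auth_header(url):
    """Return the Basic Authorization value implied by user:pass@ in a URL.

    Returns None when the URL carries no user name.
    """
    parts = urlsplit(url)
    if parts.username is None:
        return None
    credentials = f"{parts.username}:{parts.password or ''}"
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Basic " + token


# Placeholder URL from the question; nothing is sent over the network here.
print(basic_auth_header("http://username:password@address"))
```

If the site expects a form POST plus a session cookie, this header is not what it is waiting for, which would match the behaviour described in the question.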

What language are you using?

First you have to log in, then retrieve the resulting page.

You can do this with cURL. Here is a simple example using cURL and PHP:

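I think you need to use the Biterscript commands for IS (Internet Session), as in the script below. (To post to a page, use the "ispost" command. To fetch a page, the "cat" command is enough.)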

The page returned after login is now in the variable $page. I assume that is the page you are looking for.

Hope this helps.

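What is "Biterscript"? What language do you want to do this in?

I have no particular language in mind. I am looking for a program that can access and copy web pages that first require going through a login page. Step-by-step instructions would be nice.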

# Declare variables
var str page

# Start internet session, named s, user agent Mozilla.
isstart s "" "Mozilla/5.0"

# If the site is https://www.abc.def, connect to site.
isconnect s "https://www.abc.def" > $page

# The site's index page is in variable $page.
# Suppose the login form is in this format -

# <form name="login" action="login.php" method="post">
# Account: <input type="text" name="login">
# Password: <input type="password" name="pswd">
# <input type="submit" value="submit">
# </form>

# Login by posting "login=me&pswd=mypassword&submit=submit" to login.php.
ispost s "login.php" "login=me&pswd=mypassword&submit=submit" > $page
# "me" and "mypassword" are values of login and password.
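
For readers without a particular language in mind, the same log-in-then-fetch flow can be sketched in Python with only the standard library. This is a rough sketch under assumptions, not a tested recipe for any real site: the host `https://www.abc.def`, the form fields `login`, `pswd` and `submit`, and the values `me` and `mypassword` are taken from the script above, while `page1.php` and the helper names (`open_session`, `login`, `filename_for`, `save_pages`) are made up for illustration. Sites that add hidden form fields or anti-forgery tokens would need extra steps.

```python
import os
import re
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def open_session(user_agent="Mozilla/5.0"):
    """Build an opener that remembers cookies across requests."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


def login(opener, login_url, fields):
    """POST the form fields to the login URL and return the response text."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    with opener.open(login_url, data=body) as response:
        return response.read().decode("utf-8", errors="replace")


def filename_for(url):
    """Turn a URL into a plain file name ending in .txt."""
    return re.sub(r"[^A-Za-z0-9]+", "_", url).strip("_") + ".txt"


def save_pages(opener, urls, out_dir="."):
    """Fetch each URL with the given session and write it to a text file."""
    saved = []
    for url in urls:
        with opener.open(url) as response:
            text = response.read().decode("utf-8", errors="replace")
        path = os.path.join(out_dir, filename_for(url))
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        saved.append(path)
    return saved


def main():
    # Placeholder site and credentials from the script above;
    # page1.php is a hypothetical page behind the login.
    session = open_session()
    login(session, "https://www.abc.def/login.php",
          {"login": "me", "pswd": "mypassword", "submit": "submit"})
    save_pages(session, ["https://www.abc.def/page1.php"])
```

Calling `main()` would run the whole flow against the placeholder site. The cookie jar is what lets later requests reuse the logged-in session, much like the named internet session `s` in the script.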