httr POST in R: token mismatch error


I want to scrape the agency data from this website:

I am using the following code, but I get a "token mismatch" error:

httr::POST(
  url = "https://thep.hoaphat.com.vn/ajax/load_agency",
  body = list(
    type = "web",
    product_id = "7",
    province_id = "10",
    member_type = "1"
  ),
  encode = "form"
) -> res
dat <- httr::content(res)

Doing a bit of reverse engineering, I found that the request is sent from this JS file:

Take a look at this function:

function loadAjax(t, e, n) {
    var a = {
        beforeSend: function() {},
        success: function() {}
    };
    $.extend(a, n),
    $.ajax({
        headers: {
            "X-CSRF-TOKEN": $('meta[name="csrf-token"]').attr("content")
        },
        type: "POST",
        url: t,
        data: e,
        beforeSend: function() {
            $("#loading_box").css({
                visibility: "visible",
                opacity: 0
            }).animate({
                opacity: 1
            }, 200),
            a.beforeSend()
        },
        success: function(t) {
            $("#loading_box").animate({
                opacity: 0
            }, 200, function() {
                $("#loading_box").css("visibility", "hidden")
            }),
            a.success(t)
        },
        error: function(t) {
            $("#loading_box").animate({
                opacity: 0
            }, 200, function() {
                $("#loading_box").css("visibility", "hidden")
            }),
            alert("Có lỗi xảy ra!")
        }
    })
}
You can see that the token comes from the meta tag named csrf-token. Now you can scrape that token value and send the request to get the data:

library(rvest)
pg <- html_session("https://thep.hoaphat.com.vn/distribution-systems")
token <- read_html(pg) %>%
  html_node(xpath = "//meta[@name='csrf-token']") %>% html_attr("content")
pg <- 
  pg %>% rvest:::request_POST(
    "https://thep.hoaphat.com.vn/ajax/load_agency",
    config = httr::add_headers(`x-csrf-token` = token),
    body = list(
      type = "web",
      product_id = 1, # choose product
      province_id = 50, # choose province
      `member_type[]` = 1, # agency level 1
      `member_type[]` = 2 # agency level 2
    )
  )
data <- httr::content(pg$response)$data
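
The answer above relies on `rvest:::request_POST`, an unexported rvest function that may change between package versions. As a rough alternative, here is a minimal sketch of the same flow using only exported httr and xml2 functions. The names `extract_csrf_token` and `fetch_agencies` are hypothetical helpers, not part of the original answer. The sketch assumes the site still serves the token in the csrf-token meta tag and accepts a form-encoded body, as the jQuery call suggests. httr reuses one handle per host, so cookies set by the first request should be sent with the second.

```r
# Pull the CSRF token out of an HTML document supplied as a string.
# Returns NA when no <meta name="csrf-token"> tag is present.
extract_csrf_token <- function(html_text) {
  doc <- xml2::read_html(html_text)
  node <- xml2::xml_find_first(doc, "//meta[@name='csrf-token']")
  xml2::xml_attr(node, "content")
}

# Hypothetical wrapper: load the page to obtain the token and cookies,
# then POST to the AJAX endpoint with the token in the request header.
fetch_agencies <- function(product_id, province_id, member_types = c(1, 2)) {
  base <- "https://thep.hoaphat.com.vn"

  # First request: sets the session cookies and exposes the token.
  page <- httr::GET(paste0(base, "/distribution-systems"))
  httr::stop_for_status(page)
  token <- extract_csrf_token(
    httr::content(page, as = "text", encoding = "UTF-8")
  )

  # Repeat the member_type[] key once per requested agency level.
  body <- c(
    list(
      type = "web",
      product_id = as.character(product_id),
      province_id = as.character(province_id)
    ),
    stats::setNames(
      as.list(as.character(member_types)),
      rep("member_type[]", length(member_types))
    )
  )

  # Second request: same host, so httr sends the cookies from the first one.
  res <- httr::POST(
    paste0(base, "/ajax/load_agency"),
    httr::add_headers(`X-CSRF-TOKEN` = token),
    body = body,
    encode = "form"
  )
  httr::stop_for_status(res)
  httr::content(res)$data
}
```

Calling `fetch_agencies(product_id = 1, province_id = 50)` would mirror the parameters used in the answer. Whether the response still has a `data` field depends on the live site.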

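The page requires both the x-csrf-token header and the XSRF-token cookie value. I think this is a protection.

Yes, I think so too, but when I tried adding the header in the POST function, it still didn't work.

It is a protection, so the token should not be the same every time. Try using RSelenium.

Thanks, RSelenium is great, but I often get errors when using it, related to the version of Java or the browser, ...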
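
If adding the header alone does not help, one thing worth checking is whether the session cookie mentioned above was actually received. The sketch below is a small diagnostic under that assumption. `has_cookie` is a hypothetical helper, and it expects a table shaped like the data frame returned by `httr::cookies()`, which has a `name` column.

```r
# TRUE if a cookie with the given name appears in the cookie table.
# The comparison ignores case, since the exact casing of the cookie
# name is not confirmed in this thread.
has_cookie <- function(cookie_table, name) {
  tolower(name) %in% tolower(cookie_table$name)
}

# Usage against the live site (requires httr and network access):
# page <- httr::GET("https://thep.hoaphat.com.vn/distribution-systems")
# has_cookie(httr::cookies(page), "XSRF-TOKEN")
```

A `FALSE` result would suggest the token was scraped in one session and sent in another, which would be consistent with the mismatch error.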